Supported: PDF, DOCX, XLSX, XLS
Always produces HTML + status JSON. If mixed-content pages are detected, page images are saved for LLM processing.
POST JSON to /finalize with:
run_id: from the /extract response
llm_pages: (optional) array of {"page": N, "html": "..."}
output_format: "html", "excel", or "both" (default: "html")
Merges LLM-extracted pages into base HTML and/or converts to Excel.
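
As a rough illustration of the /finalize request body, the sketch below assembles the JSON from the three fields listed above and wraps it in a POST request using only the Python standard library. The helper names (build_finalize_payload, build_finalize_request), the base URL, and the sample run_id are hypothetical and not part of the service; leaving llm_pages out of the body when no pages are supplied is also an assumption about how the optional field is handled.

```python
import json
import urllib.request

# Values documented for output_format.
ALLOWED_FORMATS = {"html", "excel", "both"}


def build_finalize_payload(run_id, llm_pages=None, output_format="html"):
    """Assemble the JSON body for POST /finalize (hypothetical helper)."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(
            f"output_format must be one of {sorted(ALLOWED_FORMATS)}"
        )
    payload = {"run_id": run_id, "output_format": output_format}
    if llm_pages:
        # Assumption: the optional field is simply omitted when empty.
        payload["llm_pages"] = [
            {"page": int(p["page"]), "html": p["html"]} for p in llm_pages
        ]
    return payload


def build_finalize_request(base_url, payload):
    """Wrap the payload in a POST request object without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/finalize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: the run_id would come from the /extract response,
# and the base URL depends on where the service is deployed.
payload = build_finalize_payload(
    "example-run-id",
    llm_pages=[{"page": 3, "html": "<table>...</table>"}],
    output_format="both",
)
request = build_finalize_request("http://localhost:8000", payload)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.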
Supported: PDF (layered packaging artwork)