Open Xerox PDF-to-XML API

The service reads the content of a PDF document and generates structured XML from it. The site offers an interactive application that converts PDF to either XML or ePub, while the web service supports programmatic conversion to XML only. API methods detect headers and footers, and segment and order the text found in the PDF file. Methods also detect and process embedded tables of contents, captions, and footnotes.