
RE: [xml-dev] PDF to XML

I do a lot of XML conversion work from various sources (Word, Interleaf/Quicksilver, FrameMaker, PDF). We have some older publications that exist only as PDF, and we need to get them into a MIL-STD XML structure. One process I use is to save the PDF as RTF using Acrobat Professional. I then use a tool, "rtfx.exe", to convert the RTF to well-formed XML. The tool is available here:


I've found that if the files are table-heavy, I get better table structures by saving the RTF as .DOCX and then back to RTF again.
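
Since the goal of the RTF step is well-formed XML, a quick check of the converter's output can catch problems before any mapping to the target structure. Below is a minimal sketch in Python, assuming the converted files are passed as command-line arguments; the function name `check_well_formed` is hypothetical, and nothing here reflects how rtfx.exe itself is invoked. It tests well-formedness only, not validity against any MIL-STD schema or DTD.

```python
import sys
import xml.etree.ElementTree as ET


def check_well_formed(xml_data):
    """Return (True, None) if xml_data parses as XML, else (False, error message).

    xml_data may be str or bytes. This checks well-formedness only;
    it does not validate against a DTD or schema.
    """
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


if __name__ == "__main__":
    # Hypothetical usage: pass the converter's output files as arguments.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            ok, message = check_well_formed(handle.read())
        print(f"{path}: {'well-formed' if ok else 'NOT well-formed: ' + message}")
```

Running this over the output of both the direct route and the .DOCX round trip is one way to compare them, though it says nothing about the quality of the table structures themselves.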

Charles Flanders
Senior IETM Developer | Certified XML Developer
BAE Systems | Land & Armaments | York, Pennsylvania

-----Original Message-----
From: Ihe Onwuka [mailto:ihe.onwuka@gmail.com] 
Sent: Thursday, May 01, 2014 9:37 AM
To: xml-dev@lists.xml.org
Subject: [xml-dev] PDF to XML

Is there a tool/process for such conversions? I suppose I could always do it with Adobe Acrobat Professional (can I?).

Be grateful for commentary from someone who has been there before.


