OASIS Mailing List Archives



   Re: [xml-dev] nice program for converting PDF to XML


Hi Bob,
I worked with this program as well. While it is very useful, it leaves something to be desired, especially in terms of embedded fonts. Additionally, it seems that the original developers have moved on. But for me it was absolutely indispensable. In a project I did, I combined it with Ghostscript to generate a PNG of the background of a printed document, then wrote a frame/line detection algorithm to recover the layouts. I combined all of it with a custom XSLT to generate very simple SVG (lines and rects instead of paths). All in all, I agree it is very useful. Many thanks to the developers.
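To make the line-detection and "lines instead of paths" idea concrete, here is a minimal sketch. It is not the algorithm or the XSLT described above: it is plain Python, it assumes the page background has already been rasterized and thresholded into rows of 0/1 pixels, and the names find_horizontal_lines, lines_to_svg, and min_length are made up for illustration.

```python
def find_horizontal_lines(bitmap, min_length=3):
    """Return (x1, y, x2) for every horizontal run of dark pixels
    that is at least min_length pixels long."""
    lines = []
    for y, row in enumerate(bitmap):
        start = None
        # A trailing 0 acts as a sentinel so a run touching the right
        # edge is still closed.
        for x, pixel in enumerate(list(row) + [0]):
            if pixel and start is None:
                start = x
            elif not pixel and start is not None:
                if x - start >= min_length:
                    lines.append((start, y, x - 1))
                start = None
    return lines


def lines_to_svg(lines, width, height):
    """Emit one simple <line> element per detected run (no paths)."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for x1, y, x2 in lines:
        parts.append(
            f'  <line x1="{x1}" y1="{y}" x2="{x2}" y2="{y}" stroke="black"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Tiny hypothetical bitmap: 1 = dark pixel, 0 = background.
bitmap = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],  # run of 2, shorter than min_length, ignored
    [0, 0, 1, 1, 1, 1],
]
lines = find_horizontal_lines(bitmap)
svg = lines_to_svg(lines, width=6, height=5)
```

Vertical lines could be found the same way by scanning columns, and frames would then be rectangles whose four edges are all detected. A real page would need a tolerance for thick or slightly skewed rules.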
Can you describe a little bit more what you did with the results?
All the best,
Jeff Rafter
----- Original Message -----
Sent: Monday, October 04, 2004 8:03 AM
Subject: [xml-dev] nice program for converting PDF to XML

After fooling around with several shareware programs that only convert the first few pages of an Acrobat file, or only work for a few days, I found an open source utility on sourceforge that worked so nicely that I wanted to give it wider publicity among people who might find it handy. See http://pdftohtml.sourceforge.net and http://sourceforge.net/projects/pdftohtml. A Windows binary is available.
Bob DuCharme   www.snee.com/bob   <bob@snee.com>
weblog on linking-related topics:

