OASIS Mailing List Archives

   Re: [xml-dev] How we can convert pdf data into xml?


On Sat, 2006-07-29 at 04:33 +0000, chhaon dhoop wrote:
> Hi!
> 
> Could you please tell me how we can convert PDF data into an XML
> file using Java? I found a library, PDFBox.

There isn't very much information provided here.

If you are a hardcore programmer and the PDF file is not encrypted, you
can just read the file, look for the text entries, and pick those out.

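PDF is built on the PostScript imaging model, so its text operators are
not too hard to decode, although page content streams are usually
compressed and have to be inflated first.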
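
As a rough illustration of "picking out the text entries", here is a minimal Java sketch. It assumes you already have an uncompressed page content stream as a string, and it only looks for literal strings followed by the Tj (show text) operator. The class and method names are hypothetical. Real files also use TJ arrays, hex strings, nested parentheses, and font-specific encodings, none of which this handles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PdfTextScan {
    // Matches a literal string operand followed by the Tj operator,
    // e.g. "(Hello) Tj". Backslash-escaped characters are allowed inside.
    private static final Pattern SHOW_TEXT =
        Pattern.compile("\\(((?:\\\\.|[^\\\\()])*)\\)\\s*Tj");

    // Returns the string operands of every Tj found, in order of appearance.
    static List<String> extractShownText(String contentStream) {
        List<String> out = new ArrayList<>();
        Matcher m = SHOW_TEXT.matcher(contentStream);
        while (m.find()) {
            // Undo only the simplest escapes: \( \) and \\.
            out.add(m.group(1).replaceAll("\\\\([()\\\\])", "$1"));
        }
        return out;
    }

    public static void main(String[] args) {
        // A hand-written, uncompressed content stream used as sample input.
        String stream =
            "BT /F1 12 Tf 72 720 Td (Hello) Tj 0 -14 Td (PDF \\(raw\\)) Tj ET";
        System.out.println(extractShownText(stream)); // prints [Hello, PDF (raw)]
    }
}
```

This is only meant to show why the raw approach is feasible for simple files. For anything compressed or with custom font encodings, a library is the more practical route.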

Why not just use the PDFBox library to extract the text from the PDF and
then convert that text to XML? You seem to be 90% of the way there
already...
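
For the second half of that route, here is a minimal sketch of turning already-extracted text into XML using only the JDK's DOM and transformer classes. The text is assumed to come from PDFBox's PDFTextStripper; that call is left out because the package name and loading method differ between PDFBox versions. The class name, method name, and the <document>/<line> element names are made up for illustration, so substitute whatever structure your data actually needs.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class TextToXml {
    // Wraps each non-blank line of the extracted text in a <line> element
    // under a <document> root and returns the serialized XML.
    static String toXml(String extractedText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("document");
            doc.appendChild(root);
            for (String line : extractedText.split("\\R")) {
                if (line.trim().isEmpty()) continue; // skip blank lines
                Element e = doc.createElement("line");
                e.setTextContent(line.trim()); // serializer escapes & and <
                root.appendChild(e);
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(sw));
            return sw.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("XML conversion failed", ex);
        }
    }

    public static void main(String[] args) {
        // Stand-in for text returned by a PDF text extractor.
        System.out.println(toXml("Title\n\nA & B\n"));
    }
}
```

Building the output through DOM rather than string concatenation means characters such as & and < in the PDF text are escaped for you.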

David







 


Copyright 2001 XML.org. This site is hosted by OASIS