OASIS Mailing List Archives

 



Re: HTML --> XHTML --> XML Conversion (HTML Scraping)



Gul Imran asks

> Are there any other tools out there that can perform HTML scraping for XML
> data generation?  Or what logic is best for data extraction after I have
> XHTML Dom Model?
>
There will probably be a lot of responses referring you to Dave Raggett's
Tidy (see the W3C web site).  This seems to be the most widely used tool
for scrubbing HTML and producing XHTML.  Chami's HTML-Kit is a great GUI
tool that uses Tidy.
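
As for extraction logic once you have well-formed XHTML: one possible
approach is to treat it as ordinary XML and walk it with any XML parser.
The sketch below is only an illustration in Python using the standard
library's xml.etree.ElementTree. The sample markup, the function names
(scrape_rows, rows_to_xml) and the output element names (records, record,
item, price) are all made up for the example, and it assumes the cleaned-up
page declares the XHTML namespace and contains no undefined named entities.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so lookups must be qualified.
XHTML_NS = "http://www.w3.org/1999/xhtml"
NS = {"x": XHTML_NS}

# Hypothetical cleaned-up page; the table contents are invented for illustration.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Price list</title></head>
  <body>
    <table>
      <tr><th>Item</th><th>Price</th></tr>
      <tr><td>Widget</td><td>9.99</td></tr>
      <tr><td>Gadget</td><td>19.50</td></tr>
    </table>
  </body>
</html>"""


def scrape_rows(xhtml_text):
    """Return one list of cell texts per table row that has <td> cells."""
    root = ET.fromstring(xhtml_text)
    rows = []
    for tr in root.iterfind(".//x:tr", NS):
        cells = ["".join(td.itertext()).strip() for td in tr.findall("x:td", NS)]
        if cells:  # header rows hold only <th>, so they produce no cells
            rows.append(cells)
    return rows


def rows_to_xml(rows, field_names):
    """Build a plain XML tree with one <record> element per scraped row."""
    out = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(out, "record")
        for name, value in zip(field_names, row):
            ET.SubElement(rec, name).text = value
    return out


if __name__ == "__main__":
    rows = scrape_rows(SAMPLE_XHTML)
    doc = rows_to_xml(rows, ["item", "price"])
    print(ET.tostring(doc, encoding="unicode"))
```

The same idea carries over to XSLT or a DOM API in another language: find
the repeating structure (here, table rows), pull the text out of each cell,
and emit your own element names.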

Cheers,

Tom P