- From: David Brownell <david-b@pacbell.net>
- To: John Cowan <cowan@locke.ccil.org>
- Date: Mon, 07 Jun 1999 12:51:01 -0700
John Cowan wrote:
>
> David Brownell wrote:
>
> > Hmm, I have a SAX2 driver that parses XML, which I'll release this week.
>
> I suppose you mean "parses HTML".
Yes indeed ... typos abound, so much of the world has taken to
writing XML when they mean HTML! I did so below, too ... ;-)
> > It uses the Swing HTML parser, which is pretty universally available
> > though (like all HTML parsers) it's got quirks with respect to how it
> > handles faulty XML.
>
> That was my first idea, but I learned that the Swing parser doesn't
> do the amount of cleanup I want, so I decided to roll my own.
It's imperfect, but it's pretty generally available (and getting more so).
It works for much, but not all, of the broken HTML in the world. And
at a bare minimum, it's a good lead-in to more sophisticated packages!!
I know they've worked to improve its error recovery, and will do more,
though there are limits to how much broken HTML they'll accept.
> Don Park also has a SAX interface to Swing-HTML, freely available
> but closed source.
I'll have this one under an Open Source (tm) license.
- Dave
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
- References:
- Re: HAX
- From: John Cowan <cowan@locke.ccil.org>