OASIS Mailing List Archives

Re: Parser torture documents



> I'm looking for xml documents that will really stress a parser, in
> particular a SAX2-compliant parser. 

See http://xmlconf.sourceforge.net/xml/?selected=resources
for links to (a) the latest OASIS/NIST test cases, and
(b) a patch fixing about 10% of the files in the latest
official release (3/15/2001). Some errors crept into that
release; it looks like some version of ZIP was used in
"automangle" mode.

Positive tests alone aren't a very good stress test for parsers.
Instead, run the whole suite, negative cases included, through a
test harness (like the one at http://xmlconf.sourceforge.net/java/)
and see what results show up.  (Get the latest out of CVS.)
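
To make the point about negative tests concrete, here is a minimal
sketch of what such a harness checks. It is not the Java harness
linked above; it assumes Python's standard xml.sax module, and the
case names and documents in CASES are made up for illustration
rather than taken from the OASIS/NIST suite.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def is_well_formed(document: bytes) -> bool:
    """Return True if the SAX parser accepts the document without a fatal error."""
    try:
        # The default error handler raises on fatal (well-formedness) errors.
        xml.sax.parseString(document, ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False


def run_cases(cases):
    """Run (name, document, should_accept) triples; return names of failed cases.

    A case fails when the parser accepts a document it should reject,
    or rejects a document it should accept.
    """
    failures = []
    for name, document, should_accept in cases:
        if is_well_formed(document) != should_accept:
            failures.append(name)
    return failures


# Hypothetical cases: one positive test and two negative ones.
CASES = [
    ("valid-simple", b"<doc><a/></doc>", True),
    ("not-wf-mismatched-tag", b"<doc><a></doc>", False),
    ("not-wf-two-roots", b"<doc/><doc/>", False),
]

if __name__ == "__main__":
    failed = run_cases(CASES)
    print("failed cases:", failed)
```

A parser that only ever sees the first kind of case can pass while
still accepting broken input, which is why the negative cases matter.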


>  large DTDs (many elements, default attribute values)
>  many non-ascii characters
>  many entity references/PEs

There's one largish DTD there (the XML spec DTD, as applied
to some Japanese translations) with a reasonable number of
PEs, and the new IBM tests cover a lot of the non-ASCII
characters, so that (patched) test suite would be a good
place to start.
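
If you also want a synthetic document to go alongside the suite,
something like the following could generate one. This is only a
sketch under my own assumptions (Python's xml.sax again; the names
build_stress_document, Collector and parse_stress, and the entity
count, are invented here), and it covers internal general entities,
one defaulted attribute and non-ASCII text, not parameter entities.

```python
import xml.sax
from xml.sax.handler import ContentHandler

SAMPLE = "\u65e5\u672c\u8a9e"  # non-ASCII replacement text ("Japanese" in Japanese)


def build_stress_document(n_entities: int = 200) -> bytes:
    """Build a UTF-8 document whose internal DTD subset declares many entities."""
    decls = ['<!ELEMENT doc (#PCDATA)>', '<!ATTLIST doc lang CDATA "ja">']
    for i in range(n_entities):
        decls.append('<!ENTITY e%d "%s%d ">' % (i, SAMPLE, i))
    # The body references every declared entity exactly once.
    refs = "".join("&e%d;" % i for i in range(n_entities))
    text = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE doc [\n" + "\n".join(decls) + "\n]>\n"
        "<doc>" + refs + "</doc>"
    )
    return text.encode("utf-8")


class Collector(ContentHandler):
    """Record character data and the attributes reported for <doc>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.attrs = {}

    def startElement(self, name, attrs):
        self.attrs = {key: attrs.getValue(key) for key in attrs.getNames()}

    def characters(self, content):
        self.chunks.append(content)


def parse_stress(n_entities: int = 200):
    """Parse the generated document; return (expanded text, attributes)."""
    handler = Collector()
    xml.sax.parseString(build_stress_document(n_entities), handler)
    return "".join(handler.chunks), handler.attrs
```

Comparing the returned text against the expected expansion, and
checking that the defaulted attribute shows up, gives a quick sanity
check before scaling n_entities up.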

- Dave