I'm happy to announce a second release of the configurable Gorille
XML/Unicode character tester. Like the earlier release, it uses XML-based
configuration files to specify which characters should be permitted in
particular XML contexts. This version adds support for both namespaces and
public identifiers, as well as an experimental SAX filter.
Gorille is a small Java package designed to let developers of various kinds
of XML processors test the content and names of XML structures in their XML
documents. While Gorille ships with test files for both XML 1.0 and the
draft XML 1.1, you can create your own configuration files as well.
Gorille uses an XML format to specify lists of characters according to
either XML 1.0 conventions (with its BaseChar, Ideographic, CombiningChar,
Digit, and Extender productions) or XML 1.1 conventions (NameStartChar,
NameChar). Both forms permit specification of the Char and S productions for
content characters and whitespace. I've included sample lists for both XML
1.0 and XML 1.1, as well as an ASCII-only version of XML 1.0.
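To make the productions above concrete, here is a minimal Java sketch of the kind of test they describe. It is not Gorille's code or API: the class and method names are hypothetical, and the ranges are hard-coded rather than read from a configuration file. The Char and S ranges follow XML 1.0; the NameStartChar and NameChar ranges follow the XML 1.1 Recommendation as published, which may differ from the draft.

```java
public final class XmlCharClasses {
    private XmlCharClasses() {}

    // XML 1.0 Char production: tab, LF, CR, and the non-surrogate ranges.
    public static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // S production: the four whitespace characters.
    public static boolean isSpace(int cp) {
        return cp == 0x20 || cp == 0x9 || cp == 0xD || cp == 0xA;
    }

    // XML 1.1 NameStartChar ranges as inclusive {low, high} pairs.
    private static final int[][] NAME_START_11 = {
        {':', ':'}, {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF},
        {0x370, 0x37D}, {0x37F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    public static boolean isXml11NameStartChar(int cp) {
        for (int[] range : NAME_START_11) {
            if (cp >= range[0] && cp <= range[1]) {
                return true;
            }
        }
        return false;
    }

    // XML 1.1 NameChar: NameStartChar plus '-', '.', digits, and a few extras.
    public static boolean isXml11NameChar(int cp) {
        return isXml11NameStartChar(cp)
            || cp == '-' || cp == '.'
            || (cp >= '0' && cp <= '9')
            || cp == 0xB7
            || (cp >= 0x300 && cp <= 0x36F)
            || (cp >= 0x203F && cp <= 0x2040);
    }

    // A name is one NameStartChar followed by zero or more NameChars.
    // Iterates by code point so supplementary characters are handled.
    public static boolean isXml11Name(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            boolean ok = (i == 0) ? isXml11NameStartChar(cp) : isXml11NameChar(cp);
            if (!ok) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

A configurable tester would load these ranges from a file instead of compiling them in, which is what makes an ASCII-only variant or a draft-specific variant a matter of swapping lists.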
Gorille is now hosted at SourceForge, complete with mailing lists and CVS:
I would especially like to hear from developers who can give Gorille more
thorough testing on a wider range of Unicode than I have been able to do so
far. The SAX filter in particular needs some tire-kicking, as most SAX
parsers already perform the XML 1.0 version of Gorille's tests, making it
difficult to get large numbers of faulty events into Gorille.
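One way around a parser that refuses to emit faulty events is to call the filter's handler methods directly. The Java sketch below illustrates the idea with a hypothetical filter built on the standard org.xml.sax.helpers.XMLFilterImpl. It is not Gorille's SAX filter: the class name CharCheckFilter is an assumption, and it checks only the XML 1.0 Char production on character data.

```java
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: rejects character data outside the XML 1.0 Char production.
public class CharCheckFilter extends XMLFilterImpl {

    public CharCheckFilter() {
        super();
    }

    public CharCheckFilter(XMLReader parent) {
        super(parent);
    }

    // XML 1.0 Char production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        int end = start + length;
        int i = start;
        while (i < end) {
            // Read by code point so surrogate pairs count as one character.
            int cp = Character.codePointAt(ch, i, end);
            if (!isXml10Char(cp)) {
                throw new SAXException(
                    "Illegal character U+" + Integer.toHexString(cp).toUpperCase());
            }
            i += Character.charCount(cp);
        }
        // Pass clean text on to the downstream handler, if one is set.
        super.characters(ch, start, length);
    }
}
```

Because the filter is an ordinary ContentHandler, a test can construct it without a parser and feed it a hand-built char array containing, say, U+0001, then check that a SAXException is thrown.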
Despite my interests in Unicode and character encoding issues in XML, I
still live in a largely ASCII universe, and no doubt some subtleties have
escaped me.
Contributions, bug reports, and general comments on the usefulness or lack
thereof of this tool are all quite welcome.
Associate Editor, O'Reilly & Associates