
   re: [xml-dev] SAX characters event and external entities


K. Ari Krupnikov writes:

 > How much of a "violation" would it be to have a caching XMLFilter that
 > would report all contiguous character data in a single event,
 > including across entity boundaries?

It would not be a violation.  What you lose is accurate location
information, and readers are not required to provide a locator at all:

  SAX parsers are strongly encouraged (though not absolutely required)
  to supply a locator: if it does so, it must supply the locator to
  the application by invoking this method before invoking any of the
  other methods in the ContentHandler interface.

Application-specific error reporting would be pretty sucky, but that
might not matter in many cases.

If you did this, though, I'd suggest still putting in a hard-coded
limit.  In fact, as XML gets used in more security-sensitive
environments, we may need to consider putting (very high) limits on
everything to avoid various attacks.
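
As a rough illustration of the filter discussed above, here is a
minimal sketch in Java built on org.xml.sax.helpers.XMLFilterImpl.
The class name CoalescingFilter, the constant DEFAULT_MAX_CHARS and
its value are hypothetical and not part of SAX.  The sketch also
assumes that "limit" means flushing the buffer once it is full; an
implementation could just as well throw a SAXException at that point.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: merges contiguous characters() events into one,
// but never buffers more than maxChars characters at a time.
class CoalescingFilter extends XMLFilterImpl {
    // Assumed hard-coded ceiling (1M chars); pick whatever suits you.
    static final int DEFAULT_MAX_CHARS = 1 << 20;

    private final int maxChars;
    private final StringBuilder buffer = new StringBuilder();

    CoalescingFilter() { this(DEFAULT_MAX_CHARS); }

    CoalescingFilter(int maxChars) {
        if (maxChars <= 0) throw new IllegalArgumentException("maxChars must be > 0");
        this.maxChars = maxChars;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        int offset = start;
        int remaining = length;
        while (remaining > 0) {
            // The buffer is always below the limit here, so room > 0.
            int room = maxChars - buffer.length();
            int n = Math.min(room, remaining);
            buffer.append(ch, offset, n);
            offset += n;
            remaining -= n;
            if (buffer.length() >= maxChars) flush(); // limit reached
        }
    }

    // Pass any buffered text downstream as a single characters() event.
    private void flush() throws SAXException {
        if (buffer.length() == 0) return;
        char[] chars = new char[buffer.length()];
        buffer.getChars(0, chars.length, chars, 0);
        buffer.setLength(0);
        super.characters(chars, 0, chars.length);
    }

    // Every other content event ends the current run of character data.
    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        flush();
        super.startPrefixMapping(prefix, uri);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        flush();
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        flush();
        super.endElement(uri, localName, qName);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length)
            throws SAXException {
        flush();
        super.ignorableWhitespace(ch, start, length);
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        flush();
        super.processingInstruction(target, data);
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        flush();
        super.skippedEntity(name);
    }

    @Override
    public void endDocument() throws SAXException {
        flush();
        super.endDocument();
    }
}
```

Entity boundaries are not ContentHandler events, so character data on
either side of one ends up in the same buffer.  Any locator passed
through will point at wherever the parser was when the flush
happened, not at the start of the text.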

SGML gave limits a bad name because they were so ridiculously low by
default (eight-character names spring to mind), and SGML declarations
were a nightmare to manage in any real-world processing and
interchange situation.  On the other hand, high fixed limits, like
(say) 16K characters for element and attribute names, might help us
avoid some problems in the future.
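
To make the idea of a high fixed limit concrete, here is a small
sketch of a filter that rejects over-long names.  The class name
NameLimitFilter and the constant MAX_NAME_CHARS are hypothetical; the
16K figure is just the example value mentioned above.  Note that a
filter only sees a name after the parser has already built it, so
this protects downstream code only; a real defence would have to
live in the parser itself.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: refuses element and attribute names longer
// than a fixed, deliberately generous limit.
class NameLimitFilter extends XMLFilterImpl {
    // Example value taken from the discussion: 16K characters.
    static final int MAX_NAME_CHARS = 16 * 1024;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        checkName("element", localName);
        checkName("element", qName);
        for (int i = 0; i < atts.getLength(); i++) {
            checkName("attribute", atts.getLocalName(i));
            checkName("attribute", atts.getQName(i));
        }
        super.startElement(uri, localName, qName, atts);
    }

    // Throw if the name exceeds the limit; null names are ignored.
    private static void checkName(String kind, String name) throws SAXException {
        if (name != null && name.length() > MAX_NAME_CHARS) {
            throw new SAXException(kind + " name exceeds "
                    + MAX_NAME_CHARS + " characters");
        }
    }
}
```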

All the best,


David Megginson, david@megginson.com, http://www.megginson.com/



Copyright 2001 XML.org. This site is hosted by OASIS