I'm working on an implementation of XInclude for Xerces2-J. I've come
across several limitations of stream-based APIs in implementing XInclude:
(1) When attributes with references to unparsed entities or notations are
encountered in included documents, these unparsed entities and notations
must be added to the [unparsed entities] and [notations] properties of the
document information item, as defined in the XML Infoset specification.
In SAX, this means sending DTD events. SAX specifies that no DTD events can
be sent after the endDTD event or, failing that, the first startElement
event. However, there is
no way of knowing which unparsed entities and notations must be sent until
after element events start being processed. Thus, we would be unable to
properly update the properties of the document information item.
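
To make the ordering constraint in (1) concrete, here is a minimal sketch that records the order in which a SAX parser reports declaration and element events. It assumes the JAXP default parser that ships with the JDK; the class name DtdEventOrder, the sample document and the example.org URIs are made up for illustration and are not part of the XInclude implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class DtdEventOrder {
    // Hypothetical document: one notation, one unparsed entity, and an
    // attribute of type ENTITY that refers to the unparsed entity.
    static final String DOC =
          "<!DOCTYPE doc [\n"
        + "  <!NOTATION gif SYSTEM \"http://example.org/gif\">\n"
        + "  <!ENTITY logo SYSTEM \"http://example.org/logo.gif\" NDATA gif>\n"
        + "  <!ELEMENT doc EMPTY>\n"
        + "  <!ATTLIST doc pic ENTITY #IMPLIED>\n"
        + "]>\n"
        + "<doc pic=\"logo\"/>";

    // Parses the document and returns the events in the order they arrived.
    static List<String> recordEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                events.add("notationDecl:" + name);
            }
            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                events.add("unparsedEntityDecl:" + name);
            }
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("startElement:" + qName);
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setDTDHandler(handler);      // receives notation / unparsed entity events
        reader.setContentHandler(handler);  // receives element events
        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(recordEvents(DOC));
    }
}
```

Both declaration events arrive before the first startElement. An XInclude processor, on the other hand, only learns that an included document uses an unparsed entity once it is already inside the element stream of the including document, which is too late to emit the same events.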
(2) The XInclude spec allows document fragments to be included using
XPointer paths. These create lots of problems with stream-based
processing. For instance, an XML document could include a fragment of
itself which has already been processed, and unless the document stream can
be re-opened and reparsed, that information is not available.
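
As a rough illustration of what "re-opened and reparsed" means in (2), the sketch below runs a second SAX pass over a document whose bytes are still available. It is only a stand-in: it matches an attribute literally named "id" rather than evaluating a real XPointer expression, and the class and method names (ReparseForFragment, textOfElementWithId) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ReparseForFragment {
    // Second pass over a re-openable source: collects the character data
    // inside the element whose "id" attribute equals targetId.
    // Assumption: the whole document is held in memory as a byte array.
    static String textOfElementWithId(byte[] doc, String targetId) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            int depth = 0; // greater than 0 while inside the target element

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (depth > 0) {
                    depth++;
                } else if (targetId.equals(atts.getValue("id"))) {
                    depth = 1;
                }
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                if (depth > 0) {
                    depth--;
                }
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                if (depth > 0) {
                    text.append(ch, start, length);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new ByteArrayInputStream(doc)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] doc = "<doc><intro id=\"a\">Hello <b>world</b></intro><tail/></doc>"
            .getBytes(StandardCharsets.UTF_8);
        System.out.println(textOfElementWithId(doc, "a"));
    }
}
```

This only works because the byte array can be read again from the start. A parser that was handed a one-shot stream has no equivalent, which is the limitation described above.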
At the moment, it is possible to implement XInclude in SAX, as long as
unparsed entities, notations and XPointer fragments are not used. If the
restrictions on "out of order" DTD events are relaxed in SAX, (1) could be
solved. Otherwise, the only way to solve (1) and (2) in a stream-based API
is to buffer all of the events in all documents, so that operations could
be done on the infoset as a whole. I think we can agree that this is not a
pleasant solution, and defeats the purpose of a streaming API.
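
For reference, the buffering workaround could look roughly like the sketch below: content events are recorded instead of being forwarded, so that declarations discovered late can still be sent first. The class BufferingFilter and its methods addLateUnparsedEntity and flush are hypothetical names, and startDocument, endDocument, notations and the other callbacks are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class BufferingFilter extends DefaultHandler {
    // One recorded content event that can be replayed later.
    interface Event {
        void replay(ContentHandler out) throws SAXException;
    }

    private final List<Event> content = new ArrayList<>();
    private final List<String[]> lateEntities = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Parsers may reuse the Attributes object, so keep a private copy.
        Attributes copy = new AttributesImpl(atts);
        content.add(out -> out.startElement(uri, localName, qName, copy));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        content.add(out -> out.endElement(uri, localName, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The character buffer is also reused by the parser, so copy it.
        char[] copy = Arrays.copyOfRange(ch, start, start + length);
        content.add(out -> out.characters(copy, 0, copy.length));
    }

    // Called when an included document turns out to need an unparsed entity.
    public void addLateUnparsedEntity(String name, String publicId,
                                      String systemId, String notationName) {
        lateEntities.add(new String[] {name, publicId, systemId, notationName});
    }

    // Sends the late declarations first, then replays the buffered content.
    public void flush(DTDHandler dtdOut, ContentHandler out) throws SAXException {
        for (String[] e : lateEntities) {
            dtdOut.unparsedEntityDecl(e[0], e[1], e[2], e[3]);
        }
        for (Event event : content) {
            event.replay(out);
        }
    }

    public static void main(String[] args) throws SAXException {
        BufferingFilter filter = new BufferingFilter();
        filter.startElement("", "doc", "doc", new AttributesImpl());
        filter.endElement("", "doc", "doc");
        filter.addLateUnparsedEntity("logo", null, "http://example.org/logo.gif", "gif");
        DefaultHandler sink = new DefaultHandler(); // ignores everything it receives
        filter.flush(sink, sink);
    }
}
```

Nothing reaches the downstream handler until flush is called, so memory use grows with the size of the combined documents.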
My team and I thought it would be best to bring this up with the SAX
community, so that the issue of XInclude support with SAX is made public.
Perhaps the restriction on DTD events in the SAX API could be adjusted to
account for the possibility of unparsed entity and notation events occurring
after the end of the DTD events?
How much demand is there for XInclude in the SAX community?
XML Parser Team
Toronto IBM Lab