[xml-dev] Re: XInclude vs SAX vs validation

When merging infoset streams with DTD items, there are
a few additional items that come up (at least with SAX).

- #IMPLIED attribute values.  The infoset expects
  them to be reported, but SAX does that implicitly
  by the same DeclHandler.attributeDecl() call that
  couldn't be issued early enough to flag attributes
  in included text that are of type NOTATION,
  ENTITY, or ENTITIES (as you implied).

- As with attributes of type NOTATION, there's the
  [notation] property on PI info items; streaming
  can't merge DTDHandler.notationDecl() in a
  legal position.

- And let's not forget two properties on the
  "Unexpanded Entity Reference" info item:
  [system identifier] and [public identifier].
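
To make the ordering constraint concrete, here is a minimal sketch (not from the original message) that logs the order in which a SAX2 parser reports DTDHandler events relative to the first startElement(). The class name DtdEventOrder, the sample document, and the example.org identifiers are all made up for illustration; it assumes the JDK's bundled JAXP parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records DTDHandler and ContentHandler callbacks in arrival order.
public class DtdEventOrder {
    // Made-up sample document; the system identifiers are never fetched
    // because the entity is unparsed and notations are only names.
    static final String DOC =
        "<!DOCTYPE doc [\n"
      + "  <!NOTATION gif SYSTEM 'http://example.org/gif'>\n"
      + "  <!ENTITY logo SYSTEM 'http://example.org/logo.gif' NDATA gif>\n"
      + "  <!ELEMENT doc EMPTY>\n"
      + "  <!ATTLIST doc img ENTITY #IMPLIED>\n"
      + "]>\n"
      + "<doc img='logo'/>";

    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                log.add("notationDecl:" + name);
            }
            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                log.add("unparsedEntityDecl:" + name);
            }
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("startElement:" + qName);
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setDTDHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Declarations should appear in the log before the root element starts.
        System.out.println(events(DOC));
    }
}
```

A streaming includer sees declarations from an included document only after it has already passed this point for the including document, which is why there is no legal position left to report them.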

The basic problem seems to be that DTD info
doesn't merge as neatly as the other stuff, since
it's got to go first.  And it could also be in flat
conflict with what's already been declared;
it's the "flat conflict" that's a real issue (say, with
#FIXED values for xmlns attributes).
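
As a rough illustration of the "flat conflict" case, the sketch below shows one possible merge policy for notation declarations: an identical redeclaration is treated as a harmless duplicate, while a same-named declaration with different identifiers is rejected. The NotationMerge and Notation classes are hypothetical names invented here, not part of SAX or any XInclude implementation, and the policy itself is an assumption rather than something stated in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical merge table for notation declarations gathered from several documents.
public class NotationMerge {
    /** Hypothetical holder for one notation declaration. */
    static final class Notation {
        final String name;
        final String publicId;  // may be null
        final String systemId;  // may be null

        Notation(String name, String publicId, String systemId) {
            this.name = name;
            this.publicId = publicId;
            this.systemId = systemId;
        }

        boolean sameAs(Notation other) {
            return name.equals(other.name)
                && Objects.equals(publicId, other.publicId)
                && Objects.equals(systemId, other.systemId);
        }
    }

    // Insertion order is kept so declarations can later be emitted in the order seen.
    private final Map<String, Notation> declared = new LinkedHashMap<>();

    /**
     * Returns true if the declaration was new, false if it was an identical
     * duplicate, and throws if it conflicts with an earlier declaration.
     */
    boolean merge(Notation n) {
        Notation prior = declared.get(n.name);
        if (prior == null) {
            declared.put(n.name, n);
            return true;
        }
        if (prior.sameAs(n)) {
            return false;
        }
        throw new IllegalStateException("conflicting declaration for notation " + n.name);
    }

    int size() {
        return declared.size();
    }
}
```

Even with such a table, the conflict is only detected after the earlier declaration has been reported downstream, so a pure streaming pipeline cannot take it back.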

I think what you meant to say about DOM L2 is
that it doesn't support attribute typing information.
It certainly includes notations and entities, but not
in a way that the functionality could've been useful.

- Dave

----- Original Message -----
From: "Elliotte Rusty Harold" <elharo@metalab.unc.edu>
To: <xml-dev@lists.xml.org>
Cc: "David Brownell" <david-b@pacbell.net>
Sent: Saturday, September 01, 2001 8:12 AM
Subject: Re: XInclude vs SAX vs validation

> At 9:29 AM -0700 8/21/01, David Brownell wrote:
> >It's interesting that XInclude is specified as "infoset merging", which
> >is a model that's very much attuned to SAX processing.  If only it
> >didn't use XPointer/XPath, thereby precluding pure stream-based
> >processing models!
> >
> I found another place where stream-based processing is a problem. The
> result infoset has to include the unparsed entities and notation
> information items from the included document. These would need to be
> included in a DTD referenced from the DOCTYPE declaration. This is
> naturally emitted at the start of processing. However, you don't know
> the complete list of these things until processing is finished. I
> suppose you could put them in a separate document loaded as an external
> DTD subset, and not emit this DTD fragment until processing is finished.
> However, then the emitted document could not be parsed by another
> parser until inclusion was complete.
>
> Interestingly, this is even more of a problem for DOM than SAX. DOM2
> does not identify notations or unparsed entities. DOM Level 3 Core
> doesn't either, although you might be able to hack it together using
> Abstract Schemas.
> --
> +-----------------------+------------------------+-------------------+
> | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
> +-----------------------+------------------------+-------------------+
> |          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
> |              http://www.ibiblio.org/xml/books/bible2/              |
> |   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
> +----------------------------------+---------------------------------+
> |  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
> |  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
> +----------------------------------+---------------------------------+