   Re: SAX2: relative ordering of startDocument() & startDTD() events?

  • From: Ken MacLeod <ken@bitsko.slc.ut.us>
  • To: XML-DEV <xml-dev@xml.org>
  • Date: 29 Feb 2000 13:32:57 -0600

David Brownell <david-b@pacbell.net> writes:

> Ken MacLeod wrote:
> > There's some issues with the processing model though.  There would
> > probably need to be a mode feature where either entityReference()
> > is called or entities are resolved.  More difficult, if a handler
> > wanted to know _both_ the entity reference and the resolved
> > content (which is probably where {start, end}Entity() is
> > proposed).
> Yes, I was thinking that one could go that route.  Basically you'd
> get to control on an entity-by-entity basis whether to expand any
> given entity ref ... which conflicts a bit with the simple true/false
> model of "handles external {parameter,general} entities" flag.

> So while I could see how to make that approach work (callback is a
> predicate saying include/exclude the specified entity), I'm not sure
> it's a good idea to even start walking in that direction, without
> some compelling reason to do so.
> Seemed like you might have some particular use cases to motivate
> such a thing though.

The two cases I can think of are the XML editor (or pass-through
processor) and "ordinary users".  The XML editor wants unexpanded
entities, maybe expanding them later, and the ordinary user doesn't
want to even know about entities.

Expat uses an interesting technique: it doesn't expand external
entities at all by itself.  The Expat user can specify a handler to
receive external entity references and then, if it wants the external
entity to be included in the parse, it resolves the entity and creates
a sub-parser to generate more events from it.
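
As a rough illustration of that model, here is a minimal sketch using
Python's xml.parsers.expat binding (not the Perl binding discussed in
this thread).  The function name parse_events, the entity_store
dictionary standing in for a real resolver, and the expand flag are
all hypothetical names chosen for the example; only the Expat calls
themselves (ParserCreate, ExternalEntityRefHandler,
ExternalEntityParserCreate, Parse) come from the library.

```python
import xml.parsers.expat


def parse_events(xml_text, entity_store, expand=True):
    """Parse xml_text and return a list of (kind, value) events.

    entity_store is a hypothetical stand-in for a resolver: it maps a
    system identifier to the replacement text of that external entity.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        events.append(("start", name))

    def end(name):
        events.append(("end", name))

    def chars(data):
        events.append(("chars", data))

    def ext_ref(context, base, system_id, public_id):
        # Expat only reports the reference; it does not fetch anything.
        events.append(("ext-ref", system_id))
        if expand:
            # The application resolves the entity and asks for a
            # sub-parser, which inherits the handlers set above.
            # (Nested external entities are not handled in this sketch.)
            sub = parser.ExternalEntityParserCreate(context)
            sub.Parse(entity_store[system_id], True)
        return 1  # nonzero tells Expat to carry on parsing

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.ExternalEntityRefHandler = ext_ref
    parser.Parse(xml_text, True)
    return events


DOC = '<!DOCTYPE doc [<!ENTITY chap SYSTEM "chap1.xml">]><doc>a&chap;b</doc>'
STORE = {"chap1.xml": "<p>hi</p>"}

expanded = parse_events(DOC, STORE, expand=True)
passthrough = parse_events(DOC, STORE, expand=False)
```

With expand=False the handler records a single reference event and
the entity's content never appears, which is roughly what a
pass-through filter wants.  Leaving ExternalEntityRefHandler unset
altogether corresponds to the user who doesn't care about entities.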

I often find myself on both sides.  As a user, I don't care about
entities (I'll not define the handler).  When writing a pass-through
filter I'd like a single entity-reference event.  Now that I've
thought about it this way, I like the Expat model.

Note that internal entity references in attributes are not addressed
at all in the core SAX API (not that they should be, either).  They're
difficult in Expat as well: you have to get the "original string" for
the start tag and reparse it yourself to get the references.  I've
thought about that issue, and the only two ways I can think of to
support it are either to have the parser gather the attributes (as it
does now) and pass an attribute value as mixed content (character data
and entity reference nodes), or to have {start/end}Attribute events
surrounding character and entity reference events.  The former moves a
lot of DOM-like knowledge into the parser and the latter is just plain
tedious.

The Perl SAX binding already uses DOM-like event passing, so I expect
that for my own use I'll have a parser option to pass mixed-content
attributes (in addition to the normalized attribute string).
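
To make the "mixed-content attribute" idea concrete, here is a small
sketch in Python of what such a value might look like once the raw
attribute text has been reparsed.  It is not the Perl SAX binding's
actual representation; the function name split_attribute_value and
the ("chars", ...) / ("entity-ref", ...) tuples are assumptions made
up for the example, and the name pattern is a simplification of the
XML Name production.

```python
import re

# Simplified pattern for a named entity reference such as &copy;.
# Character references like &#169; are deliberately left in the text.
_ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.\-:]*);")


def split_attribute_value(raw):
    """Split a raw (unexpanded) attribute value into mixed content.

    Returns a list of ("chars", text) and ("entity-ref", name) tuples
    in document order.
    """
    parts = []
    pos = 0
    for match in _ENTITY_REF.finditer(raw):
        if match.start() > pos:
            parts.append(("chars", raw[pos:match.start()]))
        parts.append(("entity-ref", match.group(1)))
        pos = match.end()
    if pos < len(raw):
        parts.append(("chars", raw[pos:]))
    return parts


mixed = split_attribute_value("Copyright &copy; 2000 &owner;")
```

A handler that only wants the normalized string can ignore this list;
a pass-through filter can walk it and write each reference back out
unexpanded.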

  -- Ken
