Mike Champion wrote:
> I'm sure there's a few thousand people out there who would be happy users
> of pure syntax XML tools such as SAX, [...]
Waitaminnit -- since when is SAX "pure syntax"?
SAX is the quintessential implementation of the XML Infoset!
All non-Infoset syntactic features of the source document
are stripped out; the application only sees (a representation
of) the abstract information items.
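
To illustrate the point, here is a minimal sketch using Python's standard xml.sax module as a stand-in for any SAX parser. The names EventRecorder and sax_events are made up for this example, and the two sample documents are invented. They differ in quoting style, attribute order, whitespace inside tags, empty-element syntax, and how the ampersand is escaped, yet the handler should see the same sequence of events for both.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical helper: records the events a SAX application receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attribute order is not significant, so sort for comparison.
        self.events.append(("start", name, sorted(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may deliver text in several chunks; merge adjacent ones.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))


def sax_events(document: bytes):
    """Parse a document and return the list of recorded events."""
    recorder = EventRecorder()
    xml.sax.parseString(document, recorder)
    return recorder.events


# Same information content, different concrete syntax.
doc_a = b'<doc a="1" b="2"><empty/>A&amp;B</doc>'
doc_b = b"<doc  b='2'\n a='1' ><empty></empty>A&#38;B</doc >"

print(sax_events(doc_a) == sax_events(doc_b))
```

Nothing in the recorded events reveals which quote character was used, whether an element was written as an empty-element tag, or how a character was escaped in the source.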
> [...] I wish people would just acknowledge that the XML syntax and
> Infoset(s) were joined at birth (every well-formed XML document can be
> parsed into a tree).
Or in other words: there is an abstract syntax behind
the concrete grammar productions in the XML 1.0 REC.
> Then maybe we could do what has to be done to make the
> actual Infoset spec more useful (e.g., by making the language less awkward,
> such as "element" rather than "element information item" [gag]),
The language may be awkward, but the "information item"
qualifier is absolutely necessary in order to distinguish
an "element" (the sequence of characters from the opening '<'
in the start-tag to the closing '>' in the end-tag) from
an "element" (the thing with a name, set of attributes,
and list of children).
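
The two senses can be put side by side in a short sketch. It assumes Python's xml.etree.ElementTree as a rough approximation of an information item (it is not a literal rendering of the Infoset REC), and the sample document is invented for the example.

```python
import xml.etree.ElementTree as ET

source = '<note lang="en"><to>Mike</to><body>Hello</body></note>'

# Sense 1: the element as a sequence of characters, from the opening '<'
# of the start-tag to the closing '>' of the end-tag.
element_as_text = source[source.index("<note"): source.rindex(">") + 1]

# Sense 2: the element as a thing with a name, a set of attributes,
# and a list of children.
item = ET.fromstring(source)
name = item.tag
attributes = dict(item.attrib)
children = [child.tag for child in item]

print(len(element_as_text), name, attributes, children)
```

The first value is just a string with a length; the second has structure that can be queried without looking at any markup characters.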
> and making
> it as formally rigorous as the syntax spec (somebody said that this could
> be done with ASN.1, but I don't know that).
IMO, the level of rigor in the XML Infoset REC is just
about right. It lists the essential information content
of a parsed document, without constraining the way that
information is represented. ASN.1 would be a gross