Simon St.Laurent wrote:
> I'm thoroughly disappointed that this week's seemingly positive
> developments have led us back to this line of junk. Maybe it would be
> better for ASN.1 to stay in its corner - I'd hoped it might take some
> burdens off of XML, but this latest conversation seems bent on driving
> that burden deeper into XML processing.
Aw, no... that's nobody's intention.
For a start, there's no shared intent between me and the original
poster; we've not sat down together in a room and said "Right, how are
we going to corrupt XML? I know, let's sneak in an optional SAX
extension! You propose it and I second it, OK?" :-)
I don't know who Bob Wyman is or what his motives are, but my motive in
picking up the idea of a typed SAX was just "Hmmm, this might be useful;
I wonder how you could make it backwards compatible", etc.
But note that the people implementing SAX for ASN.1 are, as far as I
know, implementing pure and simple SAX as you all know and love it. To
the best of my knowledge, their software will convert every integer,
date, boolean, or whatnot to nice Java strings and pass them to the
application. And why do I think this? Because one reason they're
implementing SAX interfaces is so that they can plug them into existing
software like XSLT engines and thus use XSLT to convert BER or PER
information into nice pretty HTML, or XML with a different structure to
the original information, etc.
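As a rough sketch of that plumbing (not anyone's actual ASN.1 code): the
RecordEventSource class below is a made-up stand-in for a BER/PER decoder. It
holds native values but reports them as ordinary SAX character data, so a
standard JAXP Transformer can consume it via SAXSource. All class and element
names are invented for illustration, and an identity transform stands in for a
real stylesheet.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical stand-in for a binary decoder: it knows the native types
// of its values but reports them through plain SAX as character data.
class RecordEventSource extends XMLFilterImpl {
    private final int id;
    private final boolean active;

    RecordEventSource(int id, boolean active) {
        this.id = id;
        this.active = active;
    }

    // The InputSource is ignored: events are synthesised, not parsed from text.
    @Override
    public void parse(InputSource ignored) throws SAXException {
        ContentHandler out = getContentHandler();
        out.startDocument();
        out.startElement("", "record", "record", new AttributesImpl());
        leaf(out, "id", Integer.toString(id));
        leaf(out, "active", Boolean.toString(active));
        out.endElement("", "record", "record");
        out.endDocument();
    }

    private static void leaf(ContentHandler out, String name, String text) throws SAXException {
        out.startElement("", name, name, new AttributesImpl());
        out.characters(text.toCharArray(), 0, text.length());
        out.endElement("", name, name);
    }
}

class SaxPipelineDemo {
    static String render(int id, boolean active) {
        try {
            // newTransformer() without a stylesheet is an identity transform; a real
            // pipeline would pass a Source for an XSLT stylesheet here instead.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            t.transform(new SAXSource(new RecordEventSource(id, active), new InputSource()),
                        new StreamResult(w));
            return w.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(42, true));
    }
}
```

The point is only that the consumer never needs to know the events did not
come from angle brackets; everything arrives as strings.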
You may be right that having it as an optional SAX feature is not the
way to do this; I've given my reasoning in another post (that an
interface for push parsers that produced typed values for some or all
elements/attributes would look a lot like SAX with just a couple of
methods modified, which seems a duplication of effort).
But if doing it within the existing SAX framework isn't the way to go,
then fine - it could be done separately. And then you'd write a wrapper
class that took a SAX ContentHandler, and implemented the
TypedContentHandler interface, but told the parser not to provide native
values and then just passed the unparsed strings straight through to the
ContentHandler. And you could write a wrapper class that takes a
TypedContentHandler and implements ContentHandler, that just tells the
TypedContentHandler that it doesn't know the types of anything, so here's
the raw string.
Then you could also write a wrapper like that last one, but that takes a
schema as a parameter, and uses the schema to assign types to elements
and attributes, parse them, and pass them as native values. It would
also validate against the schema as it went.
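To make the wrappers concrete, here is a minimal sketch. TypedContentHandler
is a hypothetical interface (SAX defines nothing of the sort), and its shape
here, a single value(Object) callback for element content, is just one
assumption about how it might look. Two simplifications: the typed-to-plain
wrapper turns values back into strings itself rather than asking the source
not to produce them, and a map from element names to parsing functions stands
in for a real schema. With an empty map the second wrapper is the "I don't
know the types" case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical typed counterpart of ContentHandler; not part of SAX.
interface TypedContentHandler {
    void startElement(String uri, String local, String qName, Attributes atts) throws SAXException;
    void endElement(String uri, String local, String qName) throws SAXException;
    // Element content as a native value; a plain String means "type unknown".
    void value(Object value) throws SAXException;
}

// Lets a typed event source drive an ordinary ContentHandler.
class TypedToPlainAdapter implements TypedContentHandler {
    private final ContentHandler target;
    TypedToPlainAdapter(ContentHandler target) { this.target = target; }

    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        target.startElement(uri, local, qName, atts);
    }
    public void endElement(String uri, String local, String qName) throws SAXException {
        target.endElement(uri, local, qName);
    }
    public void value(Object value) throws SAXException {
        char[] text = String.valueOf(value).toCharArray();
        target.characters(text, 0, text.length);
    }
}

// Lets an ordinary SAX source drive a TypedContentHandler. The type map is a
// crude stand-in for a schema; elements without an entry are passed as Strings.
class PlainToTypedAdapter extends DefaultHandler {
    private final TypedContentHandler target;
    private final Map<String, Function<String, Object>> types;
    private final StringBuilder text = new StringBuilder();

    PlainToTypedAdapter(TypedContentHandler target, Map<String, Function<String, Object>> types) {
        this.target = target;
        this.types = types;
    }
    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        text.setLength(0); // simplification: only leaf-element text is kept
        target.startElement(uri, local, qName, atts);
    }
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // SAX may deliver text in several chunks
    }
    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (text.length() > 0) {
            Object value;
            try {
                value = types.getOrDefault(qName, s -> s).apply(text.toString());
            } catch (RuntimeException e) {
                // A value that fails to parse is reported as a (very basic) validation error.
                throw new SAXException("Invalid value for <" + qName + ">", e);
            }
            text.setLength(0);
            target.value(value);
        }
        target.endElement(uri, local, qName);
    }
}

class TypedAdapterDemo {
    // Feeds <id>42</id> through the adapter, with or without a type rule for "id".
    static Object parseId(boolean useTypeMap) {
        Map<String, Function<String, Object>> types = new HashMap<>();
        if (useTypeMap) types.put("id", Integer::valueOf);
        List<Object> seen = new ArrayList<>();
        TypedContentHandler sink = new TypedContentHandler() {
            public void startElement(String uri, String local, String qName, Attributes atts) {}
            public void endElement(String uri, String local, String qName) {}
            public void value(Object value) { seen.add(value); }
        };
        PlainToTypedAdapter adapter = new PlainToTypedAdapter(sink, types);
        try {
            adapter.startElement("", "id", "id", new AttributesImpl());
            adapter.characters("42".toCharArray(), 0, 2);
            adapter.endElement("", "id", "id");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return seen.get(0);
    }
}
```

Called with no type rule, parseId hands the sink the String it was given;
with the rule for "id", the sink receives an Integer instead.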
Although the idea started out from a discussion about SAX interfacing to
ASN.1, a typed push interface would be useful whenever the event source
happened to know the types of values it was producing:
1) Synthetic event sources, such as database interfaces that present a
SAX interface so the result of a query can be plumbed directly into an
XSLT engine to convert to HTML.
2) Parsers for XML with type information (be it via reference to a
schema embedded in the XML, xsi:type attributes, or a hardcoded
instruction by the application developer: "I expect things that conform
to this schema over here!").
3) Parsers for non-XML formats that have type information, such as BER,
PER, serialised Java objects, old MS Office file formats, etc.
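For case 2, a small sketch of what "knowing the types" could mean in
practice: an ordinary SAX handler that looks at xsi:type on each element and
converts the text to a native Java value. The class name and the tiny type
table are assumptions for illustration only; a real implementation would
resolve the prefix in the attribute value against the in-scope namespaces and
cover far more of the XML Schema datatypes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class XsiTypeDemo {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // Returns the native values of all elements that carry an xsi:type attribute.
    static List<Object> parse(String xml) {
        List<Object> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String type;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                type = atts.getValue(XSI, "type");
                text.setLength(0);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                if (type == null) return; // untyped element: nothing to convert
                String raw = text.toString().trim();
                // Simplification: the prefix is stripped and assumed to mean XML Schema.
                String name = type.substring(type.indexOf(':') + 1);
                switch (name) {
                    case "int":     values.add(Integer.valueOf(raw)); break;
                    case "boolean": values.add(Boolean.valueOf(raw)); break;
                    default:        values.add(raw);
                }
                type = null;
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed to look up xsi:type by namespace
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return values;
    }
}
```

An application that only wants strings loses nothing here; the conversion is
an extra layer on top of the same events.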
But note that none of the above *require* a typed push interface; they
can all very happily and easily output text. It's just that it is
sometimes a waste to throw away type information if the application
could benefit from it, just because the API has no way of expressing
that type information.