xml-dev - Re: [xml-dev] SAX for Binary Encodings (SAD-SAX)


To: "Simon St.Laurent" <simonstl@simonstl.com>
Subject: Re: [xml-dev] SAX for Binary Encodings (SAD-SAX)
From: Alaric B Snell <alaric@alaric-snell.com>
Date: Mon, 10 Nov 2003 00:58:26 +0000
Cc: xml-dev@lists.xml.org
In-reply-to: <r02000200-1028-6B00FC9F126311D88BED0003937A08C2@[192.168.124.11]>
References: <r02000200-1028-6B00FC9F126311D88BED0003937A08C2@[192.168.124.11]>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030704 Debian/1.4-1

Simon St.Laurent wrote:

> I'm thoroughly disappointed that this week's seemingly positive
> developments have led us back to this line of junk.  Maybe it would be
> better for ASN.1 to stay in its corner - I'd hoped it might take some
> burdens off of XML, but this latest conversation seems bent on driving
> that burden deeper into XML processing.

Aw, no... that's nobody's intention.

For a start, there's no shared intent between me and the original 
poster; we've not sat down together in a room and said "Right, how are 
we going to corrupt XML? I know, let's sneak in an optional SAX 
extension! You propose it and I second it, OK?" :-)

I don't know who Bob Wyman is or what his motives are, but my motive in 
picking up the idea of a typed SAX was just "Hmmm, this might be useful; 
I wonder how you could make it backwards compatible", etc.

But note that the people implementing SAX for ASN.1 are, as far as I 
know, implementing pure and simple SAX as you all know and love it. To 
the best of my knowledge, their software will convert every integer, 
date, boolean, or whatnot to nice Java strings and pass them to the 
application. And why do I think this? Because one reason they're 
implementing SAX interfaces is so that they can plug them into existing 
software like XSLT engines and thus use XSLT to convert BER or PER 
information into nice pretty HTML, or XML with a different structure to 
the original information, etc.

You may be right that having it as an optional SAX feature is not the 
way to do this; I've given my reasoning in another post: an interface 
for push parsers that produces typed values for some or all 
elements/attributes would look a lot like SAX with just a couple of 
methods modified, so building it separately seems a duplication of 
effort.

But if doing it within the existing SAX framework isn't the way to go, 
then fine - it could be done separately. You'd then write a wrapper 
class that takes a SAX ContentHandler and implements the 
TypedContentHandler interface; it tells the parser not to provide native 
values and just passes the unparsed strings straight through to the 
ContentHandler. And you could write the converse wrapper, which takes a 
TypedContentHandler and implements ContentHandler; it just tells the 
TypedContentHandler that it doesn't know the types of anything, so 
here's a string.
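
A minimal sketch of those two wrappers, to show how little code they
need. TypedContentHandler does not exist anywhere yet, so the interface
below (its method names, the wantsNativeValues() query, and passing
values as Object) is purely an assumption for illustration, as are the
adapter class names. Only the org.xml.sax types are real.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical typed push interface; not part of SAX. */
interface TypedContentHandler {
    /** Lets the event source ask whether native values are wanted at all. */
    boolean wantsNativeValues();

    void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException;

    void endElement(String uri, String localName, String qName) throws SAXException;

    /** A native value (Integer, Boolean, ...) or a String when the type is unknown. */
    void value(Object value) throws SAXException;
}

/** Presents a plain ContentHandler as a TypedContentHandler that declines native values. */
final class PlainHandlerAdapter implements TypedContentHandler {
    private final ContentHandler target;

    PlainHandlerAdapter(ContentHandler target) { this.target = target; }

    @Override public boolean wantsNativeValues() { return false; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        target.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        target.endElement(uri, localName, qName);
    }

    @Override
    public void value(Object value) throws SAXException {
        // A source that ignores wantsNativeValues() still gets stringified here.
        char[] chars = String.valueOf(value).toCharArray();
        target.characters(chars, 0, chars.length);
    }
}

/** Presents a TypedContentHandler as a plain ContentHandler: every value is a String. */
final class TypedHandlerAdapter extends DefaultHandler {
    private final TypedContentHandler target;
    private final StringBuilder text = new StringBuilder();

    TypedHandlerAdapter(TypedContentHandler target) { this.target = target; }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // SAX may split text across several calls
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        flush();
        target.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        flush();
        target.endElement(uri, localName, qName);
    }

    /** Hands any buffered text on as an untyped String value. */
    private void flush() throws SAXException {
        if (text.length() > 0) {
            target.value(text.toString());
            text.setLength(0);
        }
    }
}
```

The second adapter buffers character data because a SAX parser is free
to deliver one text node in several characters() calls; flushing at
element boundaries hands the typed side one whole string per run of
text.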

Then you could also write a wrapper like that last one, but that takes a 
schema as a parameter, and uses the schema to assign types to elements 
and attributes, parse them, and pass them as native values. It would 
also validate against the schema as it went.
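
A rough sketch of that schema-driven wrapper follows. Everything in it
is an assumption made for illustration: the "schema" is reduced to a map
from element name to a parsing function, TypedValueHandler is a
hypothetical receiver cut down to a single method, and only the text of
leaf elements is typed (attributes and mixed content are ignored). A
real version would consult an actual schema language.

```java
import java.util.Map;
import java.util.function.Function;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical receiver of typed leaf values; not part of SAX. */
interface TypedValueHandler {
    void value(String elementName, Object value) throws SAXException;
}

/** Plain SAX in, typed values out, driven by a toy name-to-parser "schema". */
final class SchemaTypingHandler extends DefaultHandler {
    private final Map<String, Function<String, Object>> schema;
    private final TypedValueHandler target;
    private final StringBuilder text = new StringBuilder();

    SchemaTypingHandler(Map<String, Function<String, Object>> schema,
                        TypedValueHandler target) {
        this.schema = schema;
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // sketch: only the text of leaf elements is typed
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String raw = text.toString().trim();
        text.setLength(0);
        if (raw.isEmpty()) {
            return; // container element or empty leaf: nothing to report
        }
        Function<String, Object> parser = schema.get(qName);
        if (parser == null) {
            target.value(qName, raw); // type unknown, so pass the string through
            return;
        }
        Object parsed;
        try {
            parsed = parser.apply(raw);
        } catch (RuntimeException e) {
            // A value that does not parse as its declared type fails validation.
            throw new SAXException("Invalid value for <" + qName + ">: " + raw, e);
        }
        target.value(qName, parsed);
    }
}
```

With a map entry such as "count" mapped to Integer::valueOf, the text of
a count element reaches the application as an Integer, while elements
the map does not mention still arrive as strings.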

Although the idea started out from a discussion about SAX interfacing to 
ASN.1, a typed push interface would be useful whenever the event source 
happened to know the types of values it was producing:

1) Synthetic event sources, such as database interfaces that present a 
SAX interface so the result of a query can be plumbed directly into an 
XSLT engine to convert to HTML.

2) Parsers for XML with type information (be it via reference to a 
schema embedded in the XML, xsi:type attributes, or a hardcoded 
instruction by the application developer: "I expect things that conform 
to this schema over here!").

3) Parsers for non-XML formats that have type information, such as BER, 
PER, serialised Java objects, old MS Office file formats, etc.

But note that none of the above *require* a typed push interface; they 
can all very happily and easily output text. It's just that it is 
sometimes a waste to throw away type information if the application 
could benefit from it, just because the API has no way of expressing 
that type information.
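
To make the point about discarded type information concrete, here is a
small sketch of a synthetic event source of the first kind. The
QueryResultSource class and its single integer column are hypothetical
stand-ins for a real database interface; the ContentHandler calls are
ordinary SAX. The source holds Integer values, but plain SAX gives it no
choice except to turn each one into characters.

```java
import java.util.List;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical stand-in for a query result with one integer column. */
final class QueryResultSource {
    private final List<Integer> ids;

    QueryResultSource(List<Integer> ids) { this.ids = ids; }

    /** Replays the rows as SAX events, as if they had been parsed from XML. */
    void emit(ContentHandler handler) throws SAXException {
        AttributesImpl none = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "rows", "rows", none);
        for (Integer id : ids) {
            handler.startElement("", "id", "id", none);
            // The source knows this is an Integer, but the type is lost here.
            char[] text = id.toString().toCharArray();
            handler.characters(text, 0, text.length);
            handler.endElement("", "id", "id");
        }
        handler.endElement("", "rows", "rows");
        handler.endDocument();
    }
}
```

Any consumer that wants numbers back has to re-parse the strings, which
is exactly the waste a typed push interface would avoid.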

ABS
