OASIS Mailing List Archives





   Re: [xml-dev] SAX for Binary Encodings (preserving investment)(ASN.1 and


At 2:03 PM -0500 11/7/03, Bob Wyman wrote:

>	SAX assumes that the data it is reading is all of "character"
>type. SAX has no access to a schema file and thus has no idea of the
>real type of the characters it reads.

That's because SAX is designed to read XML, not the binary, non-XML 
formats people like to dream up.

>However, a binary encoding will
>typically pass data in a form more appropriate for its type. Thus, an
>integer will be passed as something like a 32-bit value, not a string
>of characters.

Then whatever it is, it certainly isn't XML. XML is text. If you want 
to pass around binary data feel free, but please don't pretend this 
is XML or even a binary encoding of XML. It's not, and you shouldn't 
expect XML tools to work on it.

People keep saying they want a binary encoding of XML but when the 
pedal hits the metal, they keep proposing formats which do not have 
1-1, onto mappings to XML.

>So, to build a "pure SAX" interface to a binary
>encoding, you would have to convert all the binary values to
>characters before passing them to the SAX event handlers.

There are no binary values in XML, period. When I write

<data xsi:type="xsd:float">4.3</data>

I read the string 4.3, which I can then interpret as I want. I am not 
limited to interpreting it as an inexact, 32-bit, IEEE 754 float. I 
can if I want to but I don't have to. Even with a schema THERE IS NO 
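
To make the point about the string 4.3 concrete, here is a minimal Python sketch. The variable names are illustrative assumptions, not part of any XML API. It takes the same lexical value and interprets it four different ways, only one of which is a 32-bit IEEE 754 float:

```python
import struct
from decimal import Decimal
from fractions import Fraction

# The character data of <data xsi:type="xsd:float">4.3</data>, as a parser delivers it.
lexical = "4.3"

# Four possible interpretations of the same text.
as_double = float(lexical)                                        # 64-bit IEEE 754 binary double
as_single = struct.unpack("<f", struct.pack("<f", as_double))[0]  # rounded to 32-bit IEEE 754
as_decimal = Decimal(lexical)                                     # exact decimal value
as_fraction = Fraction(lexical)                                   # exact rational 43/10

# The 32-bit and 64-bit binary readings are different numbers,
# and neither one is exactly 43/10.
print(as_single == as_double)
print(Fraction(as_double) == as_fraction)
print(as_decimal, as_fraction)
```

Which of these is right depends on the consumer, not on the document.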

>  Of course,
>the first thing a lot of event handlers will do is convert the strings
>back into binary types like integers. The result is, of course, often
>wasteful silliness. It causes performance problems, memory
>fragmentation due to all the string allocations, etc.

The first thing a lot of event handlers will do is convert the 
strings back into binary types like integers *that are appropriate 
for their local data models*. This may be a two byte int, a four-byte 
int, a float, a double, a java.math.BigInteger or something else. 
There is no one unique and correct answer here. You're trying to 
impose a single representation of the data on all the different 
processes that may wish to process it. That simply won't fly in a 
heterogeneous Internet environment. One size does not fit all.
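
A rough sketch of what I mean, in Python using the standard xml.sax module. The TextCollector class, the read_values function and the sample document are made up for illustration. The handler receives nothing but strings, and each consumer then converts those strings into whatever type suits its own data model:

```python
import xml.sax
from decimal import Decimal

class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of each <data> element as a string."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._buffer = None  # None means we are not inside a <data> element

    def startElement(self, name, attrs):
        if name == "data":
            self._buffer = []

    def characters(self, content):
        # A parser may deliver one text node in several chunks, so accumulate.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "data" and self._buffer is not None:
            self.values.append("".join(self._buffer))
            self._buffer = None

def read_values(document: bytes):
    """Parse the document and return the raw strings, uninterpreted."""
    handler = TextCollector()
    xml.sax.parseString(document, handler)
    return handler.values

DOC = b"<root><data>4.3</data><data>12</data></root>"
strings = read_values(DOC)

# Two consumers of the same events, each with its own local data model.
as_decimals = [Decimal(s) for s in strings]  # e.g. an accounting application
as_doubles = [float(s) for s in strings]     # e.g. a plotting application
print(strings, as_decimals, as_doubles)
```

The parser never has to know which of these conversions the application wants.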

>	So, while SAX interfaces to binary encodings allow us to do
>things like use XSLT processors on binary data, they also raise some
>issues about the SAX interface itself. Knowing that people are already
>implementing SAX interfaces to binary data, it probably makes sense to
>carefully consider at this point how to handle this problem so that
>standardized solutions can be implemented. That would be much better
>than each developer or vendor coming up with their own interpretation
>of what SAX for binary encodings looks like.

I disagree completely. Please don't pollute SAX with your binary 
data. If you need a binary API for non-XML, binary formats, you're 
free to create one. But if you can't get your binary format and API 
off the ground without misappropriating the SAX and XML brands, then 
I suspect there's something fundamentally wrong with your format.

>	There are a number of possible solutions:
>	1. Insist that all binary types be converted to strings.
>	2. Define an "extended SAX" that passes data with descriptors
>that show the type of data. Thus, a program would be told the type of
>the data being passed and could call a ToString or ToInteger function
>as needed.
>	3. Provide a "mode switch" which could be called to modify the
>behaviour of SAX. If "StringsOnly" was set to true, then only strings
>would be generated, if "allowTypes" was set, then types would be
>passed by descriptor.
>	4. Other options?
>	What makes the most sense here?

5. Stop pretending binary data is XML. It isn't, and it isn't going 
to pass easily into XML tools. Either accept text, or define your own 
formats and APIs. Stop trying to pollute XML.


   Elliotte Rusty Harold
   Effective XML (Addison-Wesley, 2003)



Copyright 2001 XML.org. This site is hosted by OASIS