At 12:10 AM +0000 11/9/03, Alaric B Snell wrote:
>In some contexts they are; in some they are not. It all depends on
>who is viewing it. If you stop and think, you will realise that an
>API like SAX in no way *forces* 7 to be treated identically to 07.
>Clearly you don't know a thing about how computers work if you think
>that *some* pieces of software treating 7 the same as 07 *if they
>wish to* will somehow emit data-destroying rays that spread across
>the Internet and lop the significant leading zeroes off of telephone
Like many people who want typed data you're confusing the local with
the global. Of course my software will treat the strings in a way I
find useful. However, that in no way means you have to treat them
the same way. You may want floats. I may want ints. Simon may want
strings. There's no one right answer. The underlying premise that
suggests we should exchange typed, binary data is that there is one
right answer; one type that's better for the data than all the
others; and that simply isn't true. The more complex the data becomes
the less true it is.
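
A minimal Python sketch of the local-versus-global point, using the standard library's xml.sax module. The document, the element name n, and the names TextCollector and read_values are hypothetical, chosen only for illustration. The parser reports character data as text; each consumer then picks its own interpretation, and none of those choices changes what was exchanged.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of every <n> element."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._buf = None  # None means "not currently inside <n>"

    def startElement(self, name, attrs):
        if name == "n":
            self._buf = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so buffer them.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "n":
            self.values.append("".join(self._buf))
            self._buf = None


def read_values(document: bytes):
    """Parse the document and return the <n> contents as plain strings."""
    handler = TextCollector()
    xml.sax.parseString(document, handler)
    return handler.values


doc = b"<data><n>7</n><n>07</n></data>"
texts = read_values(doc)               # what the parser reports: ['7', '07']

# Three local interpretations of the same exchanged text.
as_strings = texts                     # '7' and '07' stay distinct
as_ints = [int(t) for t in texts]      # [7, 7]
as_floats = [float(t) for t in texts]  # [7.0, 7.0]
```

The integer and float views both collapse 7 and 07 into one value, while the string view keeps them apart. Each is a local decision made after parsing; the text on the wire is the same in all three cases.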
>You might be surprised to find that a lot of software *does* take
>character strings out of an XML document and immediately parse them
>as decimal integers. This doesn't appear to have broken XML, does
>it? Shock horror!
Locally, of course not. The problem is when applications start
exchanging their typed binary representations as the one truth rather
than recognizing it as simply one way of seeing the world.
>Yes, the bit where he said that "This SAX option would be
>compulsory, and all XML parsers would by international agreement be
>required to have this option turned on; software that does not turn
>this option on would not be allowed to be sold, or written, because
>I know full well that a character string of '07' at a point in the
>document where the schema says 'this is an integer, so leading
>zeroes in the decimal representation are irrelevant' means that the
>schema author was wrong" really supports your argument, doesn't it?
Did you read what Wyman wrote? He was suggesting that we actually
exchange four bytes containing a big endian two's complement
representation of the number 7 (or some equivalent form), rather than
exchanging the text string 7.
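
As a rough sketch of the difference between those two wire forms, using Python's standard struct module (the variable names are illustrative, and the four-byte layout is just the one form named above, not a reconstruction of any specific proposal):

```python
import struct

# The number 7 as four bytes, big-endian two's complement.
binary_form = struct.pack(">i", 7)   # b'\x00\x00\x00\x07'

# The text string 7 as it would appear in a UTF-8 encoded XML document.
text_form = "7".encode("utf-8")      # b'7'

# The binary form only means 7 if the reader already agrees on the
# width, signedness, and byte order the writer used.
as_big_endian = struct.unpack(">i", binary_form)[0]     # 7
as_little_endian = struct.unpack("<i", binary_form)[0]  # 117440512

# The text form leaves the choice of type to the reader.
as_int = int(text_form.decode("utf-8"))      # 7
as_float = float(text_form.decode("utf-8"))  # 7.0
as_text = text_form.decode("utf-8")          # '7'
```

The binary form fixes one interpretation at the sender; reading it under any other assumption yields a different number rather than an error.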
>> All data in an XML document is text, never anything else.
>I can prove you wrong!
>That "10" is text. It's also an integer. It's also the number of
>fingers I have. See? A counter-example.
Not a counter example at all, and when you understand that you will
have achieved the XML nature. 10 is text, not a number. You choose to
interpret that text string as the number ten, which is fine. It's
your choice. Just don't believe for a minute that it's the only
legitimate interpretation of that text string, or that the string and
its interpretation are the same thing.
>> It is certainly not something you'd want to exchange on the
>>Internet, and it is absolutely not something that should be
>>baked into the core APIs.
>Oh, good! You understand! So what was all that rubbish before about, eh?
>Or are you mistaking an optional thing for something 'baked in'?
Simplicity is a virtue. We're trying to produce a Corvette here, not
an Edsel. Use the right tools for the right tasks. Don't try to make
one API fit all needs.
>Now, you are harping on about those who communicate information
>rather than just opaque text as "polluting" XML, but don't you think
>that demanding that the APIs *they* use be the same as the APIs
>*you* use is... polluting *their* use of XML with *your* model, hmm?
Oh, come on. Now you're being ridiculous. They can invent and use any
APIs (and any formats) they want. The problem is they don't want to
do that. They want to hijack the nice clean SAX API and XML format,
and stuff it full of mismatched garbage I'm going to have to spend my
> I mean, it's not like they're forcing SAX processors to report
>abstract values instead of character strings, is it? If you read
>carefully, you will see that it's an option. And you may have
>noticed that there are lots of SAX processors that aren't doing this
>in use *as we speak*. They won't be influenced by the magical rays
>to change, will they? They will only change if the programmer that's
>using them thinks "Hmmm, I'd like the leaf nodes of this XML parsed
>into abstract values for me to save some coding, I'll use a SAX
>parser that does that for me". If the programmer doesn't want that,
>perhaps because the formatting of dates and integers is important to
>them, then they won't do it.
If you want to translate data into XML to present it through SAX,
fine. But don't start complexifying SAX because you discover your
data isn't a good fit for XML.
>So shock horror! Even when USING this API that produces typed values
>WHEN IT CAN, you could still get at the raw character stream to
>handle it as you always did!
>So exactly how is this going to destroy XML, eh?
1. It will mean people start passing around binary data instead of text.
2. It will make XML so complex that it becomes incredibly difficult
to learn and implement. Soon we'll be back in the SGML hell where no
parser implements everything, and you're never quite sure which
features you can and cannot use. And thus you can no longer safely
interchange XML with other parties. As Simon keeps pointing out,
schemas, XPath 2, and XSLT 2 have already marched a long way down
this road. I don't think it's a coincidence that those of us who
spend the largest part of our time trying to explain and teach these
technologies are most adamant that this is the wrong road to follow.
Elliotte Rusty Harold
Effective XML (Addison-Wesley, 2003)