Elliotte Rusty Harold wrote:
> Like many people who want typed data you're confusing the local with the
> global. Of course my software will treat the strings in a way I find
> useful. However, that in no way means you have to treat them the same
> way. You may want floats. I may want ints. Simon may want strings.
> There's no one right answer.
> The underlying premise that suggests we
> should exchange typed, binary data is that there is one right answer; one
> type that's better for the data than all the others;
...and here I disagree. But this is orthogonal; the SAX extension
proposal is purely a local processing issue, anyway. It affects nothing
globally; the same bits are still exchanged on wires.
> Did you read what Wyman wrote? He was suggesting that we actually
> exchange four bytes containing a big endian two's complement
> representation of the number 7 (or some equivalent form), rather than
> exchanging the text string 7.
Between the SAX parser and the rest of the application, yes. But not
globally. Wyman was originally talking about using SAX for ASN.1 stuff
(although I think a typed extension to SAX has wider application than
that), and none of the ASN.1 encodings have arbitrary relationships with
32 bit architectures or anything silly like that. Unless there's a
validity constraint in the 'schema', numbers in ASN.1 encodings are
constrained only by the available storage space on the disk you've put
the file on...
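To make the contrast concrete, here is a rough sketch in Java (my choice of language for illustration; the class and method names are made up). It compares the fixed four-byte big-endian two's complement form of 7 with a minimal-length two's complement form, which is roughly how the ASN.1 encoding rules size the content of an INTEGER. `BigInteger.toByteArray()` stands in for a real ASN.1 encoder here, which it is not.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Arrays;

class IntegerEncodings {
    // Fixed-width form: always four bytes, big-endian two's complement.
    static byte[] fixed32(int n) {
        return ByteBuffer.allocate(4).putInt(n).array();
    }

    // Minimal-length big-endian two's complement; grows with the value.
    // This only illustrates the sizing idea, it is not an ASN.1 encoder.
    static byte[] minimal(BigInteger n) {
        return n.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fixed32(7)));                     // [0, 0, 0, 7]
        System.out.println(Arrays.toString(minimal(BigInteger.valueOf(7)))); // [7]
        // 2^100 does not fit in 32 bits, but the minimal form just gets longer.
        System.out.println(minimal(BigInteger.ONE.shiftLeft(100)).length);   // 13
    }
}
```

Either way, this is a choice made on one side of the parser; the text "7" is what travels on the wire.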
> Not a counter example at all, and when you understand that you will have
> achieved the XML nature. 10 is text, not a number. You choose to
> interpret that text string as the number ten, which is fine. It's your
> choice. Just don't believe for a minute that it's the only legitimate
> interpretation of that text string, or that the string and its
> interpretation are the same thing.
Ten isn't the only legitimate interpretation of "10", no; that's not my
point. My point is that in the context of the numFingers element - which
is defined (within its namespace) as containing the number of fingers
the person in question has, written as a positive decimal integer -
"ten" is the sole intended semantics of the string "10".
[it's just an option]
> Simplicity is a virtue. We're trying to produce a Corvette here, not an
> Edsel. Use the right tools for the right tasks. Don't try to make one
> API fit all needs.
That's the reason why it would be nice to have it as a SAX option.
I mean, the alternative is to have an API that's a direct copy of SAX
apart from the leaf nodes being reported as Java class 'Object' or
whatever rather than as a string - meaning that we now have another API
that can handle the same stuff as SAX, but can do other things too. This
would fragment the community unnecessarily, by just piling more features
into the core rather than having them as extra modules you can plug in.
>> Now, you are harping on about those who communicate information rather
>> than just opaque text as "polluting" XML, but don't you think that
>> demanding that the APIs *they* use be the same as the APIs *you* use
>> is... polluting *their* use of XML with *your* model, hmm?
> Oh, come on. Now you're being ridiculous. They can invent and use any
> APIs (and any formats) they want. The problem is they don't want to do
> that. They want to hijack the nice clean SAX API and XML format, and
> stuff it full of mismatched garbage I'm going to have to spend my time
No they don't! Who's trying to add things to the SAX API or XML format?
Writing a SAX option doesn't mean adding anything to SAX at all; the typed SAX
extension need never be part of the SAX API, since it's an *extension*.
You would use the SAX API's existing feature mechanism to find out if the
typed API was available, which would throw a SAXNotRecognizedException if
the driver you're using didn't know about typed values.
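A minimal sketch of such a probe, using only the standard SAX2 `XMLReader.getFeature` call. The feature URI is a placeholder I have invented for illustration (no such URI has been agreed anywhere), and the class and method names are likewise hypothetical.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class TypedSaxProbe {
    // Placeholder URI, assumed for this sketch only.
    static final String TYPED_FEATURE =
            "http://example.org/sax/features/typed-values";

    // Returns true only if the driver recognizes the (hypothetical) feature.
    static boolean supportsTypedValues(XMLReader reader) {
        try {
            reader.getFeature(TYPED_FEATURE);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // The driver has never heard of typed values; fall back to plain SAX.
            return false;
        }
    }

    // Convenience wrapper: probe whatever parser the platform hands out by default.
    static boolean probeDefaultParser() {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            return supportsTypedValues(reader);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not create a SAX parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("typed values supported: " + probeDefaultParser());
    }
}
```

A stock parser that knows nothing about the extension should simply report that it does not recognize the feature, and the application carries on with ordinary string callbacks.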
But if it did support it, you could call reader.setFeature ("...URI
denoting typed SAX extensions..."), thus informing the parser that the
ContentHandler instance provided by the application also supports the
TypedContentHandler interface, which just adds a few methods for typed
data (the value() callback and presumably some replacement for
startElement to handle elements with typed attributes - although the
latter may not be necessary, perhaps part of the extension could be that
the client application is free to cast the instance of class Attributes
passed into startElement to class TypedAttributes which has an extra
method for getting typed attribute values).
At no point does this require changing anything in the SAX API.
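Roughly what I have in mind, as a sketch only: the `TypedContentHandler` and `TypedAttributes` interfaces below are hypothetical (only their names and the value() callback come from the description above; the signatures are my guess), and since no real driver implements the extension, a few hand-written calls stand in for the parser.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical extension interface: plain SAX plus one typed callback.
interface TypedContentHandler extends ContentHandler {
    void value(Object typedValue) throws SAXException;
}

// Hypothetical typed view of attributes; a driver could pass an instance
// of this to startElement and let the application cast if it cares.
interface TypedAttributes extends Attributes {
    Object getTypedValue(int index);
}

class FingerCounter extends DefaultHandler implements TypedContentHandler {
    int fingers = -1;
    boolean sawTypedAttributes = false;
    private boolean inNumFingers = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        inNumFingers = "numFingers".equals(localName);
        // Optional cast: only meaningful if the driver supports the extension.
        sawTypedAttributes = atts instanceof TypedAttributes;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        inNumFingers = false;
    }

    @Override
    public void value(Object typedValue) {
        // The driver has already turned the text "10" into an Integer.
        if (inNumFingers && typedValue instanceof Integer) {
            fingers = (Integer) typedValue;
        }
    }
}

class TypedHandlerDemo {
    // Stands in for a typed-aware driver parsing <numFingers>10</numFingers>.
    static int simulate() {
        FingerCounter handler = new FingerCounter();
        try {
            handler.startDocument();
            handler.startElement("", "numFingers", "numFingers", new AttributesImpl());
            handler.value(Integer.valueOf(10));
            handler.endElement("", "numFingers", "numFingers");
            handler.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.fingers;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

Nothing in org.xml.sax is touched here; a handler that doesn't implement the extra interface never sees a typed callback.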
>> So exactly how is this going to destroy XML, eh?
> Two ways:
> 1. It will mean people start passing around binary data instead of text.
They already do that... look at all the images on web pages :-)
> 2. It will make XML so complex that it becomes incredibly difficult to
> learn and implement.
Why must XML change? I don't see how this will change XML...
> Soon we'll be back in the SGML hell where no parser
> implements everything, and you're never quite sure which features you
> can and cannot use.
Ahah! That's more like a valid point. Yes, it would be bad if your
application was written to use typed SAX in order to remove the burden
of doing all the parsing of date formats and so on, but then you found
yourself having to compile it on a platform with no SAX parser that
implemented the extension... the solution to this is to have nice open
source implementations that quickly get ported everywhere :-)
> And thus you can no longer safely interchange XML
> with other parties.
This doesn't affect the interchange of XML, however; there's nothing
about typed SAX, as I see it at least, that changes XML in any way.
> As Simon keeps pointing out, schemas, XPath 2, and
> XSLT 2 have already marched a long way down this road. I don't think
> it's a coincidence that those of us who spend the largest part of our
> time trying to explain and teach these technologies are most adamant
> that this is the wrong road to follow.
As I said in my reply to Simon's posting, I don't agree with how XML
Schema, XPath 2, and XSLT have been done myself... I think the W3C has
failed to consider the implications of its actions, in some respects.