
   Re: [xml-dev] SAX for Binary Encodings (SAD-SAX)


Elliotte Rusty Harold wrote:
> At 11:22 AM -0500 11/8/03, Bob Wyman wrote:
> 
> 
>>     public interface TypedContentHandler extends ContentHandler {
>>       public void values(java.lang.Object value)
>>          throws SAXException;
>>     }
> 
> 
> 
> This won't work. There are numerous problems including:
> 
> 1. People want primitive data types often instead of objects.
> 2. It's type unsafe. Object is not a suitable replacement for 
> Date/Integer/int/etc.
> 3. What do you pass when the data doesn't match the schema?
> 

"This won't work," you say. So if somebody produced working code that 
did this, would you eat humble pie?

1. ((Integer)value).intValue (). A slightly ungainly expression, yes, 
but bear in mind that the next version of Java, IIRC, will have 
automatic conversion between primitive values and their 'boxed' forms.

And anyway, that's just a wrinkle in Java's type system. The C version 
of the API would probably pass in a discriminated union of int, long, 
etc, from which you would indeed pull out the primitive value.

And you're saying that passing in objects when people want primitive 
values is bad - but you condone passing in a CHARACTER STRING when 
people want primitive values? So they have to parse it rather than just 
unboxing it?!?!
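
To make the comparison in point 1 concrete, here is a minimal sketch. TypedContentHandler is the interface quoted above (repeated so the sketch compiles on its own); SummingHandler and its total field are hypothetical names invented purely for illustration, not part of SAX or of any proposal in this thread.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The interface as quoted at the top of this message.
interface TypedContentHandler extends ContentHandler {
    void values(Object value) throws SAXException;
}

// Hypothetical handler that adds up integer content, shown both ways.
class SummingHandler extends DefaultHandler implements TypedContentHandler {
    long total = 0;

    // Typed path: the parser has already decoded the value, so we only unbox.
    @Override
    public void values(Object value) {
        total += ((Integer) value).intValue();
    }

    // Text path: the handler has to parse the character data itself.
    @Override
    public void characters(char[] ch, int start, int length) {
        total += Integer.parseInt(new String(ch, start, length).trim());
    }
}
```

Both callbacks end up with the same int; the only difference is who does the parsing.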

2. "It's type unsafe", crows the guy who wants to interchange all 
information typelessly as strings! Hahaha! That made my day.

Yes, if your code doesn't actually match the schema, or there's just a 
bug in your state management so your code is confused about the context 
the data item appeared in, then you may well get passed an Integer when 
you were incorrectly expecting a Date. And do you know what will happen? 
Depending on your code, it will either handle this however it sees fit 
(perhaps the code is not actually tightly bound to any particular 
schema, and just displays arbitrary XML documents; in which case, it 
just dispatches based upon the received type to code that will display 
an object of that type and thus never 'assumes' anything), or it will 
throw a ClassCastException, signalling the fact that there's an internal 
application fault. Sure, it would be nice if this problem could be 
statically determined at compile time by comparing the schema to your 
code, but the SAX API as it stands isn't very amenable to that, since 
the handler can use any of a probably infinite number of ways of 
tracking its internal state, and the compiler will have a hard time 
matching this to type declarations in a schema. And the same problem 
exists just as badly if you pass the value in as a character string all 
the time; the "parseInt (characters)" might well find the string 
actually reads "Yoghurt, please!" and throw a NumberFormatException. And 
again, the caller will either handle this elegantly if there's a way it 
can do that, or just give up because it really needs to have an integer 
value to do its job.
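
A rough sketch of the three cases in point 2, assuming values arrive as plain java.lang.Object instances. ValueDescriber and its method names are hypothetical, made up for this illustration only.

```java
import java.util.Date;

class ValueDescriber {
    // Schema-agnostic code: dispatch on the received type, assume nothing.
    static String describe(Object value) {
        if (value instanceof Integer) return "integer " + value;
        if (value instanceof Date) return "date " + ((Date) value).getTime();
        if (value instanceof String) return "string " + value;
        return "other " + value.getClass().getName();
    }

    // Schema-bound code on the typed path: a wrong assumption about the
    // context surfaces as a ClassCastException.
    static int intFromValue(Object value) {
        return ((Integer) value).intValue();
    }

    // Schema-bound code on the text path: the same wrong assumption
    // surfaces as a NumberFormatException instead.
    static int intFromCharacters(String characters) {
        return Integer.parseInt(characters.trim());
    }
}
```

Calling intFromValue(new Date()) fails with a ClassCastException, and intFromCharacters("Yoghurt, please!") fails with a NumberFormatException; in both cases the fault shows up at run time, not at compile time.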

3. Well, we have a few options... It's a matter of what's simplest in 
the API, but we could either fall back to calling the characters 
callback with the string we encountered, or have a special class such as 
"UnparsableStringValue", which wraps a string, and pass in an instance 
of that. Use your imagination! Of course, a PER decoder is always 
schema-driven and hence validating, so malformed input would throw an 
exception in the parser and never invoke your code. One cannot generally 
make sense of malformed PER messages. In the case of malformed BER, you 
might consider passing in an "UnparsableBERValue" which wraps a byte 
string and the tag from the BER value header - and making both 
UnparsableBERValue and UnparsableStringValue subclasses of 
UnparsableValue, if needs be.
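
One way the wrapper classes from point 3 might look, as a sketch only. The three class names come from the paragraph above; the field names, constructors, and the Inspector helper are assumptions made for illustration.

```java
// Common supertype so a handler can test for "anything unparsable" at once.
abstract class UnparsableValue { }

// Wraps text that did not match the type the schema called for.
final class UnparsableStringValue extends UnparsableValue {
    final String text;
    UnparsableStringValue(String text) { this.text = text; }
}

// Wraps the tag from a BER value header plus the raw content bytes.
final class UnparsableBERValue extends UnparsableValue {
    final int tag;
    final byte[] contents;
    UnparsableBERValue(int tag, byte[] contents) {
        this.tag = tag;
        this.contents = contents.clone(); // defensive copy
    }
}

// Hypothetical consumer showing how a values() callback could react.
class Inspector {
    static String report(Object value) {
        if (value instanceof UnparsableStringValue) {
            return "unparsable text: " + ((UnparsableStringValue) value).text;
        }
        if (value instanceof UnparsableBERValue) {
            UnparsableBERValue ber = (UnparsableBERValue) value;
            return "unparsable BER value, tag " + ber.tag + ", "
                    + ber.contents.length + " bytes";
        }
        return "value: " + value;
    }
}
```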

But a plain-and-simple-XML-1.0 application would still be able to have 
its characters() callback invoked for every bit of CDATA in the 
document, and quite happily ignore all of this stuff. Nothing there 
would need to change.
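
Here is a sketch of how a parser could leave text-only handlers untouched, assuming it checks at run time whether the registered handler opted in to the typed interface. TypedDispatch, TextOnlyHandler, and TypedAwareHandler are hypothetical names, and the String.valueOf fallback is just one possible choice of lexical form.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The interface as quoted at the top of this message.
interface TypedContentHandler extends ContentHandler {
    void values(Object value) throws SAXException;
}

class TypedDispatch {
    // Parser side: typed handlers get the object, everyone else gets text.
    static void deliver(ContentHandler handler, Object value) throws SAXException {
        if (handler instanceof TypedContentHandler) {
            ((TypedContentHandler) handler).values(value);
        } else {
            char[] text = String.valueOf(value).toCharArray();
            handler.characters(text, 0, text.length);
        }
    }
}

// An ordinary SAX handler: knows nothing about typed values.
class TextOnlyHandler extends DefaultHandler {
    final StringBuilder seen = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        seen.append(ch, start, length);
    }
}

// A handler that opted in to typed delivery.
class TypedAwareHandler extends DefaultHandler implements TypedContentHandler {
    Object last;

    @Override
    public void values(Object value) {
        last = value;
    }
}
```

With this arrangement, existing ContentHandler implementations see exactly the characters() calls they always did.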

Nothing would be polluted!

Now look, I don't really think you're stupid; my snotty tone in this and 
the last email is just because I really don't think you're being a 
useful contributor to this discussion, and that annoys me... you're just 
saying "No, it'll never work, because some people use XML in a way that 
won't need this stuff. I see XML as just text, so everyone else must 
too". Lots of people are already using XML in not-just-text ways. This 
hasn't broken plain old XML yet; you can ignore all the schemas and 
everything and pass text with pointy brackets around to your heart's 
content. Nobody will stop you; nobody wants to stop you. If you're not 
interested in typed information, then treat it as you would treat a 
discussion on any other topic that's not relevant to you - ignore it. If 
you are interested, then can you do better than sitting there saying 
"That's not how I want to use XML. So you can't either!"...

We are not your enemies! We're just engineers and scientists, trying to 
improve the technological state of the art for the betterment of 
humanity, yes? :-)

ABS





 
