At 3:46 AM +0000 11/25/03, Alaric B Snell wrote:
>I just had another idea. Say you wrote an application that read XML
>documents but depended on something *conventionally* ignored -
>perhaps it processes the file significantly differently depending on
>the encoding the file happened to be in when it read it, or
>something like that - or perhaps an XHTML browser that behaved
>significantly differently when an image had width="010" than when an
>image had width="10".
>
>(I've said "significantly differently" to exclude cases like 'A
>generic XML editor tries to save documents back in the same encoding
>it read them in' or 'The XHTML browser shows leading zeroes on width
>attributes when you hit View Source')
>
>Would you not say that those applications are broken for being
>sensitive to such 'irrelevant' syntactic variations?
>
>Thus implying that there *is* an abstract model lurking there?
There's a confusion of layers here.
<http://www.cafeconleche.org/books/effectivexml/chapters/15.html> If
the parser did something significantly different based on two
different encodings (both correctly understood by the parser) that
would be a bug because XML is defined in terms of characters, not
bytes. The conversion of bytes to characters happens below the layer
of syntax. Therefore it's incorrect to depend on it.
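A minimal sketch of the encoding point, assuming Python's standard xml.etree.ElementTree parser (the sample element and variable names are made up for illustration). The same characters are serialized as two different byte sequences, and the parser reports identical content for both:

```python
import xml.etree.ElementTree as ET

# One sequence of characters; the e-acute forces the two encodings to differ.
body = '<img alt="caf\u00e9" width="10"/>'

# Two byte-level serializations of the same characters.
latin1_bytes = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + body).encode("iso-8859-1")
utf8_bytes = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("utf-8")

latin1_doc = ET.fromstring(latin1_bytes)
utf8_doc = ET.fromstring(utf8_bytes)

# The bytes differ, but everything above the character layer is the same.
assert latin1_bytes != utf8_bytes
assert latin1_doc.tag == utf8_doc.tag
assert latin1_doc.attrib == utf8_doc.attrib
```

Code that sits on top of the parser never sees which encoding was used, so it has nothing to depend on.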
The second example with width="010" vs. width="10" would be a
violation of the rules of XHTML, but would in no way be a violation of
the rules of XML. XHTML says 010 in a width attribute is the same as
10. XML does not. The violation is in the semantic layer. Different
semantics might treat 010 as different from 10, and be perfectly
correct in doing so. As someone else suggested a few weeks ago, a
leading 0 might indicate an octal number in some contexts. Thus 10
might be ten and 010 might be eight. That's not wrong. It's just
different.
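A rough sketch of the same distinction, again assuming Python's xml.etree.ElementTree. The two interpretation functions are hypothetical stand-ins for "an application's semantics", and the XHTML one is simplified to plain integer widths. The XML layer hands over two different strings; whether they mean the same thing is decided one layer up:

```python
import xml.etree.ElementTree as ET

def xhtml_style_width(value: str) -> int:
    """Hypothetical XHTML-like semantics: always decimal, leading zeros ignored."""
    return int(value, 10)

def c_style_width(value: str) -> int:
    """Hypothetical alternative semantics: a leading 0 marks an octal number."""
    if len(value) > 1 and value.startswith("0"):
        return int(value, 8)
    return int(value, 10)

a = ET.fromstring('<img width="010"/>').get("width")
b = ET.fromstring('<img width="10"/>').get("width")

# At the XML layer the attribute values are simply different strings.
assert a != b

# Under XHTML-like semantics the difference disappears.
assert xhtml_style_width(a) == xhtml_style_width(b) == 10

# Under octal-prefix semantics the difference is real: eight versus ten.
assert c_style_width(a) == 8
assert c_style_width(b) == 10
```

Neither function is wrong as far as XML is concerned; each is just a different vocabulary's rule applied to the same parsed strings.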
Neither of the examples you cite is an irrelevant syntactic
variation. The first is an irrelevant binary variation. The second is
a relevant syntactic variation which becomes an irrelevant semantic
variation when interpreted by one particular application.
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA