At 6:32 PM -0400 4/13/04, Bob Wyman wrote:
>Elliotte Rusty Harold wrote:
>> It does not depend on whether *you* are working
>> with a schema. It depends on whether *everyone*
>> is working with the *same* schema. The assumption
>> that there is a single unique schema, which everyone
>> agrees to and adheres to is a common fallacy in the XML world.
>
> I see your point, in the abstract, however, I can't help
>thinking that we should be able to assume more uniformity than you
>suggest. For instance, if an XML document declares a namespace that is
>defined by a normative schema, I think that processors of the document
>should be able to assume that the rules defined by that schema
>actually apply to the document. Or, if someone creates a document that
>claims to conform to some specification that includes a normative
>schema, we should be able to assume that the rules of the schema apply
>in instances of the format.
Not necessarily. Practically, I routinely publish documents that
reference the XHTML DTD and yet are not valid. So far, it hasn't
bothered anybody. Occasionally I screw up well-formedness, and then
the system notices and complains. Sometimes even a human notices and
complains. Lack of validity doesn't seem to bother anyone though.
But let's look at a more common case. Are 001, +1, 1.0, and 1.0000
the same? All that a schema typing these as xsd:float or xsd:decimal
really says is that in one interpretation they are. That is, they map
to the same value in that type's value space. But this is not the only
possible value space. In other contexts the difference may be significant. For
instance, some compilers might read 001 as an octal, +1 as a decimal
int literal, and 1.0 and 1.0000 as doubles. In scientific publishing,
1.0 and 1.0000 are understood to be accurate to +/-5 in the digit after
the last one shown, so the second is much more precise than the first. This can be very
important.
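As a rough illustration of the point above, here is a minimal Python sketch that reads the same four lexical forms three ways. Python's decimal.Decimal stands in for an xsd:decimal-style value space; the variable names are made up for this example and come from no schema tool.

```python
from decimal import Decimal

# The four lexical forms discussed above.
lexical_forms = ["001", "+1", "1.0", "1.0000"]

# Interpretation 1: a decimal value space. All four denote the same number.
values = [Decimal(s) for s in lexical_forms]
all_equal = all(v == values[0] for v in values)

# Interpretation 2: scientific notation, where the number of decimal
# places written carries information about precision.
decimal_places = {s: -Decimal(s).as_tuple().exponent for s in lexical_forms}

# Interpretation 3: plain strings, where every form is distinct.
distinct_strings = len(set(lexical_forms))

print(all_equal)         # same value in the decimal value space
print(decimal_places)    # but different stated precision
print(distinct_strings)  # and four different strings
```

Each consumer here is reading the same characters; only the chosen value space differs.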
Strict application of a schema, as you seem to be advocating, limits
everyone to treating the data as the same type. However, we may have
different needs and want to interpret it in different ways. I may use
a Java BigDecimal where you use an IEEE-754 double, which could give
us very different round-off issues, for instance. We can both conform
to the same schema while still not accepting the same interpretation
of the data.
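To make the round-off difference concrete, here is a small sketch. It uses Python's decimal.Decimal as a stand-in for Java's BigDecimal and Python's float, which is an IEEE-754 double on the usual platforms; the sample values 0.1 and 0.2 are chosen only for illustration.

```python
from decimal import Decimal

# Two consumers read the same text from the same document.
a_text, b_text = "0.1", "0.2"

# Consumer 1: exact decimal arithmetic (analogous to BigDecimal).
exact_sum = Decimal(a_text) + Decimal(b_text)

# Consumer 2: binary floating point (IEEE-754 double).
double_sum = float(a_text) + float(b_text)

exact_matches = exact_sum == Decimal("0.3")
double_matches = double_sum == 0.3

print(exact_sum, exact_matches)    # decimal sum is exactly 0.3
print(double_sum, double_matches)  # binary sum picks up round-off
```

Both consumers accepted a document that is valid against the same schema, yet they do not end up with the same answer.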
> If we can't make these assumptions then it seems to me that
>XML document exchange would always require bi-lateral agreements
>between authors and consumers. The alternative would be unilateral
>decisions made by both producers and consumers that could result in
>widely variant interpretations of the data -- i.e. the value of any
>particular XML document would be akin to the tea leaves in a fortune
>teller's cup -- any relationship between reality and what is read
>would be anecdotal at best. This might work well with poetry, but I
>think that such interpretational freedom is probably inappropriate in
>many areas of XML usage.
How the interpretations vary depends on what different parties want
to do with the data. If we're both doing pretty much the same thing,
we'll have similar interpretations. If we're doing very different
things, we may have very different interpretations. For instance, if
I'm writing Amazon's accounting system, I might read the price of a
book in an XML document they publish as a money type with exact
decimal arithmetic. If I'm a publisher checking on the average price
of my books at several bookstores, I may want to use a double instead
for ease of arithmetic, and because I don't need to be precise to the
penny. And if I'm an author republishing the Amazon information about
my books in real time on my web site, I may want to treat it as a
string that's just copied from one place to another. The
interpretation of the data is always a function of the local
environment and understanding. Schemas can be descriptive of one
local understanding, but they cannot force everyone to agree to the
same interpretation of the same data. The simple fact is that different
people and organizations do have different needs that require
different understandings of the same data.
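The three readings of a book price described above might be sketched like this in Python. The XML fragment, the element names, the price 44.99, and the other stores' prices are all hypothetical, invented only for this example; no real Amazon format is implied.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical document; element names and price are assumptions.
doc = "<book><title>Effective XML</title><price>44.99</price></book>"
price_text = ET.fromstring(doc).findtext("price")

# Accounting system: exact decimal arithmetic, correct to the penny.
order_total = Decimal(price_text) * 3

# Publisher: average across several stores, a double is good enough.
other_store_prices = [39.99, 42.50]  # made-up figures
prices = [float(price_text)] + other_store_prices
average_price = sum(prices) / len(prices)

# Author's web site: the price is just a string copied into a page.
html_snippet = "<span>" + price_text + "</span>"

print(order_total)
print(round(average_price, 2))
print(html_snippet)
```

All three programs consume the same element content, and none of them needs the others to agree on a type.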
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA