RE: Enlightenment via avoiding the T-word
- From: ktl@ktlim.com
- To: xml-dev@lists.xml.org
- Date: Mon, 27 Aug 2001 18:51:54 -0700
[Long-time lurker...]
> So the problem isn't that you can't find the appropriate schema
> definition, it's that doing so can be expensive (I agree). Since
> nobody is arguing that it's a bad idea to have context-sensitive
> content models, the problem is ensuring that you only have to look
> up the context - the computationally expensive part - once.
I think there has been some (mild) argument that it is bad
to have context-sensitive content models. I will try to restate the
case.
Why mark up documents or data at all? It's to indicate
meaning that is not present or not obvious in the raw base data.
Why give different labels to the markup? To distinguish between
meanings.
Why then does it make sense to have two pieces of marked-up
data with different meanings and yet the same labels on the markup?
If you, the instance author, see two different meanings for something,
shouldn't you make it easy for others to see those different meanings
as well?
(W. E. Perry will point out that others may see different
meanings even where the author sees only one, but allowing for that
does not mean we should give up on distinguishing
meanings that the author already knows are different.)
In other, perhaps simpler, words: markup is most useful when
it most specifically labels the base data it is marking up. Why
should we enable or even encourage authors to use non-specific,
difficult-to-reuse labels for their markup?
As an example, the canonical "purchase order" document usually
has separate "<billTo>" and "<shipTo>" addresses (perhaps including
common "<address>" subelements). We could instead use a context-sensitive
definition to say that "address[1]" is always the billing address and
"address[2]" is always the shipping address, but I hope it's obvious
that this is undesirable.
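To make the purchase-order example concrete, here is a minimal sketch
using Python's standard xml.etree.ElementTree. The two documents, the
"purchaseOrder" root, and the function names are hypothetical; only
"billTo", "shipTo" and "address" come from the example above. The author
of both documents happens to write the shipping address first.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance with specific labels on each address.
LABELLED = """
<purchaseOrder>
  <shipTo><address>1 Dock Road</address></shipTo>
  <billTo><address>9 Ledger Lane</address></billTo>
</purchaseOrder>
"""

# Hypothetical instance where meaning depends on position alone.
POSITIONAL = """
<purchaseOrder>
  <address>1 Dock Road</address>
  <address>9 Ledger Lane</address>
</purchaseOrder>
"""


def billing_labelled(xml_text):
    """Find the billing address by its label, wherever it appears."""
    root = ET.fromstring(xml_text)
    return root.findtext("billTo/address")


def billing_positional(xml_text):
    """Assume, by convention only, that address[1] is the billing address."""
    root = ET.fromstring(xml_text)
    return root.findtext("address[1]")


labelled_result = billing_labelled(LABELLED)
positional_result = billing_positional(POSITIONAL)
```

With the labelled form the lookup still returns the billing address even
though the author reordered the elements. With the positional form the
same reordering silently hands back the shipping address, and nothing in
the instance tells a later reader that anything went wrong.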
XSDL "local" elements seem to me to come from a world where
companies want absolute control, since they are inherently
non-reusable without their containing "global" elements. In the
OO world, this is considered good, because it hides the data from
manipulation by unknown processes. In the document-centric world
(as I believe Steven R. Newcomb sees it), these unknown processes
(often unknown because they have yet to be developed) are the whole
reason to mark up your base data.
Namespaces allow us to safely invent new ulabels and to reuse
others' ulabels to specifically mark up our data so that it can be
processed in as many different ways as possible. Local elements and
other context-sensitive "features" reduce specificity and reusability.
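As a rough illustration of the reuse argument, the sketch below shows a
process that knows nothing about purchase orders picking out every
postal address by its namespace-qualified name alone. The namespace
URIs, the document, and the function name are all made up for the
example; it uses only Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for a shared address vocabulary.
ADDR_NS = "http://example.org/address"

# Hypothetical order that reuses the shared address label twice and
# also has an unrelated, differently qualified "address" element.
DOC = """
<po:order xmlns:po="http://example.org/po"
          xmlns:a="http://example.org/address">
  <po:billTo><a:address>9 Ledger Lane</a:address></po:billTo>
  <po:shipTo><a:address>1 Dock Road</a:address></po:shipTo>
  <po:note><po:address>not a postal address</po:address></po:note>
</po:order>
"""


def postal_addresses(xml_text):
    """Collect every element with the shared address name, ignoring context."""
    root = ET.fromstring(xml_text)
    qualified_name = "{%s}address" % ADDR_NS
    return [element.text for element in root.iter(qualified_name)]


found = postal_addresses(DOC)
```

The function never inspects parents or positions. A label whose meaning
depended on its containing element could not be processed this way
without first reconstructing that context.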
[Managed to avoid the T word...]
--
Kian-Tat Lim, ktl@ktlim.com, UTF-7: +Z5de+pBU-