RE: Enlightenment via avoiding the T-word

From: Nicolas LEHUEN <nicolas.lehuen@ubicco.com>
To: 'Michael Brennan' <Michael_Brennan@allegis.com>,"'xml-dev@lists.xml.org'" <xml-dev@lists.xml.org>
Date: Wed, 29 Aug 2001 10:00:37 +0200
I understand your point, but for practical use (I, like many others, need to
write code to help my company grow and earn some money), I find that
considering an XML document as an instance of a schema is very useful.

One of the many reasons for it is that when you write some code that
manipulates an XML document (be it for transformation or other purposes),
your code is written for a particular schema.

You can write code that operates on any XML document (generic XML tools
prove it), but as soon as your code requires a semantic interpretation of an
XML document, it has to follow an implicit or explicit schema (unless it is
total nonsense).

From there, you have two possibilities:

1) You hard-code the target schema within your code, directly writing
algorithms that are bound to a particular schema.
2) You try to use meta-information about the document schema to structure
your code in an extensible way, so that if the schema changes you don't have
to rewrite the whole code. I strongly believe that a schema API and/or a
form of PSVI are prime tools for doing that.
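
To make the contrast between 1) and 2) concrete, here is a rough sketch in
Python. Everything in it is an assumption made up for illustration: the
"order" vocabulary, the FIELD_TYPES table (a hand-written stand-in for what a
schema API or a PSVI might report about declared types), and both function
names. It is not a real schema API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the vocabulary is invented for this sketch.
DOC = "<order><quantity>3</quantity><price>9.50</price></order>"

# Possibility 1: the schema is hard-coded into the algorithm itself.
# Any change to the vocabulary means editing this function.
def read_order_hardcoded(root):
    return {
        "quantity": int(root.findtext("quantity")),
        "price": float(root.findtext("price")),
    }

# Possibility 2: the code is driven by meta-information about the schema.
# FIELD_TYPES is a hypothetical stand-in for type information that a
# schema API or a PSVI would supply (element name -> declared simple type).
FIELD_TYPES = {"quantity": int, "price": float}

def read_order_from_metadata(root, field_types):
    # Convert each known child element using its declared type.
    return {
        child.tag: field_types[child.tag](child.text)
        for child in root
        if child.tag in field_types
    }

root = ET.fromstring(DOC)
hardcoded = read_order_hardcoded(root)
generic = read_order_from_metadata(root, FIELD_TYPES)
```

With the second form, adding a new typed element to the vocabulary only
requires a new entry in the type table (or, ideally, nothing at all if the
table comes from the schema), while the first form needs a code change.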

I don't understand what's inherently bad about the PSVI, except maybe for your
point that "the validated form of a document is not its true form, so the
PSVI should not be considered as an enhanced version of the document
Infoset". I just don't care. I need this information to do my job
properly.

Regards,
Nicolas

>-----Original Message-----
>From: Michael Brennan [mailto:Michael_Brennan@allegis.com]
>Sent: Tuesday, August 28, 2001 21:52
>To: xml-dev@lists.xml.org
>Subject: RE: Enlightenment via avoiding the T-word
>
>
>> From: Don Park [mailto:donpark@docuverse.com]
>
><snip/>
>
>> What we need is an XML processing framework that supports the molecules
>> and atoms design pattern.  What we don't need is more complications and
>> fancy if-but-when-must at what I consider to be lexical slum level.
>> *argh* I am starting to sound like Len.
>
>I'd go even further. We need an XML processing framework that accepts the
>plurality of application domains without prejudice or favoritism. We need an
>XML processing framework that does not take a specific metadata vocabulary
>for annotating information items and a particular set of transformations and
>bless them and insist that everyone accept the notion that these are not
>annotations and transformations, but rather the process of realizing an XML
>instance's True Form. The only true form of an instance is that of the
>instance itself, and that's nothing but a bunch of text and pointy brackets.
>Everything else is layered atop that to suit a particular application domain
>or processing model.
>
>I'll put forth a different formulation of some thinking I tried to express
>yesterday. I feel like I've had an epiphany (though that feeling may
>evaporate once someone rhetorically rips me to shreds).
>
>It is very typical for an XML application to want to associate certain
>metadata with XML information items to suit certain processing needs. One
>can easily envision different metadata vocabularies to suit different
>domains. None of these are inherent in the instance itself. None of the
>processing done with the instance and associated metadata is a realization
>of the instance's true form. The only true form of the document is that
>which is in the instance itself, and that's just a bunch of text and pointy
>brackets.
>
>One particular class of application is that which we call a validator. The
>metadata a validator wishes to associate with an information item is a
>grammar or set of rules that express a set of constraints. Validators verify
>that a document satisfies the collective set of constraints associated with
>its information items. If the document satisfies the constraints, the
>validator passes it on to another application for further processing;
>otherwise, the document is rejected. There is no such thing as a well-formed
>but inherently invalid XML document. It is only invalid within the context
>of a particular domain, and the constraints suited to that domain can be
>expressed in a schema.
>
>Another class of application is transformers. These produce a different
>information set better suited for processing within a particular application
>domain. At one extreme, we have those that associate XSLT templates with
>elements, and use these to transform a document into something potentially
>quite different. At another extreme we have those that do very simple
>transformations, such as adding default attributes. Then there are many
>shades of gray between these extremes, such as Simon's namespace-mapping SAX
>filters. There is no such thing as a wrong transformation, except one that
>produces an unintended result.
>
>Other applications may wish to annotate information items with additional
>labels that provide hints to applications for further processing. For
>instance, one may want to attach a label to the elements "shippingAddress"
>and "billingAddress" indicating that both of these represent addresses and
>should be processed as such by an application. We should not be enshrining
>one metadata vocabulary in a PSVI and insisting that that one vocabulary and
>no other is intrinsic to the true form of document instances.
>
>With the status quo, however, validators are accorded special status. They
>are not like other applications. The processing they do is regarded as
>something very fundamental and inviolate. In addition, validators are
>allowed to add certain annotations from a specific blessed metadata
>vocabulary (XML Schema), and they are allowed to perform certain specific
>blessed transformations. These blessed transformations and annotations are
>considered part of the true form of the instance, enshrined in the PSVI, and
>annotations or transformations that are not blessed are derided as
>desecrations of the infoset. The lines drawn strike me as rather arbitrary.
>
>If we reject the PSVI, and if we accept there are no wrong transformations
>(except those that produce an unintended result), and if we accept that
>there are potentially many metadata vocabularies suited to different
>application domains and many valid processing models, and that none of these
>are somehow intrinsic to a document instance, then it seems to me that much
>of the fodder for argument simply evaporates. One can even imagine a more
>flexible schema mechanism that can be invoked in different modes like an
>XSLT stylesheet. One could invoke it in "elementFormQualified" mode, or
>"elementFormUnqualified" mode; the application gets the information in the
>form most suited to its needs, and we dispense with religious debate over
>which is its true form.
>
>In hindsight, now, I have to agree with Mike Champion's post about
>scholasticism[1]. This debate smacks of scholasticism because it centers so
>much on the debate about the true form of an XML instance. I think the XML
>world needs its renaissance, and we must start by dethroning the PSVI.
>
>[1] http://lists.xml.org/archives/xml-dev/200108/msg01020.html
>