   RE: [xml-dev] Stupid Question (was RE: [xml-dev] XML doesn't deserve its


>-----Original Message-----
>From: Thomas B. Passin [mailto:tpassin@comcast.net]
>Sent: Wednesday, 6 March 2002 14:38
>To: xml-dev@lists.xml.org
>Subject: Re: [xml-dev] Stupid Question (was RE: [xml-dev] XML doesn't
>deserve its "X".)
>[Nicolas LEHUEN]
>> Like you said, there is very little that the PSVI can add that the
>> application doesn't. Right. But you just suppose that the data received by
>> an application is in the very precise format it was built for. Now, to
>> back to Eric's sample, if I want to add some new nifty tags to my
>> so that a new part of the application can use it while old parts still
>> remain compatible, I have to find a way to convey some meta-data that make
>> the old parts not only understand that the new document can still be
>> processed, but how to build an old-style view of it.
>Two things seem to be perennially mixed up together in this kind of
>discussion.  There are two extremes when you think about how schemas might
>be used.  First, the schema can be used statically ahead of time, that is,
>when you design your code.  This is the case in play when people talk about
>the code knowing the datatypes and therefore not needing PSVI information at
>The second case is the purely dynamic case, whereby nothing is known
>beforehand and everything has to be deduced about the document while it is
>being processed.  When you knew that the document was entirely text, the DTD
>usually served for this.  Most HTML processing is the first case.
>In real life, applications fall in between, though I suspect that most tend
>towards the static case.

I'm not asking for everything to be dynamic. I try to stay as pragmatic as
possible. Most people write code with static assumptions. Their code is
statically bound to a particular schema. Then the schema is extended, and
the code has to be modified to be bound to the new version of the schema.

To me, extensibility is about finding ways to write programs so that the
amount of work following a schema evolution is zero or as small as possible.
It's not about magically understanding data and processing it the way it has
to be. I don't think a purely dynamic approach is feasible.

I do think, however, that type inheritance and polymorphism (which is
equivalent to dynamic processing of data depending on its type) are the kind
of concepts that can be used to reduce the costs of extensibility.
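
As a minimal sketch of what I mean (in Python, with purely hypothetical type
names that come from no real schema): legacy code is written against a base
type, a later version adds a derived type, and the legacy code keeps working
unmodified because dispatch happens on the actual type of the data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Address:
    """Base type the legacy code was written against (hypothetical)."""
    street: str
    city: str

    def label(self) -> str:
        return f"{self.street}, {self.city}"


@dataclass
class InternationalAddress(Address):
    """Extended type introduced by a later schema version (hypothetical)."""
    country: str = ""

    def label(self) -> str:
        # Refines the base behaviour instead of replacing it.
        return f"{super().label()}, {self.country}"


def build_labels(addresses: List[Address]) -> List[str]:
    """Legacy code: knows only Address and is never touched after the extension."""
    return [a.label() for a in addresses]


labels = build_labels([
    Address("1 Main St", "Springfield"),
    InternationalAddress("2 rue de la Paix", "Paris", "France"),
])
```

The extension costs nothing on the legacy side: build_labels accepts the new
type because it is still an Address.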

>> Extensibility is not about being able to change something in a document
>> structure, then modify all code that relies on this structure. It is about
>> finding a way to tell a heterogeneous mix of code that a single data item
>> can be viewed and processed in a way that fits it.
>Again, two things have been mixed together.  One is whether your existing
>processing will break if a document suddenly sprouts new elements and
>attributes.  Usually it won't, or the code can be arranged so it won't
>because new elements and attributes can just be ignored.

You can't ignore new elements and attributes like that. Sometimes it can
be very wrong to do so (see my crappy example with the 'just-kidding'
attribute). Whether you can safely ignore elements and attributes can be
decided at design time (after all, nobody forces you to send bleeding-edge
documents to legacy code), or you can try to enforce it at runtime. But the
general rule "ignore new elements" is a dangerous one, even if it perfectly
suits HTML.
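
A rough illustration in Python, using the standard xml.etree.ElementTree
module. The message vocabulary is made up for the occasion, and so are the
function names; only the 'just-kidding' attribute comes from my earlier
example. The lenient reader follows the "ignore what you don't know" rule,
while the strict one enforces at runtime that it understands everything.

```python
import xml.etree.ElementTree as ET

# Attributes the legacy code was designed for (hypothetical vocabulary).
KNOWN_ATTRIBUTES = {"to"}


def lenient_read(xml_text: str) -> str:
    """Legacy reader that silently ignores anything it does not know."""
    msg = ET.fromstring(xml_text)
    return f"To {msg.get('to')}: {msg.text}"


def strict_read(xml_text: str) -> str:
    """Same reader, but refusing documents it cannot fully account for."""
    msg = ET.fromstring(xml_text)
    unknown = set(msg.attrib) - KNOWN_ATTRIBUTES
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    return f"To {msg.get('to')}: {msg.text}"


doc = '<message to="Bob" just-kidding="true">You are fired</message>'

# The lenient reader delivers the joke as a serious message.
lenient_result = lenient_read(doc)

# The strict reader refuses the document instead of misreading it.
try:
    strict_read(doc)
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

Neither behaviour is satisfying on its own: the first is unsafe, the second
rejects every extension, including the harmless ones.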

>Another is whether the xml (the data) can say how it is supposed to be
>> It is about being able to
>> mix some new code that uses the new document structure with some old code
>> that uses the old document structure.
>At face value, this seems to ask that a document say how (parts of) it
>should be processed, perhaps with the help of a schema.  That's a tall
>order.  I think the best that could be done in most cases is for a document
>to point to some other document that would contain some description of the
>intended processing.  If the data language (like XML) supported doing that,
>it certainly could be said to be "extensible", and I think that such could
>be done in XML (to the extent, anyway, that you can describe process by

I don't want the data to describe how it can be processed! I'm not crazy
enough to hope that this can be easily done...

I just want the schema to describe itself relative to another, already
known schema, and dynamically provide compatibility rules to legacy
applications, so that they can read data in extended schemata as if they were
in the former schema, or forbid any usage of the document if it would be
harmful. Those compatibility rules could be expressed in various ways, such
as views (à la AF), transformations, or a type system with inheritance and
polymorphism; I don't know which is best. What I notice, however, is that
OOP provides solutions for extensibility, so it may be interesting to
have a close look at extensibility patterns in OOP before trying to solve
the problem in XML.
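
To make the idea concrete, here is one possible sketch in Python of the
"transformation" flavour of such rules. Everything in it is an assumption
made for illustration: the element names, the "ignorable"/"required"
vocabulary, and the idea of shipping the rules as a simple table. It is not
an existing mechanism of XML Schema or of any other schema language.

```python
import xml.etree.ElementTree as ET

# Hypothetical compatibility rules shipped with the extended schema.
# Each element the extension adds is declared either safe to drop
# ("ignorable") or essential to the meaning ("required").
COMPATIBILITY_RULES = {
    "nickname": "ignorable",
    "cancelled": "required",
}

# Elements defined by the former schema that legacy code understands.
BASE_ELEMENTS = {"name", "amount"}


class IncompatibleDocument(Exception):
    """Raised when no safe old-style view of the document exists."""


def old_style_view(xml_text: str) -> ET.Element:
    """Build the view a legacy application expects, or refuse."""
    root = ET.fromstring(xml_text)
    view = ET.Element(root.tag)
    for child in root:
        if child.tag in BASE_ELEMENTS:
            view.append(child)
        elif COMPATIBILITY_RULES.get(child.tag) == "ignorable":
            continue  # declared safe to drop
        else:
            # Declared "required", or not covered by any rule at all.
            raise IncompatibleDocument(child.tag)
    return view


safe = "<order><name>Ann</name><nickname>A</nickname><amount>10</amount></order>"
view_tags = [child.tag for child in old_style_view(safe)]

harmful = "<order><name>Ann</name><amount>10</amount><cancelled/></order>"
try:
    old_style_view(harmful)
    refused = False
except IncompatibleDocument:
    refused = True
```

The legacy application only ever sees the old-style view, and the decision
to ignore or refuse is taken by whoever extended the schema, not guessed by
the old code.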

>So I don't see the issue of processing after a change to the design as being
>any intrinsic barrier to extensibility.

It is not a barrier, it is a requirement...

>It's only that the
>would be achieved in a different way from what many people are used to.  But
>heck, if you want the old way, you can already serialize java objects or
>lisp code.

I don't understand what you mean here. We are asking ourselves what the 'X'
in XML is for. Obviously, there is a lack of guidelines and processing
models to achieve extensibility. Why should this extensibility be radically
different from what we have in, say, ASN.1? (I'm not an expert, but it seems
to be common practice to have some "reserved" conditional blocks to provide
extension points; that is to say, extensions are foreseen when defining
structures, they are not inserted a posteriori.) What is the "old way"
you're talking about? Is it OOP?

>We were served well by going to more datacentric approaches in the database
>world - a relational database does not say how its tables and data elements
>are to be processed once they are retrieved.  That's because data (and its
>design) is usually more stable over a long time than the processing needs.
>An example is being asked for new report types from old data.  Focusing more
>on the data aspect (the markup approach) is fully in line with this trend.
>But it may be less "efficient" than using carefully crafted procedural or
>object code, when you think about any one processing run.

That's an interesting point. When does data become code? Is type
information a piece of data, or a piece of code? Should we fear types
because they are bound to code, and therefore unstable, or use them as a
simple piece of meta-data?


>For example, a well-designed hierarchical database is generally faster than
>a relational database if the data's processing is well known and the data
>structure remains unchanged.  One challenge with using markup is to figure
>out how to get the benefits of hierarchical processing AND the advantages of
>a data-centric approach.  This can't be done using purely procedural
>Too much rambling...
>Tom P

