xml-dev - What is XML For?


To: tblanchard@mac.com
Subject: What is XML For?
From: Paul Prescod <paul@prescod.net>
Date: Wed, 23 Oct 2002 12:32:54 -0700
Cc: xml-dev <xml-dev@lists.xml.org>
References: <CD4A2C2C-E65F-11D6-9DF2-0030657E2F34@mac.com>
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.1) Gecko/20020826

tblanchard@mac.com wrote:
> 
> On Tuesday, October 22, 2002, at 11:15  PM, Paul Prescod wrote:
> 
>> No, the structure of the request and the response is generally 
>> completely unrelated to the underlying structure of the data store. In 
>> fact, the structure of the request and the response is ideally an 
>> international standard like SVG, RSS, HRML, ebXML etc. This structure 
>> is unrelated to the databases you use to structure it.
> 
> 
> Can't be true.
> 
> The request must name elements in the underlying data store and 
> enumerate at least part of their relationship and optionally specify 
> the format of the output. You both expose underlying store structure in 
> a request and specify.
> 
> SELECT t1.first_name, t1.last_name, t2.country_name FROM t1 user, t2 
> locale WHERE t1.locale_code = t2.locale_code;
> 
> If I see enough of these queries I can fully deduce the underlying 
> structure of the data store. 

You keep talking about queries but I am not. I'm talking about XML. A 
_request_ is submitted using either pure HTTP or HTTP+XML. That request 
is expressed in terms of business objects, not elements or attributes or 
anything to do with the physical representation of the data (either in a 
database or on the wire). Here's the canonical (silly) example:

-->
<getStockQuote company="MSFT"/>

<--
<quote>15</quote>

Now tell me, is there a relational or object database behind that? What 
protocol is the server using to keep in contact with the stock exchange? 
What do you know, other than that there exists a company called "MSFT" 
for which stock quotes can be requested?

(as an aside, I would _never_ do a stock quote that way for a variety of 
reasons that are beyond the scope of this discussion)
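
To make that concrete, here is a minimal sketch in Python. It assumes two invented backends (an in-memory dictionary and a list of tuples standing in for database rows); the function names and the stored values are hypothetical and not part of any real service. The point is only that the same request yields the same response from either one, so the response tells you nothing about the store.

```python
import xml.etree.ElementTree as ET

# Hypothetical backend 1: an in-memory dictionary.
def quote_from_dict(company):
    prices = {"MSFT": 15}
    return prices[company]

# Hypothetical backend 2: tuples standing in for rows from a database.
def quote_from_rows(company):
    rows = [("MSFT", 15), ("IBM", 20)]
    for symbol, price in rows:
        if symbol == company:
            return price
    raise KeyError(company)

def handle(request_xml, backend):
    """Answer a <getStockQuote/> request using whichever backend is plugged in."""
    request = ET.fromstring(request_xml)
    if request.tag != "getStockQuote":
        raise ValueError("unsupported request: " + request.tag)
    quote = ET.Element("quote")
    quote.text = str(backend(request.get("company")))
    return ET.tostring(quote, encoding="unicode")

request = '<getStockQuote company="MSFT"/>'
response_a = handle(request, quote_from_dict)
response_b = handle(request, quote_from_rows)
print(response_a)
print(response_b)
```

Both calls produce the same document, and nothing in it says which backend answered.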

 > ...  They are more than related - they are
> tightly coupled.   XSL transformation rules are similar.  I can deduce 
> much of the structure of the underlying XML document from the XSL style 
> sheet. 

The XML document is a _representation_ of the underlying data. It could 
be (and usually is) generated by a Turing complete process. You know 
nothing about the underlying data except what the process chooses to 
show you.

 > ...
> Now, if you want to argue that the XML being described is removed from 
> the underlying datastore that produced it.  Sure.  Two transformations 
> produce some flexibility but any one transformation is brittle with 
> respect to the datastore on either side of it.

Obviously. What's your point? You need an output for a transformation. 
An XML document is a very convenient output. A SQL view is a very 
inconvenient output. You asked about the difference between SQL and XML 
and there's your answer.

>...
> So you want an interchange format.

Bingo! That's what XML is.

> ...  There's nothing wrong with that.  
> I'll go so far as to say that a universal object interchange format is a 
> good thing.  A way to serialize object networks that is useful for data 
> interchange is a good thing.

You are presuming that people want to interchange objects. But they 
don't always (or even usually) want to interchange objects. They want to 
interchange data. Some nodes will represent the data as objects. Some 
nodes will represent it as lists of lists. Some nodes will represent it 
as hashtables. That's loose binding.
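
A rough sketch of loose binding, again in Python. The <quotes> document, the Quote class and the IBM value are all assumptions made up for the illustration. Three receivers read the same document and each binds it to the structure it prefers.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical interchange document; the vocabulary is invented here.
DOC = """<quotes>
  <quote company="MSFT">15</quote>
  <quote company="IBM">20</quote>
</quotes>"""

@dataclass
class Quote:
    company: str
    price: int

root = ET.fromstring(DOC)

# One node thinks in objects...
as_objects = [Quote(q.get("company"), int(q.text)) for q in root.iter("quote")]
# ...another in lists of lists...
as_lists = [[q.get("company"), int(q.text)] for q in root.iter("quote")]
# ...and another in hashtables.
as_hashtable = {q.get("company"): int(q.text) for q in root.iter("quote")}

print(as_objects)
print(as_lists)
print(as_hashtable)
```

The sender never has to know which of these the receiver chose.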

> But XML doesn't look anything like what you would get if you were to 
> serialize a network of objects.  It's quite different.  Worse, the object 
> to XML serialization scheme is open to interpretation in a big way.  (I 
> know attributes vs elements is an oldie - and I don't want to respark 
> debate). The mapping between XML and say ER models isn't very clean or 
> well defined. 

Right again. That's the beauty of it! When I see an XML document I 
shouldn't know or care whether the node that produced it was thinking in 
terms of ER or objects or lists or hashtables or tuples or ...

> ...  Too many homegrown ideas running around and the mapping 
> problem makes the so-called Object-Database impedance mismatch look 
> minor by comparison.

If you consider impedance mismatches a problem, then yes, XML causes a 
huge problem. If you consider them an opportunity to use the best 
techniques for a particular job, then XML makes more sense. Relational 
databases are kick-ass for holding a kind of information. Objects are 
wonderful for in-memory representation of that information. Objects and 
relations have an impedance mismatch because they are solving different 
problems. XML and objects have an impedance mismatch because they are 
solving different problems.

>...
> This issue has been solved other ways.  Any ER or OO information model 
> can be completely described using arbitrary nestings of Maps 
> (Dictionaries for Smalltalkers), Lists, and Strings.  If you like you 
> can provide a little extra support for automatic coercion for numbers 
> and dates.  There is absolutely no piece of information that can't be 
> represented with this small set of primitives.  NextSteppers call these 
> PLists (property lists) and it's really easy to rebuild an object/data 
> network from a plist.  In fact, it's easy to automate.

Now I know the origin of those horrifically silly, dumbed-down data 
structures on my Macintosh hard drive. (I got quite a kick out of them.) 
Anyhow, plists are so 1990s. The modern alternative to XML is YAML. You 
should join the mailing list; you'll find many people who feel as you do.

What are the schema, addressing and transformation languages for plists?

Also, consider that plists are typically not shared between many 
different programs. Is there any plist-vocabulary that has _hundreds of 
tools_ around it like XHTML, RSS, SVG or DocBook?

The more vocabularies are shared, the harder it becomes to extend or 
modify them without breaking any particular application. Which is to say 
that in the types of applications XML is designed for, there is _almost 
always_ an impedance mismatch between the _shared representation_ of the 
data and the _in-memory_ representation that any particular application 
uses. This is simply because different programmers solve different 
problems in different ways!

Because plists are typically not shared, there is no need for a schema 
language for them and seldom an impedance mismatch between them and 
their applications. Which is to say, they are not in the same problem 
space as XML.

> I don't see the same thing with XML.  Instead we get the DOM - which 
> doesn't look like any other datamodel used to represent information ever 
> used in the history of information modeling - oh yeah - except for 
> HTML-ish syntax.  

The DOM is an in-memory representation of a serialization format. It is 
not the representation your application would typically use at runtime 
(unless your application is an XML manipulation tool).
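
As an illustration, here is a small Python sketch using the standard library's minidom. The company attribute on <quote> and the price_by_company dictionary are assumptions added for the example. The DOM gives you elements, attributes and text nodes; the application then copies what it needs into its own structure and stops caring about the DOM.

```python
from xml.dom import minidom

# Hypothetical response document; the "company" attribute is invented here.
dom = minidom.parseString('<quote company="MSFT">15</quote>')
node = dom.documentElement

# DOM view: an element name, an attribute value and a text node, all strings.
dom_view = (node.tagName, node.getAttribute("company"), node.firstChild.data)

# Application view: whatever structure suits the program, here a dict of ints.
price_by_company = {node.getAttribute("company"): int(node.firstChild.data)}

print(dom_view)
print(price_by_company)
```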

> ... But HTML was a hack.  Now the hack is enshrined.

Smalltalk and Nextstep had from 1978 to 1998 to take over the world. 
Somehow they failed but XML succeeded. Perhaps you should think through 
the reasons for that.

>...
> No, that would be card image format (line oriented if you like) ascii.  
> The next one would be RFC 822 messages. 

Yes, XML could be considered a successor to those standards...of course 
it is substantially more powerful in its handling of character sets, 
hierarchy and links.

 >  ...  And XML is not yet universally
> used - it's only universally buzzed about.  I have yet to adopt it for 
> anything practical at all because I'm still walking around and around it 
> saying to myself "what could have possibly led them to THIS?"

You'd be hard pressed to find a computer today that does not ship with 
an XML parser used pretty close to the operating system level. Plists on 
Macintosh. Driver information on modern versions of Windows. Package 
management on many Linux platforms. etc.
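
The plist case is easy to check for yourself. A small sketch, assuming Python's standard plistlib module and a made-up settings dictionary: the property list is written out as XML, and a generic XML parser that knows nothing about plists reads it without complaint.

```python
import plistlib
import xml.etree.ElementTree as ET

# Hypothetical settings; the keys and values are invented for illustration.
settings = {"name": "example", "tags": ["a", "b"]}

# plistlib writes the XML plist format by default.
xml_bytes = plistlib.dumps(settings)

# A generic XML parser can walk the result like any other document.
root = ET.fromstring(xml_bytes)
first_key = root.find("dict/key").text

# And the plist reader gets the original maps, lists and strings back.
round_trip = plistlib.loads(xml_bytes)

print(root.tag, first_key)
print(round_trip)
```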

  Paul Prescod
