Re: [xml-dev] XML is text-only ... why?


From: Philippe Poulard <philippe.poulard@sophia.inria.fr>
To: "Costello, Roger L." <costello@mitre.org>
Date: Wed, 26 Sep 2007 15:53:21 +0200

Costello, Roger L. wrote:
 > [xml-dev] XML is text-only ... why?

That depends on whether you are talking about the markup representation 
(which is definitely a text representation) or about the data model.

In the data model, instead of processing the literal data ("23"), an 
application can process the typed data (the integer 23). XSLT 2.0 and 
XQuery work like this, sometimes thanks to type information given by a 
schema (xs:integer) and sometimes with implicit rules for converting 
"23" to 23.

The usual way to get it is:
XML -> parse -> validate -> augment -> data model
(most often in streaming mode)

A PSVI-aware application will see the XML items, say an attribute, with 
its text value "23" and, bound to it, a typed value, say the xs:integer 
23.
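
To make the pipeline concrete, here is a minimal Python sketch of 
parse -> validate -> augment. It is only an illustration under 
assumptions: the SCHEMA table is a hypothetical stand-in for a real 
schema, and TypedAttribute and augment() are names invented for this 
example, not part of any PSVI API.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Any

# Hypothetical stand-in for a schema: (element, attribute) -> converter.
SCHEMA = {("item", "count"): int}


@dataclass
class TypedAttribute:
    name: str
    text: str   # lexical value, exactly as written in the markup
    typed: Any  # typed value bound to the attribute during augmentation


def augment(xml_text: str) -> list[TypedAttribute]:
    """Parse the markup, check each attribute, and bind a typed value."""
    root = ET.fromstring(xml_text)  # parse
    result = []
    for elem in root.iter():
        for name, text in elem.attrib.items():
            # Attributes unknown to the "schema" keep their string value.
            convert = SCHEMA.get((elem.tag, name), str)
            typed = convert(text)  # validate: int() raises ValueError on bad input
            result.append(TypedAttribute(name, text, typed))  # augment
    return result


attrs = augment('<item count="23"/>')
```

The attribute keeps both views: attrs[0].text is the string from the 
markup, and attrs[0].typed is the integer an application can compute 
with.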

Now consider that the source of the data model is not an XML document 
(one represented with markup) but is built by another software 
component. You can bind into the PSVI other objects that are not 
derived from text, such as binary data or whatever you want. But what 
would the text value be? Something useless, say, the reference to the 
object bound to the node. So, is XML still text? At a high level, not 
really.
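
As a rough sketch of that idea (again with invented names: Node and its 
bound field are assumptions for illustration, not the RefleX API), a 
node can carry an arbitrary object and fall back to an opaque reference 
when asked for its text value:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """Hypothetical data-model node that was never parsed from markup."""
    name: str
    bound: Any = None  # any object, not necessarily derived from text
    children: list["Node"] = field(default_factory=list)

    @property
    def text(self) -> str:
        # A string is its own text value; anything else has no lexical
        # form, so the best we can offer is a reference to the object.
        if isinstance(self.bound, str):
            return self.bound
        return f"{type(self.bound).__name__}@{id(self.bound):x}"


# Binary content (the first bytes of a JPEG header) bound directly to a node.
image = Node("image", bound=b"\xff\xd8\xff")
label = Node("label", bound="23")
```

Here label.text is still meaningful, while image.text is only a handle 
to the bytes object; the useful value is image.bound.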

Is it useful? In my own experience applying this strange concept, yes: 
it is sometimes much more valuable to deal with an XML data model that 
is not necessarily representable with markup than with purely textual 
XML data:
-because you avoid costly round trips between the markup representation 
and the data model
-because you can deal with binary objects (such as a JPG image) for 
which a markup representation is irrelevant (although you could get one 
with base64 encoding, or a hexadecimal representation)
-because you only have to deal with a small amount of data (and getting 
all the markup would be inefficient)

Example: say that you have the representation of some directory of your 
file system stored in an XML-friendly object; you could operate on it 
like this:
$dir//*
$dir//*[ends-with(., 'xml')]
$dir//*[@io:size > 1024]
$dir/../some/file

$dir is not represented with markup: to produce that markup you would 
have to browse the entire file system, whereas here you operate on it 
locally.
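
For readers who want to see the shape of such an object, here is a 
small Python sketch. It is not how RefleX does it: DirNode, 
descendants() and the predicates are hypothetical names, the size 
property plays the role of @io:size, and I assume that ends-with(., 
'xml') tests the entry name. Children are listed only when the tree is 
walked, so nothing is serialized to markup.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator


class DirNode:
    """Hypothetical XML-friendly view of a file-system entry."""

    def __init__(self, path: Path):
        self.path = path

    @property
    def name(self) -> str:
        return self.path.name

    @property
    def size(self) -> int:
        # Stands in for the @io:size attribute of the examples above.
        return self.path.stat().st_size

    def children(self) -> Iterator["DirNode"]:
        # Lazily list the entries of a directory; files have no children.
        if self.path.is_dir():
            for child in sorted(self.path.iterdir()):
                yield DirNode(child)

    def descendants(
        self, predicate: Callable[["DirNode"], bool] = lambda n: True
    ) -> Iterator["DirNode"]:
        # Rough equivalent of $dir//*[predicate]
        for child in self.children():
            if predicate(child):
                yield child
            yield from child.descendants(predicate)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.xml").write_text("<a/>")
    (root / "sub" / "big.bin").write_bytes(b"\0" * 2048)

    d = DirNode(root)
    # $dir//*
    all_names = sorted(n.name for n in d.descendants())
    # $dir//*[ends-with(., 'xml')]
    xml_names = [n.name for n in d.descendants(lambda n: n.name.endswith("xml"))]
    # $dir//*[@io:size > 1024], restricted to regular files in this sketch
    big_names = [
        n.name
        for n in d.descendants(lambda n: n.path.is_file() and n.size > 1024)
    ]
```

The size query is restricted to regular files because the size that the 
operating system reports for a directory varies between file systems.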

I call this kind of object an X-operable object (cross-operable 
object), that is to say an object that is XML-friendly: you can apply 
XPath expressions to it, you can apply XUpdate-like operations to it, 
but it is not necessarily representable with markup. Since it is still 
XML, XML is not text only.

All of this was presented at Extreme Markup Languages last month, and 
you'll find more information here (the paper and the PDF presentation):
http://hal.inria.fr/inria-00173716/fr/

And of course, it is implemented in RefleX!
http://reflex.gforge.inria.fr/

Have the RefleX !

-- 
Regards,

               ///
              (. .)
  --------ooO--(_)--Ooo--------
|      Philippe Poulard       |
  -----------------------------
  http://reflex.gforge.inria.fr/
        Have the RefleX !

References:
- XML is text-only ... why?
  - From: "Costello, Roger L." <costello@mitre.org>
