Re: Picking the Tools -- Marrying processing models to data models

Al Snell wrote:

> Why's that different from any other way of transporting data? A comma-separated
> values file is capable of realisation as a 2D array, an SQL-esque table, a stream
> of data points logfile-style, a matrix in the mathematical sense, a raster image,
> ...

The difference is fundamental. There are two base forms for serial data interchange
formats:  one based on structural relationships, the other based on name-value
tuples. XML descends from both, which can lead to the sort of misunderstandings we
are seeing in this thread between observers who see mostly one set of characteristics
and those who see mostly the other. You use the comma-delimited form in your example,
which skews your argument toward the characteristics of the 'structural' form. Yes,
the comma-delimited file may be instantiated as an array, a matrix, a raster image,
etc., but every one of those cases demands a prerequisite structural realization of
the file format. The comma-delimited form is only one small step abstracted from the
fixed-field form--the use of an explicit delimiter permitting variable-length fields.
Yes, a structurally-complex object may be instantiated from this flat interchange
form, but only through a transform which begins from the file creator's structure. In
practice, that amounts to a two-step instantiation--first identifying the fields by
their sequential (structural) relationships, and then once those fields are restored
to the form in which their creator last understood them, transforming them into the
structure required by this particular consumer.
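
The two-step instantiation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the three-field layout, the names CREATOR_LAYOUT, identify_fields and to_consumer_form, and the sample data are all hypothetical, not taken from any real interchange format.

```python
import csv
import io

# Hypothetical layout, known only by agreement with the file's creator:
# in the 'structural' form, field position is the whole contract.
CREATOR_LAYOUT = ["sku", "quantity", "unit_price"]


def identify_fields(text):
    """Step 1: recover the creator's fields from their sequential positions."""
    rows = csv.reader(io.StringIO(text))
    return [dict(zip(CREATOR_LAYOUT, row)) for row in rows if row]


def to_consumer_form(records):
    """Step 2: transform the restored fields into this consumer's own structure
    (here, an assumed mapping of sku to line total)."""
    return {r["sku"]: int(r["quantity"]) * float(r["unit_price"]) for r in records}


sample = "A100,2,9.50\nB200,1,20.00\n"
records = identify_fields(sample)
totals = to_consumer_form(records)
```

Note that step 2 cannot begin until step 1 has succeeded, and step 1 depends entirely on knowing the creator's field order; reorder the columns and the same code silently produces wrong results.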

With interchange files of the name-value form, the identity of fields in the mind of
their creator is the given. The transform required in this case is name-based, rather
than object- (or structure-) based--returning to the theme which brought me into this
thread, when Martin Gudgin declared that having XPath be type-based rather than
name-based would be fantastic. Where the data is name-based, the transform which provides
its instantiation at each interested local node can ignore the received data
structure. Just as the 'structural' form requires the structural identification of
the data, the 'name' form requires its nominative identification.  That
identification might be effected through a standard data vocabulary shared by sender
and receiver, or through a pre-arranged transform specific to the particular
combination of sender and name (i.e. a namespaced name), or may be done through the
application of rules local and potentially unique to this receiving node (again, a
subject I have touched on earlier in this thread,
http://lists.xml.org/archives/xml-dev/200105/msg00736.html, when Gavin Thomas Nicol
described Schematron as determining datatype based on testing conformance of an
instance to a set of assertions). Having identified received data by name, through
whatever mechanism, the data-consuming node may then proceed with whatever name-based
transform of that data is appropriate to its own purposes.
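
A minimal sketch of such a name-based instantiation, assuming rules local to the receiving node, might look like this. The LOCAL_NAMES table, the function instantiate_by_name, and both sample instances are hypothetical; the point is only that two senders with different structures yield the same local form once their names are identified.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules local to this receiving node: names used by various
# senders, mapped to the names this node works with.
LOCAL_NAMES = {
    "item": "sku",
    "partNo": "sku",
    "qty": "quantity",
    "amount": "quantity",
}


def instantiate_by_name(xml_text):
    """Collect values by element name, ignoring the nesting and order
    of the received instance."""
    local = {}
    for element in ET.fromstring(xml_text).iter():
        name = LOCAL_NAMES.get(element.tag)
        if name is not None and element.text:
            local[name] = element.text.strip()
    return local


# Two senders, two different structures, the same data by name.
sender_a = "<order><item>A100</item><qty>2</qty></order>"
sender_b = (
    "<purchase><line><amount>2</amount>"
    "<detail><partNo>A100</partNo></detail></line></purchase>"
)

from_a = instantiate_by_name(sender_a)
from_b = instantiate_by_name(sender_b)
```

Here the received structure is never consulted; only the mapping of names matters, and that mapping belongs to the receiver rather than to either sender.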

The larger point, to which I feel obliged to return, is this:  the instantiation of
any XML instance to a form locally usable by a processing node neither occurs by
magic nor is inherent in the form or content of that instance; a *process* is always
required. In a sufficiently-controlled environment, where the sender and receiver of
the XML instance share structural assumptions about the form in which this particular
data should be realized, instantiation by datatype or by other non-textual
characteristics of the data is feasible. However, as the differences between sender
and receiver increase, it is commensurately the textual nature of XML which allows
for an abstraction of the instance sufficient to bridge the differences in detailed
interpretation of the data between the two parties. 'Order', 'price', or even
'transaction' are abstractions, but if I can identify that your data is an
'order'--albeit of your form, which is alien to me--I may be able to instantiate it as
an 'order' of a form I can process.


Walter Perry