
Re: Picking the Tools -- Marrying processing models to data models



Al Snell wrote:

> What's that got to do with the textual encoding?

With the *encoding*, very little. With the field or element being identified
by a name rather than by a structure, very much indeed. The transform or other
process which will instantiate the locally useful form of the data will in
this case begin from an original form identified by name rather than from one
identified by its structure.

> Image processing programs will typically recognise a wide range of image
> file formats beyond their chosen native one, and some even provide tools for
> the technically advanced to pull data from bizarre image formats (I've seen
> a "raw" import dialog that asks how many bytes of header to skip, what the
> pixel format is,
> and how many pixels to read in what order). So if you see an image in a
> format other than the agreed one, it can try to figure it out.

This is usually the other case, where the original form is identified by
structure. Please bear with me if I stress once again how the structure or
datatype of that original form does not govern the form in which data is
instantiated for local processing. The original form is simply an
identification which allows selection of the input side of the transformation
or other process which instantiates locally useful data.
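
To make this concrete, here is a rough sketch in Python of form-based
identification. The format signatures and the 'pixmap' target form are
invented for illustration; the point is only that the identified form selects
the input side of the transform, while the output side is fixed in advance by
the local process:

    # Placeholder transforms; each would decode its format into the one
    # 'pixmap' form the local process expects.
    TRANSFORMS = {
        "png":  lambda data: ("local-pixmap", "decoded from png"),
        "jpeg": lambda data: ("local-pixmap", "decoded from jpeg"),
    }

    def identify_by_form(data: bytes) -> str:
        # The input's structure (here, its leading bytes) is the identifier.
        if data.startswith(b"\x89PNG\r\n\x1a\n"):
            return "png"
        if data.startswith(b"\xff\xd8\xff"):
            return "jpeg"
        return "unknown"

    def instantiate_local_pixmap(data: bytes):
        fmt = identify_by_form(data)
        if fmt not in TRANSFORMS:
            raise ValueError("no transform known for this original form")
        return TRANSFORMS[fmt](data)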

> Are you stating that the textuality and human-readable names on everything
> make it somewhat easier for a human to figure out the format

Human readability is only an ancillary advantage of the
process--identification by name, or name-based processing--which I am
describing. If we have name-based, rather than form-based, identification,
figuring out the format has nothing at all to do with it. Quite simply:
instantiating data in a form usable by local processing begins from the
realization that input is available. The immediate question then is, what is
that input? It can be identified either by its label or its form. If the local
process concludes that it is unlabeled, or that its label is incomprehensible,
then, yes, it might be identified by its form. Even when it matches no form
previously known to this process, it may be identifiable by something like the
brute-force method you describe:

> by examining examples, incrementally designing an experimental schema until
> all available samples match that schema flawlessly, then assuming that's the
> format of the data

although the 'all available samples' in this case are simply the identifiable
occurrences of particular structural patterns in the given input data
instance. In any case, the point of input identification is to establish the
'original form' side of the transform or other process which will instantiate
locally usable data in the form locally expected.
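
In rough terms (a Python sketch with stubbed-out helpers, nothing normative),
the decision procedure I am describing looks something like this:

    KNOWN_LABELS = {"order", "invoice"}      # names this process understands

    def infer_form_from_structure(data: str):
        # Stub: real code would match the instance against known structural
        # patterns and return the name of the matching form, or None.
        return None

    def infer_schema_from_samples(data: str):
        # Stub for the brute-force route: grow an experimental schema until
        # every structural pattern found in the instance matches it.
        return "experimental-schema"

    def select_original_form(label, data):
        # Identification by label first, by form only as a fallback; either
        # way the result merely names the input side of the transform.
        if label in KNOWN_LABELS:
            return label
        inferred = infer_form_from_structure(data)
        if inferred is not None:
            return inferred
        return infer_schema_from_samples(data)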

With XML there is the possibility, at least, that the input data can be
identified by name, rather than by form. If so, that name, rather than an
identified form, is the basis for the selection of the original form side of
the transform or other process which instantiates data as required by the
local process. Yet that name is no more than an abstract label, which in the
view of the local process is congruent--*via the transform or other process
required to instantiate the local data*--to some equally abstract label by
which the local process identifies its own required input data form:  'order'
as you might use the term in Djakarta to 'order' as I use the term in New
York, for example.
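
A minimal sketch of that congruence, again in Python: the element name and
the target fields are invented, and a plain dictionary stands in for whatever
form my local process actually requires:

    import xml.etree.ElementTree as ET

    # Transforms keyed by the *name* the sender gave the data. Each one
    # instantiates the form my local process expects.
    TRANSFORMS_BY_NAME = {
        "order": lambda e: {"sku": e.findtext("item"),
                            "qty": e.findtext("count")},
    }

    def instantiate_local_order(xml_text: str) -> dict:
        root = ET.fromstring(xml_text)
        transform = TRANSFORMS_BY_NAME.get(root.tag)  # identification by name
        if transform is None:
            raise ValueError("no transform known for name %r" % root.tag)
        return transform(root)

    # Your 'order' (Djakarta) becomes my locally expected form (New York):
    print(instantiate_local_order(
        "<order><item>widget</item><count>3</count></order>"))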

Respectfully,

Walter Perry