Re: Data Model(s) for XML 1.0 / XML Devcon / DOM / XSL / Query
- From: "W. E. Perry" <wperry@fiduciary.com>
- To: XML DEV <xml-dev@lists.xml.org>
- Date: Fri, 23 Feb 2001 15:39:34 -0500
Robin Cover wrote:
> Hmmm... You wrote:
>
> > By permitting an instance document to stand on
> > its own as syntax, without the expected pre-ordained
> > semantics expressed in a DTD.. XML took the decisive
> > step which SGML never had
>
> I don't understand, unless something is lurking in "expected" and/or in
> "pre-ordained".
What is lurking is the 'pre' in "pre-ordained" (which implies 'expected'). If
nothing is expected, then it is legitimate to consider only the body of the
instance document. If a content model or schema is desirable as a means to
describe the document structurally, generically, or abstractly, then it can be
derived from the instance. The traditional rationale for validation is to
ensure that the document as received corresponds to the document type
expected, as expressed by its DTD, and there are plenty of cases where it is
appropriate to examine an instance on that premise. My more radical view is
that XML is inherently eXtensible through Markup, rather than by the expansion
or relaxation of the content model to accommodate new instance expressions.
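To make that concrete, here is a rough sketch, in Python, of deriving a content model from the instance after the fact rather than declaring one in advance. The <payment> instance and the crude DTD-like notation are mine, for illustration only:

# A minimal sketch: infer a crude content model from an instance
# document instead of requiring one to be declared in advance.
# The <payment> example and its children are hypothetical.
import xml.etree.ElementTree as ET
from collections import defaultdict

def derive_content_model(xml_text):
    """Walk a well-formed instance and record, for each element
    type, the child sequences actually observed."""
    root = ET.fromstring(xml_text)   # raises ParseError if not well-formed
    model = defaultdict(set)
    for elem in root.iter():
        children = tuple(child.tag for child in elem)
        model[elem.tag].add(children if children else ('#PCDATA',))
    return model

instance = """<payment>
  <amount>100.00</amount>
  <currency>USD</currency>
</payment>"""

for tag, alternatives in derive_content_model(instance).items():
    print(tag, '->', ' | '.join(', '.join(alt) for alt in alternatives))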
Embracing well-formedness as the single criterion for accepting an instance
document, and in the workaday world thereby being obliged to process it in
whatever way your node processes documents of that class, has some surprising
consequences for those who expect the traditional logic of validation. Chief
among them is that it shifts most questions of whether to accept a document
from the parser to a more application-specific processor. Having passed basic
well-formedness syntax checking, a document or a data structure within that
document must be accepted as what it says it is: if the markup says that it
describes a <payment>, then it is a <payment> precisely because it says so.
The question at that point is whether the particular form, as well as the data
content of that <payment>, can be accurately and usefully processed by the
existing application software at that node. That is a question which can be
answered only at that node, not in the general or abstract case, and only
in light of that instance data. As is the usual case with application
software, the answer will depend upon specific error- and sanity-checking
routines determining whether that software can render a usable result from
that data in that instance environment.
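Here is a sketch of what that shift looks like in practice, again in Python: the parser supplies only the well-formedness gate, and everything else is this node's own error- and sanity-checking. The element names and the business rules are hypothetical:

# A sketch of that shift: the parser checks only well-formedness,
# and acceptance of a <payment> is decided by node-local sanity
# checks. The element names and the checks are hypothetical.
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

def accept(xml_text):
    try:
        doc = ET.fromstring(xml_text)  # the only generic gate: well-formedness
    except ET.ParseError:
        return None                    # rejected by the parser, not the application
    if doc.tag != 'payment':           # it is what it says it is; this node
        return None                    #   simply does not process that class
    # Application-specific sanity checks: can *this* node render a
    # usable result from *this* instance data?
    try:
        amount = Decimal(doc.findtext('amount', ''))
    except InvalidOperation:
        return None
    currency = doc.findtext('currency')
    if currency not in ('USD', 'EUR'): # this node's own business rule
        return None
    return {'amount': amount, 'currency': currency}

print(accept('<payment><amount>100.00</amount><currency>USD</currency></payment>'))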
In a *slightly* more generalized fashion, this is also what the PSVI is really
about: after well-formedness syntax checking, perhaps validation, document or
data structure type identification, and then demonstrated data schema
conformity, the data is rendered into an infoset which the processing software
expects. This is a valid model on which to create application software, but it
is not the only one--and often not the most efficient. It does illustrate,
though, that the procedure for building application software to a PSVI is just
that: a processing model must be determined, unwanted interaction of
components must be mediated, etc. The real question is whether we can ever
sufficiently (and reliably) generalize the operations of such software to the
point where it can be fed a single generalized
post-parse-and-other-preparations data model. I chose to work with XML because
it frees me from having to know what the processing, beyond basic
well-formedness syntax checking, might be at a remote node--which is just as
well because most of the remote nodes with which I exchange data (often
indirectly) won't tell me (keeping it secret is their business edge), and some
of them don't know who I am, nor that I am the source, or a downstream user, of
data which they consume or emit.
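For completeness, a sketch of the PSVI-style model described above: well-formedness checking, then schema validation, then rendering into the infoset the application expects. This assumes the third-party lxml library; the schema and the dictionary standing in for an infoset are illustrative only:

# A sketch of the PSVI-style model: well-formedness, then schema
# conformity, then rendering into the view the software expects.
# Assumes the third-party lxml library; the schema is hypothetical.
from decimal import Decimal
from lxml import etree

XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="payment">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="amount" type="xs:decimal"/>
        <xs:element name="currency" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))

def to_infoset(xml_bytes):
    doc = etree.fromstring(xml_bytes)  # well-formedness
    schema.assertValid(doc)            # schema conformity (raises if invalid)
    return {                           # the post-validation view the
        'amount': Decimal(doc.findtext('amount')),  # application expects
        'currency': doc.findtext('currency'),
    }

print(to_infoset(b'<payment><amount>100.00</amount><currency>USD</currency></payment>'))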
Respectfully,
Walter Perry