At 02:02 PM 4/23/2003 -0400, John Cowan wrote:
>Dan Vint scripsit:
> > I swallowed hard on the idea of well-formed documents, but have learned
> > to live with that, but now not even being able to have a standard way to
> > determine if this XML file is supposed to be compliant with a DTD or
> > schema is almost too much to accept.
>It's a different model. An SGML or XML DTD is *logically inside* the
>document (even if it's physically outside through the use of entities;
>XML puts some restrictions on DTDs when physically inside, but that
>doesn't affect the point), and so validation answers the question "Is
>this document self-consistent?"
I understand the difference in the processing models, but a side effect of
the DTD approach was that I, as the creator of the document, identified how
I intended it to be processed and potentially which version of the DTD
I wanted used. Theoretically, the DTD was always an outside file and
it had a unique public identifier that I referenced for this purpose.
I was looking for a similar identification. Len talks about the
"contract" all the time; to me, just having the data without a reference to
what I intended it to conform to is not much of a contract. Also, what
happens with all this legacy stuff and loose files? OK, I have an XML
stream, but what good is that to me if I don't know what I was supposed to
manage it with? I get and build XML files all the time and NEED to have the
reference to something just so I can remember what I was working with. It
doesn't take more than a week of inactivity to forget what file Z was used for.
I was sort of looking at the targetNamespace as providing some of the
benefit of the public identifier, if you followed the practice of putting a
version number in the URL and changed it with each significant change. I
was also looking for something that would differentiate a merely well-formed
document (maybe not even built to a DTD or schema) from one that was built
for a schema. In the case of a schema-based document I was expecting a
targetNamespace, schemaLocation or noNamespaceSchemaLocation to at least flag
or trigger schema-based processing.
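To make the kind of flag I mean concrete, here is a minimal sketch, assuming
the hints sit on the root element as xsi:schemaLocation or
xsi:noNamespaceSchemaLocation. The function name schema_hints and the
example.com namespace and file name are made up for illustration; a real
processor is free to ignore these hints entirely.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes (xsi:*).
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(xml_text):
    """Return whatever schema hints the root element carries (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    hints = {}

    # xsi:schemaLocation holds whitespace-separated (namespace, location) pairs.
    loc = root.get(f"{{{XSI}}}schemaLocation")
    if loc:
        tokens = loc.split()
        hints["schemaLocation"] = list(zip(tokens[0::2], tokens[1::2]))

    # xsi:noNamespaceSchemaLocation holds a location for no-namespace content.
    no_ns = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    if no_ns:
        hints["noNamespaceSchemaLocation"] = no_ns

    return hints


# Illustrative documents: one with a versioned namespace and a hint, one bare.
hinted = (
    '<order xmlns="http://example.com/po/v2" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://example.com/po/v2 po-v2.xsd"/>'
)
bare = "<note/>"
```

An empty result only says the document carries no hint; it does not say the
document was not written against a schema, which is exactly my complaint.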
>A WXS or RNG or Schematron schema, like an architectural meta-DTD, is
>*logically outside* the document, and validation against it answers the
>question "Is this document consistent with this schema?" This entails,
>of course, that there might be more than one schema with which the
>document is consistent. That being so, there can be no exclusive means
>of referring to *the* schema against which a document is to be validated.
But how many people are working with more than one schema? Even if they are,
wouldn't it be good to come up with a method to relate all the schemas
together and have a universal identifier assigned that is then tracked in
the document? Sort of a hybrid use of a public identifier and a catalog to
manage this stuff?
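As a rough sketch of that hybrid, assuming a single identifier per document
type and version that a catalog maps to every schema that applies: the
identifier, the file paths and the names CATALOG and resolve below are all
hypothetical, and this is not the OASIS catalog format, just the lookup idea.

```python
# Hypothetical catalog: one identifier relates all schemas for a document type.
CATALOG = {
    "urn:example:purchase-order:2.1": [
        ("xsd", "schemas/po-2.1.xsd"),
        ("rng", "schemas/po-2.1.rng"),
        ("schematron", "schemas/po-rules-2.1.sch"),
    ],
}


def resolve(identifier, catalog=CATALOG):
    """Return the (schema language, location) pairs registered for an identifier."""
    # An unknown identifier yields an empty list rather than an error, so the
    # caller can fall back to well-formedness checking only.
    return list(catalog.get(identifier, []))
```

The document would then only need to carry the one identifier, and the
catalog could change where the schemas live without touching stored files.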
Maybe the big difference is that SGML/XML was built for documents, which
are intended to live longer than the nanosecond needed to send a message
across the wire, after which it is never stored or referenced in that form
again. A document is not so transient: it is stored in many places on my
hard drive and on different systems, and I will go back to some of those
files over several years (or at least days). It would have been nice if some
of this functionality had been allowed.
>Some people open all the Windows; John Cowan
>wise wives welcome the spring firstname.lastname@example.org
>by moving the Unix. http://www.reutershealth.com
> --ad for Unix Book Units (U.K.) http://www.ccil.org/~cowan
> (see http://cm.bell-labs.com/cm/cs/who/dmr/unix3image.gif)