
Re: PSVI

From: Charles Reitzel <creitzel@mediaone.net>
To: xml-dev@lists.xml.org
Date: Mon, 05 Mar 2001 10:50:32 -0500 (EST)

On Fri, 02 Mar 2001 09:42:47 -0700, Uche Ogbuji wrote:

>Parsing should be parsing and there should be a "canonical" 
>infoset as a result.

and

>If there were a clean and well-mapped pipeline of XML
>processing I think a lot of the concern over intertwined 
>yet out-of-sync specs and maybe even PSVI infiltration 
>could be satisfactorily resolved.

Couldn't agree more.  It also looks like a lot of the details have already
been discussed.  I see a pent-up demand for (what I think of as) an XML-only
DOM, a.k.a. the original Infoset.  I know I was disappointed when the WG
backed off and included only known overlaps, though I appreciate why they
did what they did.  Call it Groves, if you like.  Back when folks like Paul
Prescod and Eliot Kimber were clamoring for such a thing, I didn't
understand the need.

I liked Simon's layering:

    standalone
        => instance+DTD 
            => instance+DTD+Schema

where
    instance includes:
        namespace declarations

    DTD adds:
        #FIXED values,
        default attribute values,
        Entities
        DTD validation

    Schema adds:
        default+fixed attribute values
        default+fixed element values
        datatypes! ( conversion, null/not null, 
                     equality, ordering, ... )
        schema validation

I'm sure I missed some.  I don't see why the original standalone="yes|no"
doesn't continue to do the job.  If standalone="no", then DOCTYPE,
schemaLocation, namespace URIs and/or RDDL can point to the necessary extra
ingredients.  These additional pieces may or may not be retrieved from
local, possibly secure, caches.

Yes, it all adds to a well-formed (WF) instance.  Essentially, what standard
layering buys you is knowing what is or is not included in that instance.
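
To make the layering concrete, here is a rough sketch of how each layer
might fill in attribute values for one element.  It is only a toy model, not
how any real parser or schema processor is implemented: the function name
effective_attrs, the attribute names and the rule that an earlier layer wins
over a later one are all assumptions made for illustration.

```python
from typing import Dict

Attrs = Dict[str, str]


def effective_attrs(instance: Attrs, *layers: Attrs) -> Attrs:
    """Return the attributes a client would see after applying each layer.

    Values written in the instance always win.  Each layer of defaults only
    fills in names that are still missing (assumed precedence: DTD defaults
    are applied before schema defaults).
    """
    result = dict(instance)
    for defaults in layers:
        for name, value in defaults.items():
            result.setdefault(name, value)
    return result


# Hypothetical element with one attribute written in the instance itself.
instance = {"id": "42"}
dtd_defaults = {"units": "cm"}                         # from an ATTLIST default
schema_defaults = {"units": "mm", "precision": "2"}    # from schema defaults

standalone_view = effective_attrs(instance)
dtd_view = effective_attrs(instance, dtd_defaults)
schema_view = effective_attrs(instance, dtd_defaults, schema_defaults)

for label, view in [("standalone", standalone_view),
                    ("instance+DTD", dtd_view),
                    ("instance+DTD+Schema", schema_view)]:
    print(label, sorted(view.items()))
```

The point is that three clients stopping at three different layers see three
different sets of attributes for the same bytes, unless they agree on which
layer they are looking at.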

At 05:45 PM 3/1/01 -0500, David Megginson wrote:
>My preference is to include all defaulted information in
>production instances and not to include a DOCTYPE 
>declaration or schema link at all -- that's the only way 
>to ensure that all clients see the same document, and that 
>no one does anything stupid.

In the absence of some consensus on how the layers add up, this is a
sensible, if defensive, approach.  I do it, too.  While it may sometimes be
suitable to resolve all DTD+schema data into production instances, it will
not always be desirable.  In RPC applications, schemas will allow
significant instance size reduction.  E.g., embedding schema data type
information in instance docs can be avoided.

In RDBMS-speak, such "denormalization" causes "update anomalies".  The right
approach will depend on the life cycle of the documents in question.
Standard layering lets us be a bit less defensive and eases instance
maintenance.
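
A small sketch of the update anomaly I have in mind, again as a toy model
with made-up names (resolve, currency, amount) rather than any real API: an
instance that had the defaults copied into it keeps the old value after the
schema default changes, while a compact instance picks up the new one the
next time it is resolved.

```python
from typing import Dict

Attrs = Dict[str, str]


def resolve(instance: Attrs, defaults: Attrs) -> Attrs:
    """Copy defaults into an instance; values already present are kept."""
    merged = dict(defaults)
    merged.update(instance)
    return merged


# Version 1 of a hypothetical schema supplies a default currency.
defaults_v1 = {"currency": "USD"}
compact = {"amount": "10"}

# "Production" instance with all defaults resolved and stored (denormalized).
resolved = resolve(compact, defaults_v1)

# The schema default later changes.
defaults_v2 = {"currency": "EUR"}

# The compact instance sees the new default when it is resolved again...
compact_now = resolve(compact, defaults_v2)

# ...but the stored, fully resolved copy still carries the old value.
stale = resolved["currency"] != compact_now["currency"]
print(resolved, compact_now, stale)
```

Whether the stale copy is a bug or exactly what you want depends on the life
cycle of the document, which is the trade-off described above.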

take it easy,
Charles Reitzel
