OASIS Mailing List Archives



Re: SAX2 ... missing features?

> > > * DTDInputSource
> > > An application can set this property to provide a DTD.  ...
> >
> > This one is interesting because it clearly can't be layered:  swapping DTDs
> > means changing entity and attribute declarations, which affect the view of
> > content produced by parsing the body.
> >
> > Though I don't think "InputSource" is the right model, since it doesn't
> > support use of internal subsets ... One really wants the three components of a
> > DTD:  root name decl, "external subset" system ID (and maybe public
> > ID), and internal subset.
> The interface you suggest is the one you employed for your ValidatorConsumer
> [1] which you mentioned in a post yesterday.

OK, so given a choice I'm consistent ... :)

> Like all things, the most appropriate interface depends on the context in
> which it will be used.  But on balance I think the InputSource approach is
> most flexible.  My reasons for this are:-
> - Specifying the root name decl can be problematic when validating multiple
> documents of different types.  

That decl is one of the three parts of the Document "Type" Declaration though.

>    Our XML Validator [2] enables the user to
> specify a DTD URL on the command line as well as a list of files to
> validate.  The xml files may contain different root elements, yet they could
> all be valid with reference to the supplied DTD.  

No they can't: without the root name decl, nothing tests the Root Element
Type VC, which is part of the DTD.  (Not of the external or internal subset,
but of the whole DTD!)
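
To make the Root Element Type VC concrete, here is a minimal Java sketch that compares the name given in a document type declaration with the document's actual root element. It assumes a SAX2 parser that supports the standard lexical-handler property (the JDK's bundled parser does). The class name RootTypeCheck and the method rootMatchesDoctype are hypothetical, chosen only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical helper: records the declared root name and the actual root element. */
final class RootTypeCheck extends DefaultHandler2 {
    private String declaredRoot; // name from <!DOCTYPE name ...>, if any
    private String actualRoot;   // qualified name of the first element seen

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        declaredRoot = name;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (actualRoot == null) {
            actualRoot = qName; // only the first start tag is the root
        }
    }

    /** Returns true only if a doctype declaration exists and names the root element. */
    static boolean rootMatchesDoctype(String xml) {
        try {
            RootTypeCheck handler = new RootTypeCheck();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property ID for receiving startDTD/endDTD events.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.declaredRoot != null && handler.declaredRoot.equals(handler.actualRoot);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }
}
```

A document whose doctype says "a" but whose root element is "b" fails this check even when the DTD declares both element types, which is why files with different root elements cannot all be valid against one complete DTD.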

> - For packaging reasons, applications may want to keep a private, in-memory
> copy of an entire DTD.  The InputSource approach allows this to be passed to
> the xml parser as a StringReader.  Your approach also allows this, but in a
> more restricted way.  

I don't see how supporting the _full_ DTD functionality can ever
be "more restricted" than providing part of it (only external subsets).
But it's morning, maybe I'm being dense.

> - The systemId/publicId ultimately resolve to an InputSource anyway.  By
> providing the InputSource directly, the application is short-circuiting the
> EntityResolver.  I believe this is a reasonable thing to do, but I'm open to
> arguments as to why this may be inappropriate.

Well, for the "external subset" portion of a DTD, one could discuss
such things.  Not for the "internal subset" (which you didn't address
in your response), or the declaration of the root element type (which
you "can't see any sense at all in").

If one really wants a single object to pass in to the parser to describe
the DTD, it should include all three parts of the DTD.
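
As a rough illustration of such a single object, here is a minimal Java sketch that carries all three parts. The class DtdDescription and its methods are hypothetical names, not part of SAX or of any proposal in this thread. For simplicity it assumes the identifiers contain no double quotes.

```java
/** Hypothetical value object holding the three parts of a document type declaration. */
final class DtdDescription {
    private final String rootName;       // declared root element type (required)
    private final String publicId;       // public ID of the external subset, may be null
    private final String systemId;       // system ID of the external subset, may be null
    private final String internalSubset; // text between '[' and ']', may be null

    DtdDescription(String rootName, String publicId, String systemId, String internalSubset) {
        if (rootName == null || rootName.isEmpty()) {
            throw new IllegalArgumentException("root name is required");
        }
        if (publicId != null && systemId == null) {
            // XML's doctype syntax requires a system ID whenever a public ID is given.
            throw new IllegalArgumentException("public ID requires a system ID");
        }
        this.rootName = rootName;
        this.publicId = publicId;
        this.systemId = systemId;
        this.internalSubset = internalSubset;
    }

    /** Renders the equivalent declaration; assumes the IDs contain no '"' characters. */
    String toDoctypeDeclaration() {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(rootName);
        if (publicId != null) {
            sb.append(" PUBLIC \"").append(publicId).append("\" \"").append(systemId).append('"');
        } else if (systemId != null) {
            sb.append(" SYSTEM \"").append(systemId).append('"');
        }
        if (internalSubset != null) {
            sb.append(" [").append(internalSubset).append(']');
        }
        return sb.append('>').toString();
    }

    /** True if the given element name satisfies the declared root element type. */
    boolean rootMatches(String actualRootName) {
        return rootName.equals(actualRootName);
    }
}
```

For example, new DtdDescription("doc", null, "doc.dtd", null) describes a document type with only an external subset, while passing a non-null internalSubset adds the declarations that a bare InputSource for the external subset cannot express.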

- Dave