- From: Jonathan Borden <email@example.com>
- To: John Cowan <firstname.lastname@example.org>
- Date: Wed, 02 Aug 2000 16:03:27 -0400
John Cowan wrote:
> Jonathan Borden wrote:
> > [The Infoset is s]ufficient
> > for what? Sufficient for the in-scope task of the XML Infoset WG.
> > Not sufficient as a 'full fidelity' abstract description a.k.a. XML
> > Set.
> Just to give an idea of how big a job a "full fidelity" property set is,
> consider the production S of the XML Recommendation, which matches one or
> more whitespace characters (space, tab, CR, LF). There are, by my eyeball
> count, 74 instances of S in the production rules. In order to make the
> Infoset suitable for generating an exact replica of the original, *at
> least* 74 new information item properties would be required for
> whitespace alone!
I think that part of the problem may be that such a task would be difficult
using the ISO Property Set specification. Isn't one of the benefits of XML
that parsers are simple to write: the old "cut out 10% of the features to
reduce the parser complexity by 90%" argument?
One of the ways I judge whether a language is appropriate for a particular
task is by how difficult it is to write programs in it. Perhaps you have
struck the core difficulty with "Property Sets" and "Groves": they *aren't*
an easy way to specify something. Imagine the trouble with a more complex
syntax, or with binary files. Indeed, if this task is too difficult for the
XML Infoset WG, how can we expect mere mortals to adopt Property Sets?
How difficult was it to develop the RDFS model of the Infoset? Would it be
easier to extend that model to cover all of XML 1.0 + names?
Another approach might be to start with an abstract XML representation of an
XML parse tree, and define a subset via an XSLT transformation. That is:
define an RDFS of an abstract XML parse tree, then define an XSLT
transformation between the abstract XML parse tree and the RDFS defined in
the XML Infoset appendix. The reason a transformation may be required,
rather than a mere subset, is that the relationships between nodes and the
typing of nodes in an XML parse tree differ from those in the Infoset or DOM
(for example, a whitespace sequence between attributes can be represented by
a whitespace node in a parse tree). What I am getting at is that the rules
for XML encoding aren't
So perhaps you've answered our question, in that "Property Sets", while "up
to the task", are not an easy way to get things done.
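
The parse-tree idea above can be sketched roughly as follows. This is only a
minimal illustration in Python, assuming a toy grammar for a single start tag
rather than the full XML 1.0 grammar. The names TOKEN_SPEC, parse_tree,
serialize and to_infoset_view are hypothetical and come from no
specification. The "ws" token plays the role of production S, and the
Infoset-like view is a simplification (no attribute value normalization, no
namespaces).

```python
import re

# Hypothetical token kinds for one start tag; NOT the full XML 1.0 grammar.
TOKEN_SPEC = [
    ("open", r"<"),
    ("close", r"/?>"),
    ("ws", r"[ \t\r\n]+"),  # production S: one or more of space, tab, CR, LF
    ("eq", r"="),
    ("value", r'"[^<"]*"' + "|" + r"'[^<']*'"),
    ("name", r"[A-Za-z_:][-A-Za-z0-9_:.]*"),
]
TOKEN = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def parse_tree(tag):
    """Return a flat 'full fidelity' node list that keeps whitespace nodes."""
    nodes = []
    pos = 0
    while pos < len(tag):
        match = TOKEN.match(tag, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        nodes.append((match.lastgroup, match.group()))
        pos = match.end()
    return nodes


def serialize(nodes):
    """Rebuild the original text exactly from the node list."""
    return "".join(text for _, text in nodes)


def to_infoset_view(nodes):
    """Transform (not merely subset) the node list into (name, attributes).

    Whitespace, '=' and the quote characters are dropped, and each
    name/value pair is retyped as a single attribute entry.
    """
    significant = [node for node in nodes if node[0] in ("name", "value")]
    element_name = significant[0][1]
    rest = significant[1:]
    attributes = {}
    for (_, attr_name), (_, quoted) in zip(rest[0::2], rest[1::2]):
        attributes[attr_name] = quoted[1:-1]  # strip the surrounding quotes
    return element_name, attributes


sample = "<a  x=\"1\"\n\ty = '2' />"
tree = parse_tree(sample)
```

Here serialize(tree) reproduces the sample character for character, while
to_infoset_view(tree) keeps only the element name and the attribute pairs, so
the original spacing cannot be recovered from it. Going from one to the other
regroups and retypes nodes, which is why a transformation rather than a plain
subset seems to be needed.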
The Open Healthcare Group