


   Re: Grove diagram (was Re: The Power of Groves)

  • From: "W. Eliot Kimber" <eliot@isogen.com>
  • To: Peter Murray-Rust <peter@ursus.demon.co.uk>
  • Date: Thu, 10 Feb 2000 13:08:08 -0600

Peter Murray-Rust wrote:

> >SGML, not XML, not that that matters much.  The picture is here:
> >
> >  http://www.ltg.ed.ac.uk/~ht/grove.html
> Thanks Henry - I had just managed to find it (on XML-DEV). You beat me to it.
> Would it be simpler in XML? Since - presumably - XML can have a simpler
> property set maybe there would be fewer components in the diagram?

I don't really see how. The picture Henry produced shows the *minimum*
properties of interest, which appear to be identical for XML.

If you look at the DOM, for example, it simplifies by collapsing some
properties that are managed as node lists in the grove into simple
strings (attribute values and PCDATA content), but that useful
simplification doesn't change the reality that a complete abstract
representation has to provide the individual nodes. Note also that
the groves have a built-in "data" property for nodes that gives the
effect of the DOM's string-only shortcut.
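
A minimal sketch of the DOM side of this, using Python's standard
xml.dom.minidom module. The sample document and the attribute name
"refids" are made up for illustration; the point is only that the DOM
hands back attribute values and character data as plain strings rather
than as lists of nodes:

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
doc = parseString('<para refids="intro fig1 tbl2">See above.</para>')
elem = doc.documentElement

# The attribute value arrives as one plain string; there are no token nodes.
value = elem.getAttribute("refids")
print("value='%s'" % value)

# Character data is likewise exposed as a string on the text node.
text = elem.firstChild.data
print("text='%s'" % text)

# Any tokenization is left to the caller.
tokens = value.split()
print(tokens)
```

The split at the end recovers the token strings, but nothing in the
result is a node that something else could point at.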

For example, in the SGML/XML grove, a tokenized attribute value (one
declared as IDREFS, for example) is represented as a node list of token
nodes. This exactly reflects the syntax and semantics of attribute
values but is not very convenient when all you want is the full
attribute value string. However, if you ask for the "data" property of
the attribute node, you will get the string value, which is just what
you want.  In Python this looks like this:

# Variable elem is an element node:

atts = elem.AttSpecs  # Get the attributes specified for this element
for att in atts:
  # Get the value of the data property of the attribute specification node
  attval = att.data()
  print("value='%s'" % attval)

Using normal DPH techniques, I'd probably get tokens back by using

  if att.Name == "refids":
      ids = attval.split()  # Split the string value on whitespace
      for id in ids:
          print("id='%s'" % id)

But, if I want to *address* a particular token, the token nodes are
there and I can address them reliably.
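
To make the difference concrete, here is a toy sketch of an attribute
whose value is held as a list of token nodes. The classes TokenNode and
AttributeNode, the tokens member, and the address format are all
hypothetical, invented for this illustration; they are not the API of
any real grove implementation:

```python
# Hypothetical toy classes; not the API of any real grove implementation.

class TokenNode:
    """One token of a tokenized attribute value (e.g. one IDREF)."""

    def __init__(self, parent, index, text):
        self.parent = parent  # the owning attribute node
        self.index = index    # position within the parent's token list
        self.text = text

    def data(self):
        return self.text

    def address(self):
        # Made-up address form: attribute name plus token position.
        return "%s/token[%d]" % (self.parent.name, self.index)


class AttributeNode:
    """An attribute whose value is a list of token nodes."""

    def __init__(self, name, value):
        self.name = name
        self.tokens = [TokenNode(self, i, t)
                       for i, t in enumerate(value.split())]

    def data(self):
        # The string-only shortcut: join the data of the child nodes.
        return " ".join(tok.data() for tok in self.tokens)


att = AttributeNode("refids", "intro fig1 tbl2")
print("value='%s'" % att.data())

# Each token is an object in its own right, so it can be pointed at.
second = att.tokens[1]
print("%s -> '%s'" % (second.address(), second.data()))
```

Splitting the string gives the same token text, but only the node form
gives each token an identity that something else can refer to.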

In a DOM context you'd probably say "but nobody will ever need to do
that". Probably right, so in the *implementation*, don't bother to
expose that complexity. But a standard data model like the SGML property
set can't presume to know the full requirements of potential users, so
it has to be more complete (and therefore complex) than a purely
pragmatic specification would need or want to be.

It's important to remember that XML simplified SGML *syntax*. It did not
(and could not) simplify the underlying abstract data model (except
where the SGML data model reflects optional features, like attributes
for notations).

This is why the idea that XML is *fundamentally* simpler than SGML is a
Big Lie. The cost of entry is lower, but the total cost of ownership is
essentially the same.



