OASIS Mailing List Archives

   RE: [xml-dev] Penance for misspent attributes


The comments on lack of arrays came from discussions 
in the X3D design.  The X3D designers needed to remain 
consistent with the original VRML design.  I pushed that 
over to XML-Dev, but the thought wasn't original to me.

We have locals who have assured programmers that 
if they stick with elements, they won't have to learn 
about attributes because "they can't handle them".  In 
another bay, we have the "DOM-only" programmers who 
believe that DOM is The Way and that SAX shouldn't 
be bothered with.   We have marketing types who 
have assured customers that they can safely edit 
the XSLT without worry.   Then we have that report 
of a program using XSLT, all elements, no SAX, and 
all DOM that seems to be slowing down a system.

Experience is more expensive than books.

The reinvention seems to satisfy the urge of people 
to put their names on things.  The web spawned the 
15 second celebrity phenomenon.  It warps deep and 
careful thinking.

len


From: Arjun Ray [mailto:aray@nyct.net]

"Bullard, Claude L (Len)" <clbullar@ingr.com> wrote:

|> [1] http://www.sgmlsource.com/history/AnnexA.htm

The other stuff in the SGML History Niche is worth reading too:

 http://www.sgmlsource.com/history/

| One might be called a Markup Luddite for adhering to what it describes, 
| but that is indeed the thinking that spawned the seminal work in markup 

And, IMHO, right on the money.  There's also a difference between being a
Markup Luddite and an ISO8879 Luddite.  ISO8879 gets a number of niggling
details wrong (e.g. data content notations for elements and the primary
status of ID attributes), but for the most part there are a number of
powerful concepts in SGML that the wholesale (NIH-driven?) impulse to
reinvent in the XML world is unnecessarily neglecting.

Much of that, I think, has to do with the generally underacknowledged
impact of existing software on ways of thinking.  If the software doesn't
support something, the tendency is to ignore it or to find workarounds,
and in any case, not to investigate too closely what it fails to support.
The downstream effect is an implicitly constrained approach to design
issues where everything is organized around what the software does happen
to support, only.  For instance, Simon wrote:

: A lot of people have been storing data in attributes rather than in
: element content.  There are a lot of reasons for this, ranging from a more
: compact form to simpler processing in SAX.  (Attributes are presented as
: a convenient group, while you have to wait for child elements)

Hidden here is (IMHO) a deep critique of SAX as a useful API.  Using my
parsing theory parallel again, SAX is basically a (f)lex, a generator of
"interesting" tokens.  There is no yacc/bison analogue to *complete* the
parsing process, in particular to realize this statement from Annex A:

>   If, as postulated, descriptive markup like this suffices for all 
>   processing, it must follow that the processing of a document is a 
>   function of its attributes.

The missing piece is attributes to drive context sensitive processing.
The GI is not the only relevant attribute!  (On the contrary, it's the one
that has *already* been used to drive the order of "events", seen in the
fact that SAX guarantees element-start and element-end notifications in
proper stack order.)  Using my parsing techniques parallel once again,
there are two aspects to this - roughly corresponding to the needs of
bottom-up and top-down processing.
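
To make the lexer comparison concrete, here is a minimal sketch using
Python's standard xml.sax module (my choice of language, not the
poster's).  The TokenLogger class and the sample document are
hypothetical.  It only shows that the callbacks amount to a flat token
stream in which attributes arrive as one group with the start tag, while
child element content turns up later as separate events.

```python
import xml.sax


class TokenLogger(xml.sax.ContentHandler):
    """Hypothetical handler: records each SAX callback as a flat token."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def startElement(self, name, attrs):
        # All attributes of the element are available in this single call.
        self.tokens.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Ignore whitespace-only chunks to keep the token list short.
        if content.strip():
            self.tokens.append(("text", content.strip()))

    def endElement(self, name):
        self.tokens.append(("end", name))


# Made-up document: x and y as attributes, label as a child element.
DOC = b'<point x="1" y="2"><label>origin</label></point>'

logger = TokenLogger()
xml.sax.parseString(DOC, logger)
```

Nothing in the stream says what a token means to its parent; that part
is left entirely to the application.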

First, bottom-up, actions are normally associated with reductions.  The
element-end event in SAX actually conflates two separate semantic needs:
an element handler completing its own processing, and assimilation of the
completed element in the context of its parent - where often the parent
needs a *semantic summation* of the child ("synthesized" attributes).
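
A rough sketch of that split, again in Python's xml.sax and under my own
assumptions: the element names (order, group, item), the qty and price
attributes, and the reduce() method are all invented for illustration.
The end-element callback is divided into the two steps described above,
with an explicit stack carrying the synthesized values upward.

```python
import xml.sax


class SynthesizingHandler(xml.sax.ContentHandler):
    """Hypothetical handler that passes a summary of each child to its parent."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one frame per currently open element
        self.result = None   # summary of the document element

    def startElement(self, name, attrs):
        self.stack.append(
            {"name": name, "attrs": dict(attrs.items()), "children": []}
        )

    def endElement(self, name):
        frame = self.stack.pop()
        # Step 1: the element completes its own processing (the reduction).
        value = self.reduce(frame)
        # Step 2: the parent assimilates the child's summary.
        if self.stack:
            self.stack[-1]["children"].append(value)
        else:
            self.result = value

    def reduce(self, frame):
        # An item is summarized from its own attributes; any other
        # element is summarized as the sum of its children's summaries.
        if frame["name"] == "item":
            return int(frame["attrs"]["qty"]) * int(frame["attrs"]["price"])
        return sum(frame["children"])


DOC = (b'<order><item qty="2" price="5"/>'
       b'<group><item qty="1" price="3"/></group></order>')

handler = SynthesizingHandler()
xml.sax.parseString(DOC, handler)
```

For this input handler.result ends up as 13.  The bookkeeping in
endElement is exactly what every application has to write by hand,
because the API itself offers no place for the child's value to go.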

Second, top-down, context sensitive processing often means invoking child
handlers according to inherited state.  If you're going to get useful
semantic information on the way back up, you'll need to pass parameters on
the way down.  This, I submit, is the essential function of SGML/XML
attributes: parametrization of how child element processing is invoked. 
Generally, this requires inspection of attributes in an element-start
handler to dispatch appropriate processing, and also points to the wisdom
of SGML elaborating on *types of names* in the concept of declared value,
because names are *the* way to signify or denote requirements.
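
A small sketch of the top-down half, under stated assumptions: the mode
attribute, its two values, and the RENDERERS table are hypothetical, and
Python's xml.sax stands in for whatever API one prefers.  Each
element-start inspects the attributes and either overrides or inherits
the parent's setting, and that inherited state decides which function
processes the text beneath it.

```python
import xml.sax

# Hypothetical processing modes, selected by name via a "mode" attribute.
RENDERERS = {"upper": str.upper, "plain": lambda s: s}


class InheritingHandler(xml.sax.ContentHandler):
    """Hypothetical handler that dispatches on attribute-driven inherited state."""

    def __init__(self):
        super().__init__()
        self.modes = ["plain"]   # inherited state, one entry per open element
        self.out = []

    def startElement(self, name, attrs):
        # A child inherits its parent's mode unless its own attribute
        # overrides it.
        self.modes.append(attrs.get("mode", self.modes[-1]))

    def characters(self, content):
        if content.strip():
            render = RENDERERS[self.modes[-1]]
            self.out.append(render(content.strip()))

    def endElement(self, name):
        self.modes.pop()


DOC = (b'<doc><sec mode="upper"><p>loud</p>'
       b'<p mode="plain">quiet</p></sec><p>after</p></doc>')

handler = InheritingHandler()
xml.sax.parseString(DOC, handler)
```

The same element type (p) is processed three different ways here, and
the GI alone would not have been enough to tell them apart.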

[Strongly typed languages like Java are somewhat at a disadvantage here.
The element-start and element-end callbacks have 'void' signatures because
only generic (and thus useless) Objects can round-trip through the parser
back to the application.  Otherwise, one could imagine dispensing with the
setContentHandler() interface, and having stuff like this (in pseudo-code)

  interface  ContentHandler {
       ContentHandler  elementStart ( <start-tag-stuff> ) ;
       UsefulObject    elementEnd ( ) ;
       ContentHandler  childFinished ( <childref>, usefulObject ) ;
       ContentHandler  text ( data ) ;
  }

where each element gets to set the handlers for its children (which it
will decide based on - what else - attributes) and also receive a "child
is done" event, on par with character data notifications, where it gets
another chance to direct things - as arguably it should.]
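
For what it's worth, the pseudo-code can be approximated on top of an
ordinary SAX parser.  What follows is a sketch only, in Python's xml.sax,
and not a claim about how such an API should really look: NodeHandler,
Driver, Number, Total, and the type attribute are all hypothetical names,
and the snake_case methods are my rendering of the four operations in the
pseudo-code.  The adapter keeps a stack of handlers so that each element
picks the handler for its children and is told when a child is done.

```python
import xml.sax


class NodeHandler:
    """Hypothetical per-element handler mirroring the pseudo-code."""

    def element_start(self, name, attrs):
        return self          # handler to use for the child element

    def text(self, data):
        return self          # handler to continue with

    def child_finished(self, name, value):
        return self          # handler to continue with

    def element_end(self):
        return None          # this element's summary for its parent


class Driver(xml.sax.ContentHandler):
    """Adapts SAX's void callbacks to the value-returning interface."""

    def __init__(self, root):
        super().__init__()
        self.stack = [root]  # stack[-1] handles the currently open element

    def startElement(self, name, attrs):
        self.stack.append(self.stack[-1].element_start(name, attrs))

    def characters(self, content):
        self.stack[-1] = self.stack[-1].text(content)

    def endElement(self, name):
        value = self.stack.pop().element_end()
        self.stack[-1] = self.stack[-1].child_finished(name, value)


class Number(NodeHandler):
    """Parses its text content as an integer in the given base."""

    def __init__(self, base):
        self.base, self.chunks = base, []

    def text(self, data):
        self.chunks.append(data)
        return self

    def element_end(self):
        return int("".join(self.chunks).strip(), self.base)


class Total(NodeHandler):
    """Chooses each child's handler from its attributes and sums the results."""

    def __init__(self):
        self.total = 0

    def element_start(self, name, attrs):
        if name == "n":
            return Number(16 if attrs.get("type") == "hex" else 10)
        return self

    def child_finished(self, name, value):
        if name == "n":
            self.total += value
        return self

    def element_end(self):
        return self.total


DOC = b'<list><n type="int">4</n><n type="hex">ff</n></list>'
root = Total()
xml.sax.parseString(DOC, Driver(root))
```

After the parse root.total is 259.  Being dynamically typed, Python
sidesteps the typing problem mentioned above; in Java the return type of
the element-end operation would presumably need generics or casts.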

But critiquing SAX is not to single it out for blame (if any).  The fault
is systemic, extending to things like the Infoset.  I think it was you,
Len, who had reservations way back, when the Infoset didn't have lists
(arrays).   And lo, even in XSLT getting the list of tokens in a NAMES or
NMTOKENS or ENTITIES attribute needs an extension function!  It ain't
there, so people won't use it, don't use it, and never find out better.
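
By way of contrast, once the attribute value is in the hands of a
general-purpose language, recovering the list takes one call.  A minimal
sketch in Python's xml.sax follows; the fig element, the targets
attribute, and the TokenListHandler class are made up.  Note that without
a DTD the parser has no idea the attribute is a token list, so the
application has to know which attributes to split.

```python
import xml.sax


class TokenListHandler(xml.sax.ContentHandler):
    """Hypothetical handler that splits one named attribute into tokens."""

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name
        self.lists = []

    def startElement(self, name, attrs):
        if self.attr_name in attrs:
            # str.split() with no argument splits on runs of whitespace,
            # which matches how a list of name tokens is delimited.
            self.lists.append(attrs[self.attr_name].split())


DOC = b'<fig targets="a1  b2 c3"/>'

handler = TokenListHandler("targets")
xml.sax.parseString(DOC, handler)
```

Here handler.lists holds a single three-item list.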

Attribute-based processing has yet to happen in the XML world, and without
it, much of the point of generalized markup is lost. 




 


Copyright 2001 XML.org. This site is hosted by OASIS