I've been doing more thinking about the PSVI and poking around at ASN.1
and Web Services. Then I went outside to work on some drainage ditches,
and had a few more ideas.
It's striking me more and more that developers, myself included, have
done a poor job of examining and explaining how markup works and what
the parts do best. That extends to a key discussion which is generally
considered dull but radioactive: the elements/attributes distinction.
A lot of people have been storing data in attributes rather than in
element content. There are a lot of reasons for this, ranging from a
more compact form to simpler processing in SAX. (Attributes are
presented as a convenient group with the start of the element, while
you have to wait for child elements to arrive as separate events.)
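To make the SAX point concrete, here is a minimal sketch using Python's
standard xml.sax module. The "point", "x" and "y" names and the
Collector class are made up for illustration; they are not from any
real vocabulary.

```python
import xml.sax


class Collector(xml.sax.ContentHandler):
    """Gathers the same two values from attributes and from child elements."""

    def __init__(self):
        super().__init__()
        self.from_attrs = {}
        self.from_children = {}
        self._current = None   # name of the child element being read, if any
        self._buffer = []      # text chunks seen inside that child

    def startElement(self, name, attrs):
        if name == "point":
            # Attribute form: every value is available in this one callback.
            self.from_attrs = dict(attrs.items())
        elif name in ("x", "y"):
            # Element form: remember where we are and wait for the text.
            self._current = name
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it has to be accumulated.
        if self._current is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self._current:
            self.from_children[name] = "".join(self._buffer)
            self._current = None


def collect(xml_text):
    """Parse a small document and return the populated handler."""
    handler = Collector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler
```

Calling collect('<point x="1" y="2"/>') fills from_attrs in a single
callback, while collect('<point><x>1</x><y>2</y></point>') needs the
extra state tracking to fill from_children. That bookkeeping is the
convenience gap that pushes people toward attributes.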
The problem with using attributes for data is that there is no direct
way to associate metadata with attribute content. There is a very easy
direct way to associate metadata with element content - it's called
attributes. Adding additional information about attributes requires
either external sources (DTDs, schemas) or XPath (which I believe XForms
uses) or various ad-hocery. Direct serialization of any of this gets
ugly very quickly.
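A small sketch of the asymmetry, using Python's xml.etree.ElementTree.
The "price" vocabulary and the "price-" naming convention are
hypothetical, invented here only to show what the ad-hocery tends to
look like.

```python
import xml.etree.ElementTree as ET

# Data in element content: metadata simply rides along as attributes.
ELEMENT_FORM = '<price currency="USD" source="catalog">19.95</price>'

# Data in an attribute: there is nowhere to hang metadata, so an ad hoc
# naming convention (hypothetical "price-" prefix) has to carry it.
ATTRIBUTE_FORM = (
    '<item price="19.95" price-currency="USD" price-source="catalog"/>'
)


def describe_element(xml_text):
    """Return (value, metadata) for a value stored as element content."""
    el = ET.fromstring(xml_text)
    return el.text, dict(el.attrib)


def describe_attribute(xml_text, name):
    """Return (value, metadata) for a value stored in attribute `name`.

    The metadata has to be recovered by string-matching sibling attribute
    names, which only works if every consumer knows the convention.
    """
    el = ET.fromstring(xml_text)
    prefix = name + "-"
    meta = {
        key[len(prefix):]: value
        for key, value in el.attrib.items()
        if key.startswith(prefix)
    }
    return el.attrib[name], meta
```

Both functions recover the same value and the same metadata, but only
the first does so using the structure of the markup itself. The second
depends on a private agreement about attribute names.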
There are a lot of other symptoms of this problem. Namespace issues
around unprefixed attributes are one, though an unqualified attribute is
only a problem if you assume the attributes are their own atoms of
information, not merely additional description/refinement of the element
type. W3C XML Schema has made this situation a bit crazier with the
notion of unqualified elements, and SOAP has made common practice of it.
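The namespace behavior is easy to see with Python's
xml.etree.ElementTree, which reports names in {uri}local form. The
documents and the http://example.com/ns URI below are placeholders, not
taken from any real schema.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/ns"  # placeholder namespace URI

# The default namespace applies to the element but not to its unprefixed
# attribute; only the prefixed attribute is namespace-qualified.
ATTRS_DOC = (
    '<order xmlns="http://example.com/ns" xmlns:p="http://example.com/ns" '
    'status="open" p:priority="high"/>'
)

# An "unqualified" local element: the root is in the namespace, but the
# unprefixed child is in no namespace at all.
UNQUALIFIED_DOC = (
    '<p:order xmlns:p="http://example.com/ns"><item>widget</item></p:order>'
)


def names(xml_text):
    """Return the root tag, its attribute names, and its child tags."""
    root = ET.fromstring(xml_text)
    return root.tag, sorted(root.attrib), [child.tag for child in root]


attr_tag, attr_names, _ = names(ATTRS_DOC)
unq_tag, _, unq_children = names(UNQUALIFIED_DOC)
```

Here attr_names holds a bare "status" next to a qualified
"{http://example.com/ns}priority", and unq_children holds a bare "item"
under a qualified root. If the attribute is read as a refinement of its
element, the bare "status" is unremarkable. If it is read as a
free-standing atom of information, it looks like it has lost its
namespace.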
To some extent, the misuse arose because attributes had features
(defaulting, free order, some types, enumeration) that elements didn't
have. W3C XML Schema condones those practices for attributes and
extends the same features to elements. Maybe this is an improvement,
maybe it isn't.
In any case, it seems like many of the PSVI-representation difficulties
could be relieved by a best practice of using elements for the
information contained in a document and using attributes exclusively to
provide additional information about the element.
Separating markup from content - and putting attributes squarely in the
markup side - seems like one means of at least alleviating the headache.
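As a rough sketch of what that practice could look like when applied to
an existing document, the hypothetical helper below moves the
attributes that carry information into child elements and leaves the
descriptive ones in place. Deciding which attributes count as "data" is
a judgment call the caller has to make; the "book" example is invented.

```python
import xml.etree.ElementTree as ET


def lift_data_attributes(element, data_names):
    """Return a new element with the named attributes as child elements.

    Sketch only: it handles attribute-only elements and ignores any
    existing text or children of `element`.
    """
    result = ET.Element(element.tag)
    for name, value in element.attrib.items():
        if name in data_names:
            # Information moves into element content...
            child = ET.SubElement(result, name)
            child.text = value
        else:
            # ...while descriptive attributes stay on the markup side.
            result.set(name, value)
    return result


before = ET.fromstring('<book lang="en" title="Dune" author="Frank Herbert"/>')
after = lift_data_attributes(before, {"title", "author"})
```

After the call, "lang" remains an attribute of book, while title and
author are child elements that can take attributes of their own.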
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!