   Re: [SML] Re: SML ?!?

  • From: Robin Cover <robin@isogen.com>
  • To: James Tauber <jtauber@jtauber.com>
  • Date: Fri, 26 Nov 1999 11:35:10 -0600 (CST)

On Fri, 26 Nov 1999, James Tauber wrote:

> > See http://www.xml.com/pub/1999/11/sml/index.html for an article
> > describing the SML idea

> I noted with interest (and disagreement) the technical arguments against
> attributes.

I had a similar feeling about the 'arguments against attributes' - I think
it drew a conclusion without having engaged the matter adequately.

1. Yeah, the SGML/XML notion of an "attribute" is badly broken
2. A markup language used in part as a data modelling language certainly
   should be able to distinguish notationally and conceptually between
   an object ('element') and an attribute.
3. Just throwing out the SGML/XML attribute isn't the right solution,
   in my judgment.

> Why? Because in *markup* there is a distinction between content and markup.
> The character data content of an element is content. The value of an
> attribute is markup. Attributes, like other markup, provide information in
> addition to the textual content.
> For example, a person thinking how to express the fact that Max is a dog
> that is black might use:
> <dog>
>     <name>Max</name>
>     <colour>black</colour>
> </dog>
> However, a person wanting to markup the text "Max" indicating that he is a
> black dog couldn't do the above. They might, instead, use:
> <dog colour="black">Max</dog>
> So if XML is being used for marking up existing textual content, attributes
> have a definite place.
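
To make the two encodings above concrete, here is a minimal sketch in
Python (standard library only) of what a processor recovers when it asks
for character data alone. The helper name character_data is mine, purely
for illustration; it is not part of any SGML/XML specification or of the
SML proposal.

```python
import xml.etree.ElementTree as ET

# The two encodings from the quoted message.
element_style = "<dog><name>Max</name><colour>black</colour></dog>"
attribute_style = '<dog colour="black">Max</dog>'

def character_data(xml_text):
    """Concatenate all character data, ignoring tags and attribute values."""
    return "".join(ET.fromstring(xml_text).itertext())

print(character_data(element_style))    # Maxblack
print(character_data(attribute_style))  # Max
```

Under the attribute encoding the marked-up text "Max" survives intact;
under the element encoding the colour leaks into the character data.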

Even this example, while instructive and illustrative, does not begin to
address the deeper issues.  For example, whatever the SGML/XML standards
may say formally about "content" versus "non-content", and irrespective
of whether these definitions accord well with users' notions of "content"
in different application domains (a HUGE usability concern), we have in
TEI, for example, a markup strategy for correcting a known or suspected
error in some text, with sic/corr tags:

I write:  As Job Bosak astutely observed
You encode:  <p>As <corr sic="Job">Jon</corr> Bosak astutely observed

(or something like this -- see TEI's mirror tags)
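
A rough sketch of why this is awkward, again in Python with the standard
library only. The function name reading and its prefer_original flag are
hypothetical, and I have added a closing </p> so the fragment parses;
neither comes from TEI. Depending on what the application wants, the
"content" is either the element's character data or the attribute value.

```python
import xml.etree.ElementTree as ET

# The encoded sentence from above, closed with </p> so it is well-formed.
encoded = '<p>As <corr sic="Job">Jon</corr> Bosak astutely observed</p>'

def reading(xml_text, prefer_original):
    """Rebuild the sentence, taking either the correction or the original."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "corr" and prefer_original:
            # The text as actually written lives in an attribute value.
            parts.append(child.get("sic", ""))
        else:
            # The corrected text lives in the element's character data.
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)

print(reading(encoded, prefer_original=False))  # As Jon Bosak astutely observed
print(reading(encoded, prefer_original=True))   # As Job Bosak astutely observed
```

An editor interested in what the author actually wrote would treat the
sic value as the text and the element content as annotation; a reader
wanting a clean text would do the opposite.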

In practice, trying to declare in advance what may be reckoned as
"content" is extremely difficult (attribute value literals in some
contexts but not others; most all PCDATA but not all, etc.)

The definitions I've seen break very quickly when one considers
the range of applications and users WRT SGML/XML: it's not "what
is seen vs. what is not seen"; nor "what's really in the 'text'
vs. what is metadata".  All such distinctions I've seen are
broken, and non-repairable.  Especially if one is concerned to
uphold the notion of a descriptive markup language as having
no pre-defined application level processing semantics.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

