   RE: [xml-dev] NMTOKENS: good or evil?


I think the tradition of using space-separated lists of tokens arises because there's no other way of structuring an attribute, and in some vocabularies there are strong reasons for using an attribute. Also, some people just can't stomach the verbosity of wrapping element tags around every value in a long list. And it sometimes arises when you have an existing vocabulary with a single-valued attribute and you want to extend it to make the attribute multi-valued in a backwards-compatible way.
So I share your ambivalence. Like many aspects of XML document design, there's no right and wrong answer.
XSLT 2.0 and XPath 2.0 have no problem handling list-valued elements and attributes, especially if you have a schema that describes them as being list-valued (for example, as xs:NMTOKENS). I agree that it's difficult with 1.0.
Michael Kay

From: Laurent Le Meur [mailto:laurent.lemeur@afp.com]
Sent: 15 February 2006 18:15
To: newsml-2@yahoogroups.com; xml-dev@lists.xml.org
Subject: [xml-dev] NMTOKENS: good or evil?

The IPTC NewsML2 Working Group (www.iptc.org/dev) has tried to avoid NMTOKENS in drafting its specification.


The reason is that using a list of space-separated values is not really conformant to “the XML way” (it is a priori sensible to have a sub-element for each enumerated value), and that we don’t know of an easy way to process them using XSLT, XPath or DOM, for example. The problem seems rooted in XML parsers themselves, which usually treat them as strings (and not as collections of tokens) [1].


But in fact HTML, so broadly used, defines many NMTOKENS attributes (e.g. class). HTML processors are everywhere, so it cannot be that difficult to process them.


We have already adopted in NewsML2 one HTML (and CSS) attribute, media, which specifies the target media type(s) of a label and is defined as NMTOKENS in HTML.


Example of use:

<title media="handheld screen">Hello world</title>


We are more and more ambivalent about NMTOKENS: using attributes to convey collections of values could be interesting in many use cases, making the NewsML2 syntax more compact.


For example, we could imagine replacing something like:

<subject type="type:organisation" code="nasdaq:MFRT">
            <sameAs code="isin:23234"/>
            <sameAs code="sicovam:23234"/>
            <title>The Company</title>
</subject>

with:

<subject type="type:organisation" code="nasdaq:MFRT" altcodes="isin:23234 sicovam:ERT-345">The Company</subject>


We need the help of the community to get clear ideas about questions like:

- What are the problems for developers of XML processors when reading NMTOKENS and when editing them?

- How shall implementers process NMTOKENS?

    o If they use DOM?

    o If they use XSLT?

    o If they use XPath?


Laurent Le Meur

IPTC, NewsML2 Architecture WP chair




[1] http://lists.xml.org/archives/xml-dev/199806/msg00509.html  
“One other problem with the above syntax is the use of NMTOKENS for the 
enumeration.  An earlier objection on this list pointed out that most parsers 
today return a string of Nmtoken's, rather than an array, which forces the 
application to parse the string themselves.” (rbourret)
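
As a rough illustration of what "parsing the string themselves" amounts to with a DOM API, here is a minimal sketch. It assumes Python's standard xml.dom.minidom and reuses the title/media example from the message above; the nmtokens helper is a hypothetical name introduced only for this illustration, not part of any DOM specification.

```python
from xml.dom.minidom import parseString


def nmtokens(element, attr_name):
    """Return the tokens of a space-separated attribute as a list.

    Hypothetical helper: the DOM itself only hands back a string.
    """
    # getAttribute returns "" when the attribute is absent; split() with no
    # argument splits on any run of whitespace and drops empty strings,
    # so a missing or empty attribute yields an empty list.
    return element.getAttribute(attr_name).split()


doc = parseString('<title media="handheld screen">Hello world</title>')
title = doc.documentElement

raw = title.getAttribute("media")   # the DOM returns one string
tokens = nmtokens(title, "media")   # the application does the tokenizing

print(repr(raw))
print(tokens)
```

Note that Python's split() accepts a slightly wider set of whitespace characters than XML does; after the parser's attribute-value normalization this should make no practical difference, but a stricter implementation could split on the space character only.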


