The IPTC NewsML2 Working Group
(www.iptc.org/dev) has tried to avoid NMTOKENS in drafting its specification.
One reason is that a list of
space-separated values does not really conform to “the XML
way” (a priori it is more sensible
to have a sub-element for each enumerated value). Another is that we do not know of
an easy way to process such lists using XSLT, XPath or DOM, for example. The problem
seems rooted in XML parsers themselves, which usually treat NMTOKENS values as single strings
(and not as collections of tokens).
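To make the point concrete, here is a minimal sketch using Python's standard DOM implementation. The sample document and the helper name `nmtokens` are assumptions made for illustration only; they are not taken from the NewsML2 drafts. The sketch shows that the DOM hands back the attribute as one string, and that the application has to split it into tokens itself.

```python
from xml.dom.minidom import parseString

# Hypothetical sample: a label carrying a space-separated media attribute.
SAMPLE = '<label media="screen print handheld">Example</label>'


def nmtokens(element, attr_name):
    """Return a space-separated attribute value as a list of tokens.

    Hypothetical helper: getAttribute() returns "" for a missing
    attribute, and "".split() yields an empty list.
    """
    return element.getAttribute(attr_name).split()


doc = parseString(SAMPLE)
label = doc.documentElement

raw = label.getAttribute("media")   # the DOM returns a single string
tokens = nmtokens(label, "media")   # the application does the tokenising

print(repr(raw))
print(tokens)
```

Calling `split()` with no argument splits on any run of whitespace, so it also copes with values that have not been whitespace-normalised by a validating parser.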
But in fact HTML, which is so broadly used,
defines many NMTOKENS (e.g. class), and HTML processors are everywhere. So it
cannot be that difficult to process them.
We have already adopted in NewsML2
one HTML (and CSS) attribute, media, which specifies the target media
type(s) of a label, and is defined as NMTOKENS in
We are more and more ambivalent
about NMTOKENS: using attributes to convey collections of values could be
interesting in many use cases, making the NewsML2 syntax more
For example, we could imagine
replacing something like:
type="type:organisation" code="nasdaq:MFRT" altcodes="isin:23234
We need the help of the community
to get clear answers to questions such as:
What are the problems
for developers of XML processors when reading NMTOKENS and when editing
How should implementers
process NMTOKENS?
If they use DOM?
If they use XSLT?
If they use
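As one possible answer to the implementer questions above, the following sketch selects elements whose space-separated attribute contains a given token. The sample document and the helper name `has_token` are hypothetical. The key point is that the test must compare whole tokens: a plain substring test would wrongly match "printer" when looking for "print".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three labels, only the first targets "print".
SAMPLE = """
<items>
  <label media="screen print">A</label>
  <label media="printer">B</label>
  <label>C</label>
</items>
"""


def has_token(element, attr_name, token):
    """True if `token` is one of the whitespace-separated values of the attribute.

    Hypothetical helper: a missing attribute is treated as an empty list.
    """
    return token in element.get(attr_name, "").split()


root = ET.fromstring(SAMPLE)
matches = [el.text for el in root.iter("label") if has_token(el, "media", "print")]
print(matches)
```

For XSLT users, a commonly used XPath 1.0 equivalent is `contains(concat(' ', normalize-space(@media), ' '), ' print ')`, which pads the value with spaces so that only whole tokens match. XPath 2.0 offers `tokenize(@media, '\s+')` for the same purpose.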
IPTC, NewsML2 Architecture WP
“One other problem with the above syntax is the use of NMTOKENS for the
enumeration. An earlier objection on this list pointed out that most parsers
today return a string of Nmtoken's, rather than an array, which forces the
application to parse the string themselves.” (rbourret)