   W3C XML Core WG requests comment: control characters in XML 1.1

This is a request for comment from this mailing list (or anyone else)
on a proposal by Shigemichi Yazawa for a standard representation for
the Unicode control characters that are not legal in XML 1.0.  See

In essence, this provides an element "<xml:orphanedChar value="#x0001">"
which can be used *by convention* in place of an actual (and illegal) #x1
character.  The Infoset would view this as an element, not a character; it
would not be usable in attribute values; it is not fully general-purpose.
It would also require explicit declaration in schema languages, unless
they were modified to ignore it; even then, an element with an XSD
datatype would not be able to use this feature.
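A minimal sketch of how the element convention might be applied when
serializing and reading character content.  The function names, the
self-closing form of the element, the four-digit "#xNNNN" value format,
and the choice of which characters count as illegal (the C0 controls
other than tab, LF and CR) are assumptions made for illustration; they
are not taken from the proposal text.

```python
import re
from xml.sax.saxutils import escape, unescape

# Assumption: "illegal" means the C0 controls XML 1.0 forbids
# (everything below #x20 except tab, LF and CR).
ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")
ORPHAN = re.compile(r'<xml:orphanedChar value="#x([0-9A-Fa-f]+)"\s*/>')

def encode_orphans(text):
    """Escape markup, then swap each illegal control for an element."""
    return ILLEGAL.sub(
        lambda m: '<xml:orphanedChar value="#x%04X"/>' % ord(m.group()),
        escape(text),
    )

def decode_orphans(content):
    """Reverse encode_orphans on serialized element content."""
    restored = ORPHAN.sub(lambda m: chr(int(m.group(1), 16)), content)
    return unescape(restored)
```

Because the replacement is an element, this only works for element
content; the same string could not be written into an attribute value.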

An alternative proposal is to use a processing instruction such as
"<?xmlchar #x1?>", which would be invisible to schemas.  A little *too*
invisible, in some cases: it would be legal in simple datatypes, but a
string-typed element containing 3 characters could not contain 3 control
characters and still be schema valid.
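One reading of the "too invisible" problem can be sketched as follows.
Python's ElementTree drops processing instructions by default, so it
stands in here (as an assumption) for a schema processor that ignores
them; visible_length is a hypothetical helper, not part of either
proposal.

```python
import xml.etree.ElementTree as ET

def visible_length(xml_text):
    """Count the characters a PI-blind consumer sees in the element."""
    root = ET.fromstring(xml_text)
    return len("".join(root.itertext()))

# Three control characters written as processing instructions.
three_controls = "<s><?xmlchar #x1?><?xmlchar #x2?><?xmlchar #x3?></s>"
```

Here visible_length(three_controls) is 0, while visible_length("<s>abc</s>")
is 3, so a length-3 constraint on the string type would reject the element
that carries three control characters.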

The idea is certainly a hack.  However, it may be good enough for
people who wish to incorporate arbitrary Unicode strings into XML
character content: an 80/20 solution.  Whether it *does* meet the
80/20 requirement is what we chiefly want to know.  Please make sure
that all comments are cc-ed to

John Cowan <jcowan@reutershealth.com>     http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_



Copyright 2001 XML.org. This site is hosted by OASIS