   Re: XML and special Characters : unicode v3.0 ?

  • From: John Cowan <cowan@locke.ccil.org>
  • To: XML Dev <xml-dev@ic.ac.uk>
  • Date: Mon, 01 Mar 1999 14:09:46 -0500

Timothaeus Bray scripsit:

> [D]id you know the BOM was legal in UTF-8?

The BOM isn't just a BOM, it's also the ZWNBSP (zero-width
non-breaking space; no, I do not know how to pronounce that
acronym) character, and is interpreted as a BOM only at the
beginning of UCS-2 or UTF-16 documents.  Not to worry; the character is
as near to a no-op as Unicode allows for.
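
A small sketch of the point above, in Python (my choice of language, not anything from the thread). It assumes only the standard codecs "utf-8", "utf-8-sig" and "utf-16"; the variable names are illustrative. It shows that U+FEFF is an ordinary character in UTF-8, and that it is the decoder's choice whether a leading one is treated as a signature.

```python
# Sketch: U+FEFF is an ordinary character that doubles as a byte order mark.
ZWNBSP = "\ufeff"

# In UTF-8 the character is the three bytes EF BB BF.
utf8_bytes = ZWNBSP.encode("utf-8")

doc = utf8_bytes + b"<doc/>"

# The plain "utf-8" codec keeps the leading U+FEFF as a character.
kept = doc.decode("utf-8")

# The "utf-8-sig" codec treats a leading EF BB BF as a signature and drops it.
stripped = doc.decode("utf-8-sig")

# Only a leading occurrence is dropped; one in the middle survives as ZWNBSP.
middle = (b"a" + utf8_bytes + b"b").decode("utf-8-sig")

# The "utf-16" codec consumes a leading BOM to choose the byte order.
utf16 = b"\xff\xfe<\x00".decode("utf-16")
```

Here `kept` begins with U+FEFF while `stripped` and `utf16` do not, which is the "BOM only at the beginning" behaviour in miniature.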

> And of course by the fact that Unicode/10646 is a moving target.

Only sort of.  8859-1 is theoretically a moving target too, except
that all the slots are full; CP 1252 is a moving target that has
just moved (by adding the euro at 0x80).  In all these cases, characters 
can be added (in principle) but not moved or deleted (any more).
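
For illustration only, a short Python sketch of the 0x80 slot, assuming Python's bundled "cp1252" and "latin-1" codecs reflect the code pages as described above. The variable names are mine.

```python
# Sketch: the same byte 0x80 under CP 1252 and under ISO 8859-1.
byte = b"\x80"

# CP 1252 (as shipped with Python) maps 0x80 to the euro sign, U+20AC.
as_cp1252 = byte.decode("cp1252")

# ISO 8859-1 has no free slot there: 0x80 is the C1 control U+0080.
as_latin1 = byte.decode("latin-1")

# Going the other way, the euro sign cannot be encoded in 8859-1 at all.
try:
    "\u20ac".encode("latin-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
```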
 
> In practice,
> I've never actually seen anything outside of the BMP, but the
> experts agree they're showing up real soon now.

Not until Unicode 4.0, unless someone wants to use the private-use
planes 15 and 16.
 
> How to get it in? Something like &#x10333; I expect.

Exactly so.  Or the decimal NCR equivalent.  Two NCRs representing
the surrogates separately would be erroneous by both Unicode/10646
definitions and XML definitions.
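
A hedged sketch of the NCR forms discussed above, using Python's standard xml.etree.ElementTree (expat underneath). The element name "a" and the variable names are made up for the example; other conforming parsers should behave the same way, but only this one is exercised here.

```python
import xml.etree.ElementTree as ET

CODE_POINT = 0x10333  # the character from the example above

# Hexadecimal and decimal NCRs for the same code point.
hex_doc = "<a>&#x10333;</a>"
dec_doc = "<a>&#%d;</a>" % CODE_POINT

hex_text = ET.fromstring(hex_doc).text
dec_text = ET.fromstring(dec_doc).text

# The UTF-16 surrogate pair for the same code point.
offset = CODE_POINT - 0x10000
high = 0xD800 + (offset >> 10)
low = 0xDC00 + (offset & 0x3FF)

# Two NCRs, one per surrogate: not a legal way to write the character.
surrogate_doc = "<a>&#x%X;&#x%X;</a>" % (high, low)

try:
    ET.fromstring(surrogate_doc)
    surrogates_rejected = False
except ET.ParseError:
    # Surrogate code points are outside XML's Char production.
    surrogates_rejected = True
```

Both single-NCR documents yield the one character U+10333, while the surrogate form is refused by the parser rather than quietly combined.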

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 


Copyright 2001 XML.org. This site is hosted by OASIS