
   Re: XML and special Characters : unicode v3.0 ?

  • From: Tim Bray <tbray@textuality.com>
  • To: John Cowan <cowan@locke.ccil.org>, XML Dev <xml-dev@ic.ac.uk>
  • Date: Mon, 01 Mar 1999 11:25:32 -0800

At 02:09 PM 3/1/99 -0500, John Cowan wrote:
>Timothaeus Bray scripsit:
>
>> [D]id you know the BOM was legal in UTF-8?
>
>The BOM isn't just a BOM, it's also the ZWNBSP (zero-width
>non-breaking space; no, I do not know how to pronounce that
>acronym) character, and is interpreted as a BOM only at the
>beginning of UCS-2 or UTF-16 documents.  Not to worry; the character is
>as near to a no-op as Unicode allows for.

I think there is reason for worry.  In a UTF-16 document, you can
have a BOM and then the <?xml version=?>, and that PI will still
be recognized as the XML declaration.  The spec is, I think,
pretty clear that a ZWNBSP or any other *data* character before
the XML declaration is verboten.  So... it seems that in UTF-8,
a ZWNBSP as first character in the file isn't a data character.
Blecch.
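
To make the case concrete, here is a minimal Python sketch of how a
processor might treat a leading U+FEFF as an encoding signature rather
than as data, in UTF-8 as well as UTF-16, before looking for the XML
declaration. The function names split_bom and starts_with_xml_decl are
hypothetical, the fallback to UTF-8 when no BOM is present is an
assumption, and UTF-32 is deliberately ignored; this is not taken from
any particular parser.

```python
import codecs

# U+FEFF as it appears on the wire in each encoding this sketch handles.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
)


def split_bom(data: bytes):
    """Return (encoding, payload); encoding is None when no BOM is found."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # Only the first U+FEFF is consumed as a signature.
            return encoding, data[len(bom):]
    return None, data


def starts_with_xml_decl(data: bytes) -> bool:
    """True if the XML declaration is the first thing after an optional BOM."""
    encoding, payload = split_bom(data)
    # Assumption: with no BOM, treat the bytes as UTF-8.
    text = payload.decode(encoding or "utf-8", errors="replace")
    return text.startswith("<?xml")
```

Under this reading, a second U+FEFF following the signature is an
ordinary ZWNBSP data character, so a document beginning with two of them
would not be seen as starting with the XML declaration.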

 -Tim






 
