OASIS Mailing List Archives

   Re: BOM requirement in UTF-16

  • To: xml-dev@lists.xml.org
  • Subject: Re: BOM requirement in UTF-16
  • From: Richard Tobin <richard@cogsci.ed.ac.uk>
  • Date: Sun, 16 Mar 2003 18:31:41 GMT
  • Cc:
  • In-reply-to: <UXVca.124745$qi4.62176@rwcrnsc54>
  • Organization: HCRC, University of Edinburgh
  • References: <Xns933E11D6E632Bgustafl@> <qKzca.89518$3D1.3540@sccrnsc01> <b50b5q$1eku$1@pc-news.cogsci.ed.ac.uk>

>If a BOM appears, it determines the encoding.

According to which standard?  Unicode says (section 13.6):

  Where the character set information is explicitly marked, such as in
  UTF-16BE or UTF-16LE, then all U+FEFF characters, even at the very
  beginning of text, are to be interpreted as zero width no-break

>XML's whitespace vocabulary is very limited.  Such a character is not
>allowed in an XML document, so the document would not be well-formed.

You're right, it would not be allowed at the start of a document
because it is not an XML whitespace character.  (It is allowed in text
content however.)
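
A rough check of both halves of that statement, sketched in Python. The whitespace set and character ranges are transcribed from the XML 1.0 productions S and Char. The helper name is_xml_char is hypothetical, and xml.etree.ElementTree is used only to show U+FEFF surviving as text content.

```python
import xml.etree.ElementTree as ET

# XML 1.0 production S: whitespace is only space, tab, CR and LF.
XML_WHITESPACE = frozenset("\x20\x09\x0d\x0a")

def is_xml_char(ch):
    """True if ch matches the XML 1.0 Char production (hypothetical helper)."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

ZWNBSP = "\ufeff"

# Not whitespace, so it cannot precede the XML declaration or root element
# unless it is being consumed as a byte order mark.
print(ZWNBSP in XML_WHITESPACE)

# But it is a legal XML character, so it may appear in text content.
print(is_xml_char(ZWNBSP))

# Inside an element it is parsed as ordinary character data.
root = ET.fromstring("<a>" + ZWNBSP + "</a>")
print(repr(root.text))
```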

-- Richard

