- To: xml-dev@lists.xml.org
- Subject: Re: BOM requirement in UTF-16
- From: Richard Tobin <richard@cogsci.ed.ac.uk>
- Date: Sun, 16 Mar 2003 18:31:41 GMT
- Cc:
- In-reply-to: <UXVca.124745$qi4.62176@rwcrnsc54>
- Organization: HCRC, University of Edinburgh
- References: <Xns933E11D6E632Bgustafl@127.0.0.1> <qKzca.89518$3D1.3540@sccrnsc01> <b50b5q$1eku$1@pc-news.cogsci.ed.ac.uk>
>If a BOM appears, it determines the encoding.
According to which standard? Unicode says (section 13.6):
    Where the character set information is explicitly marked, such as in
    UTF-16BE or UTF-16LE, then all U+FEFF characters, even at the very
    beginning of text, are to be interpreted as zero width no-break
    spaces.
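
The distinction in that passage can be seen in a small sketch. It uses Python's built-in codecs purely as an illustration; they are one implementation that happens to behave this way, not a statement of what any XML parser does. The sample bytes and variable names are made up for the example.

```python
# Bytes for "<a/>" in big-endian UTF-16, preceded by the byte pair FE FF.
data = b"\xfe\xff" + "<a/>".encode("utf-16-be")

# Unmarked "utf-16": the codec treats the leading FE FF as a byte order
# mark, uses it to pick big-endian, and drops it from the result.
as_utf16 = data.decode("utf-16")

# Explicitly marked "utf-16-be": the byte order is already known, so the
# leading FE FF is decoded as an ordinary U+FEFF character and kept.
as_utf16be = data.decode("utf-16-be")

print(repr(as_utf16))
print(repr(as_utf16be))
```

With the explicit label, the decoded text begins with U+FEFF (zero width no-break space) rather than with `<`.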
>XML's whitespace vocabulary is very limited. Such a character is not
>allowed in an XML document, so the document would not be well-formed.
You're right, it would not be allowed at the start of a document
because it is not an XML whitespace character. (It is allowed in text
content, however.)
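
Both halves of that can be checked against the XML 1.0 productions. The sketch below is my transcription of the S (whitespace) and Char productions into Python; the function names are hypothetical and it is not taken from any parser.

```python
# XML 1.0 production S: whitespace is only space, tab, CR and LF.
XML_WHITESPACE = {"\x20", "\x09", "\x0d", "\x0a"}


def is_xml_whitespace(ch):
    """True if ch is one of the four XML 1.0 whitespace characters."""
    return ch in XML_WHITESPACE


def is_xml_char(ch):
    """True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


zwnbsp = "\ufeff"
print(is_xml_whitespace(zwnbsp))  # not whitespace, so not allowed before the root element
print(is_xml_char(zwnbsp))        # a legal Char, so allowed in text content
```

U+FEFF falls in the #xE000-#xFFFD range of Char, which is why it is legal in character data even though it cannot appear where only whitespace is permitted.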
-- Richard