UTF-8 BOM
- From: Richard Tobin <richard@cogsci.ed.ac.uk>
- To: xml-dev@lists.xml.org
- Date: Thu, 14 Jun 2001 12:23:48 +0100 (BST)
The W3C XML Core WG is considering the question of whether a UTF-8
byte-order mark (BOM) is allowed at the start of an XML entity. This
question was raised a few weeks ago in a thread on comp.text.xml
starting at article
<180520011620538217%andreas.prilop@altavista.net>
We would like to determine how existing parsers handle the byte
sequence #xEF #xBB #xBF when it appears at the start of an XML
document or other entity. Is it treated as a BOM (and so not part
of the text of the entity) or as a zero-width non-breaking space
character (U+FEFF)?
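
To make the two readings concrete, here is a minimal Python sketch. It
is only an illustration, not how any particular XML parser behaves; the
helper name split_bom and the sample document are hypothetical.

```python
import codecs

# The three bytes in question: #xEF #xBB #xBF.
UTF8_BOM = codecs.BOM_UTF8


def split_bom(data: bytes):
    """Reading 1: a leading EF BB BF is a BOM and not part of the text.

    Returns (had_bom, text). The name is hypothetical, for illustration.
    """
    if data.startswith(UTF8_BOM):
        return True, data[len(UTF8_BOM):].decode("utf-8")
    return False, data.decode("utf-8")


# Hypothetical entity that starts with the three bytes.
sample = b"\xef\xbb\xbf<doc/>"

# Reading 1: the bytes are stripped before the text is seen.
had_bom, text_without_bom = split_bom(sample)

# Reading 2: plain UTF-8 decoding keeps the bytes as the character
# U+FEFF (zero-width non-breaking space) at the start of the text.
text_with_char = sample.decode("utf-8")

print(had_bom, repr(text_without_bom), repr(text_with_char))
```

Under the second reading the entity's text begins with U+FEFF rather
than with "<", which is why the choice matters to a parser.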
We have placed a number of test cases at
http://www.cogsci.ed.ac.uk/~richard/bomtest/
and would be grateful for feedback on how parsers handle them. Please
post results here in xml-dev to avoid unnecessary duplication.
We would also like to know of any editors (or similar tools) that
generate XML documents starting with a UTF-8 BOM.
-- Richard (on behalf of the XML Core WG)