
UTF-8 BOM



The W3C XML Core WG is considering the question of whether a UTF-8
byte-order mark (BOM) is allowed at the start of an XML entity.  This
question was raised a few weeks ago in a thread on comp.text.xml
starting at article

  <180520011620538217%andreas.prilop@altavista.net>

We would like to determine how existing parsers handle the byte
sequence #xEF #xBB #xBF when it appears at the start of an XML
document or other entity.  Is it treated as a BOM (and not part
of the text of the entity) or as a zero-width non-breaking space
character?
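
To make the two readings concrete, here is a minimal Python sketch.
It is only an illustration, not the behaviour of any particular XML
parser: the helper name split_utf8_bom and the sample document are
hypothetical, and the only library facts relied on are that
codecs.BOM_UTF8 is the three bytes above and that Python's "utf-8"
codec leaves a leading U+FEFF in the decoded text.

```python
import codecs

# The three bytes in question: EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def split_utf8_bom(data: bytes):
    """Hypothetical helper: return (had_bom, remaining_bytes)."""
    if data.startswith(UTF8_BOM):
        return True, data[len(UTF8_BOM):]
    return False, data


# Hypothetical sample entity that starts with the UTF-8 BOM.
doc = UTF8_BOM + b"<doc/>"

had_bom, body = split_utf8_bom(doc)

# Reading 1: the bytes are an encoding signature and are not part
# of the text of the entity.
as_signature = body.decode("utf-8")

# Reading 2: the bytes are decoded like any others, so the text of
# the entity begins with the character U+FEFF.
as_character = doc.decode("utf-8")

print(had_bom)
print(repr(as_signature))
print(repr(as_character))
```

Under the first reading the entity text begins with "<"; under the
second it begins with U+FEFF, so a parser that does not treat the
bytes as a BOM would see a character before the start of the document.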

We have placed a number of test cases at

  http://www.cogsci.ed.ac.uk/~richard/bomtest/

and would be grateful for feedback on how parsers handle them.  Please
post results here in xml-dev to avoid unnecessary duplication.

We would also like to know of any editors (or similar tools) that
generate XML documents starting with a UTF-8 BOM.

-- Richard (on behalf of the XML Core WG)