OASIS Mailing List Archives




Michael Brennan wrote:

> The XML 1.0 specification includes a non-normative appendix regarding
> autodetection of character encodings. It quite explicitly mentions the
> BOM as one of the things a processor should look for
> (http://www.w3.org/TR/2000/REC-xml-20001006#sec-guessing-no-ext-info).
> Unlike the issue with Blueberry, this isn't something new that's been
> added to Unicode since XML 1.0. It's just a failure of current
> implementations.

I totally agree with you Michael.

I don't know which implementations haven't taken account of the (very
useful) Appendix F, but the ElCel Technology C++ Toolkit (from which the XML
Validator is built) certainly recognises and accepts a UTF-8 BOM as the
Appendix suggests it should.
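
For illustration only, the BOM check that Appendix F describes might be sketched roughly as follows. This is Python, not code from the ElCel toolkit; the function name `sniff_bom` and the encoding labels it returns are assumptions made for the example. It covers only the common BOM signatures and not the rest of the Appendix F heuristics (such as inspecting the bytes of the XML declaration).

```python
import codecs

# BOM signatures, with the 4-byte forms listed before the 2-byte forms so
# that a UTF-32 LE BOM (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
# The labels on the right are illustrative names chosen for this sketch.
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "UTF-32LE"),  # FF FE 00 00
    (codecs.BOM_UTF8, "UTF-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "UTF-16BE"),  # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return (encoding_label, remaining_bytes) for the leading BOM.

    If no BOM is present, the label is None and the data is returned
    unchanged, so the caller can fall back to other detection.
    """
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label, data[len(bom):]
    return None, data
```

A processor that behaves this way would accept a document beginning with EF BB BF, treat it as UTF-8, and parse from the byte that follows the BOM.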

Perhaps the current confusion arises from the non-normative nature of the
Appendix.  However, I take it as a clear indication that a UTF-8 BOM is


Rob Lugt
ElCel Technology