RE: UTF-8 BOM
- From: Michael Brennan <Michael_Brennan@allegis.com>
- To: 'David Brownell' <david-b@pacbell.net>,Richard Tobin <richard@cogsci.ed.ac.uk>, xml-dev@lists.xml.org
- Date: Tue, 03 Jul 2001 13:08:37 -0700
> From: David Brownell [mailto:david-b@pacbell.net]
<snip/>
> I guess I'm thinking that a UTF-8 BOM would be a "new feature" that's
> an error today. Hence it fits with the other backwards-problematic stuff
> in Blueberry ... though it's a "new feature" that's encoding-specific.
Except that a UTF-8 BOM isn't really a new feature; it's just one that all
too many implementors overlook.
The XML 1.0 specification includes a non-normative appendix regarding
autodetection of character encodings. It quite explicitly mentions the UTF-8
BOM as one of the things a processor should look for
(http://www.w3.org/TR/2000/REC-xml-20001006#sec-guessing-no-ext-info).
Unlike the issue with Blueberry, this isn't something new that's been added
to Unicode since XML 1.0. It's just a failure of current implementations.
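
For illustration, here is a minimal sketch of the BOM check that the appendix describes, written in Python. The function name sniff_bom and its return shape are hypothetical, not taken from any parser. It covers only the byte order marks (not the "<?xml" byte patterns the appendix also lists, nor the unusual UCS-4 byte orders), and it assumes that falling back to UTF-8 is acceptable until an encoding declaration says otherwise.

```python
import codecs

# Checked in order. The 4-byte UCS-4 marks must come before the 2-byte
# UTF-16 marks, because FF FE 00 00 also begins with the UTF-16LE mark.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data: bytes) -> tuple[str, int]:
    """Return (encoding, bom_length) guessed from the leading bytes.

    Hypothetical helper: with no BOM it returns ("utf-8", 0) as an
    assumed default; a real processor would go on to inspect the
    encoding declaration.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return "utf-8", 0
```

A caller would skip the mark before decoding, e.g. enc, n = sniff_bom(raw) followed by raw[n:].decode(enc), so that a UTF-8 BOM never reaches the parser as stray content ahead of the XML declaration.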