
RE: UTF-8 BOM

From: Michael Brennan <Michael_Brennan@allegis.com>
To: 'David Brownell' <david-b@pacbell.net>,Richard Tobin <richard@cogsci.ed.ac.uk>, xml-dev@lists.xml.org
Date: Tue, 03 Jul 2001 13:08:37 -0700

> From: David Brownell [mailto:david-b@pacbell.net]

<snip/>

> I guess I'm thinking that a UTF-8 BOM would be a "new feature" that's
> an error today.  Hence it fits with the other backwards-problematic
> stuff in Blueberry ... though it's a "new feature" that's
> encoding-specific.

Except that a UTF-8 BOM isn't really a new feature; it's just one that all
too many implementors overlook.

The XML 1.0 specification includes a non-normative appendix regarding
autodetection of character encodings. It quite explicitly mentions the UTF-8
BOM as one of the things a processor should look for
(http://www.w3.org/TR/2000/REC-xml-20001006#sec-guessing-no-ext-info).
Unlike the issue with Blueberry, this isn't something new that's been added
to Unicode since XML 1.0. It's just a failure of current implementations.
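
For illustration only, here is a rough Python sketch of the BOM part of that autodetection. It is not taken from any particular parser: the names sniff_bom and decode_entity are made up for this example, the fallback to UTF-8 when no BOM is present is a simplifying assumption, and a real processor would also have to handle the no-BOM byte patterns and the encoding declaration.

```python
import codecs

# BOM signatures, with the 4-byte forms listed first so that a UTF-32 LE
# BOM (FF FE 00 00) is not mistaken for a UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),        # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_entity(data: bytes) -> str:
    """Decode an entity, skipping the BOM so it never reaches the parser.

    Assumption for this sketch: with no BOM we simply try UTF-8 and
    ignore any encoding declaration.
    """
    name, skip = sniff_bom(data)
    return data[skip:].decode(name or "utf-8")
```

With something like this in front of the parser, a document that starts with EF BB BF is decoded as UTF-8 and the BOM is dropped, rather than being reported as an illegal character before the XML declaration.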

Follow-Ups:
- Re: UTF-8 BOM
  - From: Rob Lugt <roblugt@elcel.com>
