Tim Bray scripsit:
> > Where the character set information is explicitly marked, such as in
> > UTF-16BE or UTF-16LE, then all U+FEFF characters, even at the very
> > beginning of text, are to be interpreted as zero width no-break
> > spaces.
> It's worse than that. Last time I checked, the media-type RFC for
> UTF16-LE and -BE *forbade* the use of a BOM entirely.
You're saying the same thing in different words. UTF-16BE and UTF-16LE
forbid the use of a BOM, and therefore any U+FEFF that appears is necessarily
a ZWNBSP. (This use of U+FEFF is deprecated as of Unicode 3.2, however.)
Naturally, XML files can't begin with a ZWNBSP.
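
As a rough illustration of the distinction, here is a small Python sketch. It assumes Python's standard codecs, which behave this way for the "utf-16" and "utf-16-be" labels; the sample document `<a/>` is made up for the example. The same bytes are decoded once under the unmarked label, where the leading FE FF is taken as a BOM, and once under the explicit big-endian label, where it survives as a U+FEFF character.

```python
import codecs

# Hypothetical sample: FE FF followed by "<a/>" in big-endian UTF-16.
data = codecs.BOM_UTF16_BE + "<a/>".encode("utf-16-be")

# Label "utf-16": the leading FE FF is read as a byte order mark and dropped.
as_utf16 = data.decode("utf-16")

# Label "utf-16-be": byte order is already stated, so nothing is dropped
# and the text begins with U+FEFF (ZWNBSP).
as_utf16be = data.decode("utf-16-be")

print(repr(as_utf16))
print(repr(as_utf16be))
```

Under the explicit label the decoded text starts with U+FEFF rather than `<`, which is the case an XML processor has to reject.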
John Cowan http://www.ccil.org/~cowan email@example.com
To say that Bilbo's breath was taken away is no description at all. There
are no words left to express his staggerment, since Men changed the language
that they learned of elves in the days when all the world was wonderful.