Tim Bray wrote:
> Only because such a revision is not politically viable. The only
> advantage of the +names approach is that it doesn't touch XML.
But because this is a new encoding (and there have been no successful new
encodings for years, AFAIK), it will take at least three to five years to see
deployment as part of standard distributions such as Java, depending on the
attitude of the vendors; and vendors such as MS and Sun probably see it as a
waste of time that does not fit in with their Unicode strategy and tools.
So the only likely implementation route is for parser writers to add it (or
for implementers to add it to entity management) on a product-by-product
basis. But if you have a majority of parser vendors supporting it as an XML
add-on, you already have the quorum for getting an XML revision. So arguments
for it on the basis of realistic pragmatism don't make any sense to me.
Adding together the W3C HTML/XHTML people + the W3C Schema people + the
MathML people + the XSLT people (all of whom have languages that are being
held back by named character references being tied to DTDs) + the I18n WG
gives a group hardly without political clout in the W3C. This is a very
different issue from the Unicode upgrade issue of 1.1.
Furthermore, adopting XML's entity or NCR mechanism for non-XML uses without
also adopting a header mechanism to allow in-band signalling that the
encoding is in use is positively damaging, because it creates a dialect of
UTF-8 that can only be detected by someone who knows that the data may be
using this convention, checks to see whether it has things that look like
delimiters, and judges that they are being used as delimiters.
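To put that concretely, the only possible detection is a heuristic scan for
things that look like references, roughly along these lines (a sketch only;
the pattern and the function name are illustrative, not from any proposal):

import re

# Things that look like XML-style references: &name; &#decimal; &#xhex;
REF_LIKE = re.compile(r'&([A-Za-z][A-Za-z0-9.-]*|#[0-9]+|#x[0-9A-Fa-f]+);')

def might_use_reference_convention(text):
    # A guess, nothing more: "AT&amp;T;" in ordinary UTF-8 gives a false
    # positive, and a file that uses the convention but happens to contain
    # no references gives a false negative.
    return bool(REF_LIKE.search(text))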
At the moment, life is simple: you can look at the byte patterns in a file
and know that it is UTF-8; there is very little chance of a misdiagnosis,
because no other encoding really has the same modified Huffman signature.
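The check is purely structural; a rough sketch (illustrative only, and
skipping the finer rules about overlong forms and surrogates, which a real
decoder such as Python's bytes.decode('utf-8') enforces):

def looks_like_utf8(data):
    # Return True if the bytes follow UTF-8's lead/continuation structure.
    # Legacy single-byte and shift-based encodings almost never produce
    # long runs of bytes that satisfy this structure by accident, which is
    # why misdiagnosis is so rare.
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:                 # plain ASCII byte
            i += 1
            continue
        if 0xC2 <= b <= 0xDF:        # lead byte of a 2-byte character
            need = 1
        elif 0xE0 <= b <= 0xEF:      # lead byte of a 3-byte character
            need = 2
        elif 0xF0 <= b <= 0xF4:      # lead byte of a 4-byte character
            need = 3
        else:                        # 0x80-0xC1, 0xF5-0xFF never start a character
            return False
        if i + need >= n:            # truncated sequence
            return False
        if any(data[i + k] & 0xC0 != 0x80 for k in range(1, need + 1)):
            return False             # trailing bytes must be 10xxxxxx
        i += need + 1
    return True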
I don't know why on earth we would want to put ourselves in the same kind of
position as the Japanese have with text: they have a couple of alternate
mappings in some vendors' versions of various encodings, which adds
complication.[1] Why would we want to get into a similar situation?
Cheers
Rick Jelliffe
[1] http://www.w3.org/TR/2000/NOTE-japanese-xml-20000414/