- From: Chris Lilley <chris@w3.org>
- To: Tim Bray <tbray@textuality.com>
- Date: Mon, 05 Apr 1999 02:39:05 +0200
Tim Bray wrote:
>
> At 03:24 PM 4/4/99 +0200, Chris Lilley wrote:
> > But it need not autodetect, in fact, autodetection
> >is a bad thing. I was not suggesting autodetection, quite the converse.
> >
> >Rather, in the absence of an explicit MIME charset parameter, it should
> >use the encoding declaration. If there is none, then the document is in
> >UTF-8 or UTF-16 and the XML spec tells you how to determine which. [1].
>
> Just a terminology thing; I think when we say autodetection, we are
> talking about using the combination of the first few bytes and the
> encoding declaration, as described in app. F of the XML spec.
Thanks for pointing out this source of terminological confusion. No,
that is not what I meant.
I meant autodetection in the sense of reading a whole bunch of the
text and making assorted guesses based on frequency analysis and the
like. In other words, automatic detection based on unlabelled content. I
believe that this is a bad thing, because there is always the
possibility (quite high) of getting it wrong.
The encoding declaration, on the other hand, is not autodetection in
that sense; it is a label. A very small amount of autodetection has to
be done in order to be sure that the label can be read, that is all
(i.e., is this UTF-16, or is this an encoding where ASCII is represented
as ASCII).
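To make the distinction concrete, here is a rough sketch in Python of that "very small amount" of detection followed by reading the label. It loosely follows the byte patterns in app. F of the XML spec, but it is only an illustration: the names `detect_family` and `read_label` are made up for this example, and the UCS-4 and EBCDIC cases from app. F are left out.

```python
import re

# Hypothetical helper: decide only how the label itself is encoded.
# Covers UTF-16 (with or without a byte order mark) versus encodings
# where ASCII is represented as ASCII. UCS-4 and EBCDIC are omitted.
def detect_family(data: bytes) -> str:
    if data.startswith(b"\xfe\xff"):      # byte order mark, big-endian
        return "utf-16-be"
    if data.startswith(b"\xff\xfe"):      # byte order mark, little-endian
        return "utf-16-le"
    if data.startswith(b"\x00<\x00?"):    # "<?" as UTF-16 big-endian, no mark
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):    # "<?" as UTF-16 little-endian, no mark
        return "utf-16-le"
    return "ascii-compatible"

# Matches an XML declaration at the very start and captures the label.
_DECL = re.compile(
    r'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def read_label(data: bytes):
    """Return (family, declared encoding or None). No guessing from content."""
    family = detect_family(data)
    if family == "ascii-compatible":
        # Non-ASCII bytes are replaced; the declaration itself is pure ASCII.
        head = data[:200].decode("ascii", errors="replace")
    else:
        start = 2 if data[:2] in (b"\xfe\xff", b"\xff\xfe") else 0
        head = data[start:start + 400].decode(family, errors="replace")
    match = _DECL.match(head)
    return family, (match.group(1) if match else None)
```

Note that nothing here looks past the declaration or does any frequency analysis; it inspects a handful of leading bytes and then reads what the document says about itself.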
> I think
> (and I thought Chris thought) that this is a *good* and necessary thing,
> if only because lots of XML documents are read in other ways than via
> http, and because lots of times the web server simply doesn't/can't
> know about the internal arrangements of some XML resource.
Yes, this (reading the encoding declaration) is a *good* thing, with
the proviso that I am talking about the encoding declaration. I don't
consider this autodetection, in the same sense that reading <?xml
version="1.0"?> is not autodetection of the version.
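The order of precedence from my earlier message, quoted near the top (explicit MIME charset, then the encoding declaration, then UTF-8 or UTF-16), could be sketched in Python roughly as follows. The function name `choose_encoding` and its parameters are invented for this illustration, and the last step assumes the spec's rule that a UTF-16 entity begins with a byte order mark.

```python
from typing import Optional

def choose_encoding(mime_charset: Optional[str],
                    declared_encoding: Optional[str],
                    data: bytes) -> str:
    """Pick an encoding from labels only, never from content statistics."""
    # 1. An explicit charset parameter on the MIME type wins.
    if mime_charset:
        return mime_charset
    # 2. Otherwise use the encoding declaration, if the document has one.
    if declared_encoding:
        return declared_encoding
    # 3. No label at all: the document is UTF-8 or UTF-16, and a
    #    byte order mark at the start tells the two apart.
    if data[:2] in (b"\xfe\xff", b"\xff\xfe"):
        return "UTF-16"
    return "UTF-8"
```

For a file read from disk rather than over http, `mime_charset` would simply be `None` and the declaration does the work.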
--
Chris
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)