- From: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>
- To: xml-dev@ic.ac.uk
- Date: Mon, 05 Apr 1999 11:20:27 +0900
Chris Lilley wrote:
> The vast majority of content authors have *no control whatsoever* on
> server configuration. This isn't 1993; assuming that the person who
> wrote the content is also the person who administers the server is
> totally unwarranted.
To overcome this problem, Uchida-san is proposing a convention for WWW server
configurations. His proposal is already used by some ISPs in Japan. It is
available at:
http://www.asahi-net.or.jp/~sd5a-ucd/docs/suffix_guideline_981106.txt
It is hoped that this note will eventually become a W3C technical note and that
the I18N WG will encourage people to use it.
Chris Lilley wrote:
>
> But not necessarily everyone's favourite. It is a good choice for
> Japanese, because Kanji use fewer bytes per character in UTF-16 than in
> UTF-8.
>
> > (In the case that the charset is broken, autodetection of
> > UTF-16 is very easy.
>
> But autodetection should not be required; users can label their
> documents correctly.
To me, the biggest advantage of UTF-16 is that a UTF-16 XML document can be
parsed successfully only as UTF-16. Even if the charset parameter is incorrect,
such a document is never silently misparsed under another encoding; the parse
fails, and error recovery is very reliable.
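
To illustrate why a UTF-16 document is hard to mistake for anything else, here
is a minimal sketch of byte-level detection in Python. It assumes the usual
signatures (a byte order mark, or "<?" encoded in two bytes per character); the
function name sniff_utf16 is hypothetical, and the sketch looks only at the
first few bytes of the entity.

```python
import codecs

def sniff_utf16(data: bytes):
    """Guess 'utf-16-be' or 'utf-16-le' from the leading bytes, else None."""
    # A 4-byte UCS-4 little-endian mark begins with the UTF-16 LE mark,
    # so rule it out first; this sketch does not handle UCS-4.
    if data.startswith(codecs.BOM_UTF32_LE):
        return None
    # Byte order mark: U+FEFF serialized big- or little-endian.
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    # No mark: look for "<?" with a zero byte beside each character.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    return None

# Hypothetical sample document, serialized big-endian with a byte order mark.
sample = codecs.BOM_UTF16_BE + '<?xml version="1.0"?><a/>'.encode("utf-16-be")
print(sniff_utf16(sample))
```

A document in an ASCII-compatible encoding starts with the single bytes "<?"
and so matches none of these patterns, which is why a wrong label on UTF-16
data shows up immediately rather than producing garbled but accepted text.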
Chris Lilley wrote:
> On the other hand, if the RFC had been written as I suggested, saying
> that a charset parameter overrode *if present* but that *if absent*, the
> rules in the XML recommendation were followed, then you would need no
> server reconfiguration and the rules to follow to have the encoding
> information correctly conveyed to the client would have been a matter of
> public record in the XML recommendation rather than private convention.
> A big win for interoperability, if that had happened.
At the *IETF*, the default value of the charset parameter for text/html *is*
ISO-8859-1. You might want to change this first, but that is going to be very
difficult or impossible, since the HTTP and MIME people will disagree.
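
The two policies being contrasted here can be written down as a short Python
sketch. Both function names and the xml_detected argument are hypothetical; the
ISO-8859-1 default is the one cited above for text/html, and nothing here is
taken from the text of any RFC.

```python
def encoding_with_protocol_default(charset, protocol_default="iso-8859-1"):
    """Policy with a protocol-level default: an absent charset parameter
    means the default applies, whatever the document itself declares."""
    if charset is not None:
        return charset.lower()
    return protocol_default

def encoding_with_xml_fallback(charset, xml_detected):
    """Policy suggested in the quoted message: the charset parameter wins
    if present; if absent, use what the XML-level rules determined."""
    if charset is not None:
        return charset.lower()
    return xml_detected

# An unlabeled UTF-16 document under each policy.
print(encoding_with_protocol_default(None))
print(encoding_with_xml_fallback(None, "utf-16-be"))
```

The two functions differ only when the charset parameter is absent, which is
exactly the case the disagreement is about.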
Chris Lilley wrote:
>
> On the other hand, if the RFC had been written as I suggested,
There has been a lot of discussion about this issue. None of your arguments
are new to me. In fact, my original opinion was not so different from yours, but
I changed my mind during the discussion. For more on this, see the archive
of the XML SIG (around April and May of 1998).
> Murata-san, you asked why a W3C team person was criticising this RFC in
> public. It is because the mission of W3C is to improve interoperability,
> so it is my duty to do so.
You might want to check what the W3C I18N WG has said to the XML CG. If
W3C strongly recommends the use of the charset parameter, the world will
change. XML is the last chance. I am strongly advocating the use of the
charset parameter in Japan whenever possible. On the other hand, if even a
W3C team member does not respect the consensus, there is not much hope.
Cheers,
Makoto
Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1