OASIS Mailing List Archives

Re: participating communities (was XML Blueberry)

Bullard, Claude L (Len) explained:

> He isn't saying that.  He is asking for proof before
> expenditure.  That is rational business thinking.
> Given that, Blueberry probably will pass.
> How did XML get this far without support for what
> you seem to say is fundamental Japanese?

(I speak for myself. Mr. Murata may have more in-depth experience.)

(1) We mostly aren't using UNICODE. We use the JIS character sets and rely
on the round-tripping, not worrying too much whether the intermediate
UNICODE version contains the characters we wanted or not. (From what I
understand, they haven't yet achieved full confidence in the round-tripping,
either, but using it is the only real way to debug it.)

(2) We/they basically are using what is available and hunting for ways to
make the rest work.

(3) UNICODE actually contains more characters than JIS. The Japanese
Industrial Standards group on characters is currently bringing JIS up to
level with UNICODE, but the fonts on machines when you buy them are still
usually limited to the original 6000+.

So we cannot _today_ really work with most of the characters that are
causing concerns in UNICODE. Most of these characters are presently handled
in the private use area when they are needed, which means that they can't
yet be used in tags if we wanted them to be. So we give up on them for the
time being.

I suspect that the common characters which are supposed to cause problems
are being mis-handled by the existing translation tables, leaving them
useable for tags until the tables get fixed.

(4) Neither UNICODE, nor the JIS character sets, deal directly with the
fundamental stuff. The JIS character sets were not originally intended to be
more than a stop-gap convenience for the printing industry, fitted to the
processing limitations of the day. They really wanted a solution that would
let them compose characters on the fly, just like English speakers can make
up new words on the fly. But UNICODE just uses the best
current practice.
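
As a rough illustration of points (1) and (3), here is a minimal Python sketch of the two checks discussed above: whether legacy-encoded bytes survive the trip through UNICODE and back, and whether a string leans on private use characters. The function names are made up for this example, and Python's built-in `shift_jis` codec stands in for whatever translation table a given tool really uses, so treat it as a way to probe the problem, not as a description of any particular product.

```python
import unicodedata


def round_trips(jis_bytes, codec="shift_jis"):
    """Return True if legacy bytes survive bytes -> Unicode -> bytes unchanged.

    `codec` is an assumption: swap in whichever JIS-family codec your
    data actually uses (for example "euc_jp" or "cp932").
    """
    try:
        text = jis_bytes.decode(codec)
        return text.encode(codec) == jis_bytes
    except UnicodeError:
        # Either the bytes were not valid in this codec, or the decoded
        # text could not be mapped back.
        return False


def private_use_chars(text):
    """List the characters in `text` that sit in a private use area."""
    # General category "Co" is Unicode's label for private use code points.
    return [ch for ch in text if unicodedata.category(ch) == "Co"]


if __name__ == "__main__":
    sample = "日本語".encode("shift_jis")
    print(round_trips(sample))
    print(private_use_chars("漢\ue000字"))
```

Any character reported by `private_use_chars` has no agreed meaning outside the system that assigned it, which is why such characters cannot be relied on in tag names.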

> Not to
> point fingers but to begin to look at the problems
> of XML versions and just how big one should be, we
> should understand if there was a flaw in the way
> the requirements were created.

The flaw, of course, was building straight on top of UNICODE. UNICODE, as
great as it is, kind of misses a point where Japanese is concerned.

Joel Rees

> Len
> http://www.mp3.com/LenBullard
> Ekam sat.h, Vipraah bahudhaa vadanti.
> Daamyata. Datta. Dayadhvam.h
> -----Original Message-----
> From: Murata Makoto [mailto:mura034@attglobal.net]
> Elliotte Rusty Harold wrote:
> > That argument was very unconvincing. He explained why the Japanese
> >support XML already has is quite important to Japanese users. But we
> >already have that. Nobody's arguing that we take it out. He said that
> >Microsoft should improve its tools to better support XML 1.0. Lord knows
> >I agree with that.
> >
> > He indicated one character from Unicode 2.0 and earlier that would
> >clearly be useful to Japanese users as a name character, the Katakana
> >middle dot. I've wondered before why that one got left out of name characters
> >in XML. It's worth fixing if we do revise XML, but by itself it doesn't
> >seem important enough to justify revising XML. Nothing he said was
> >relevant to the question of whether the additional characters in
> >Unicode 3.1 are necessary for Japanese users.
> I have pointed out that at least six characters in Extension B are
> more important than their variants in Unicode 2.0.
> U+23372
> U+26402
> U+23647
> U+2A602
> U+28A99
> U+28EEB
> In my previous mail, I have given a list of made-in-Japan kanjis which
> have been missing until Unicode 3.1.
> Do other people need more information?  Frankly, I do not want to be told
> not to use Japanese by Elliotte.
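
For readers who want to see where the six code points listed above sit, the following Python sketch may help. It only reports plane, block membership and encoded length; the helper name `describe` is invented for this example, and the block boundaries are those of the CJK Unified Ideographs Extension B block (U+20000 to U+2A6DF). It says nothing about which glyph variants are preferred in Japan.

```python
# Block range for CJK Unified Ideographs Extension B.
EXT_B_START, EXT_B_END = 0x20000, 0x2A6DF

# The six code points quoted in the message above.
code_points = [0x23372, 0x26402, 0x23647, 0x2A602, 0x28A99, 0x28EEB]


def describe(cp):
    """Summarise where a code point lives and how large it is when encoded."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "plane": cp >> 16,  # plane 0 is the BMP
        "in_extension_b": EXT_B_START <= cp <= EXT_B_END,
        "utf16_code_units": len(ch.encode("utf-16-be")) // 2,
        "utf8_bytes": len(ch.encode("utf-8")),
    }


if __name__ == "__main__":
    for cp in code_points:
        print(describe(cp))
```

All six fall outside the Basic Multilingual Plane, so they need a surrogate pair in UTF-16 and four bytes in UTF-8, and tools that assume one 16-bit unit per character will mishandle them.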