Re: XML Blueberry (non-ASCII name characters in Japan)

John Cowan commented:

> Joel Rees wrote:
> > I have been assuming that non-name characters are not allowed in tag
> > attributes.
> If by "attributes" you mean attribute names, then you are correct.
> If by "attributes" you mean attribute values, then you are incorrect
> as to general attributes, but still correct in the case of attributes
> with declared types other than CDATA.

Well, I was thinking, like a gaijin again (Huh?), that the extension is all
rarities, but Mr. Murata points out that it includes corrections for some
arguable mistakes from the Japanese standards, as well as some corrections
for arguable mistakes from Unicode's original unification. So, even if IDs,
etc., didn't matter so much, tags and attribute names may matter after all.

So the question goes back to how much damage is being taken on tag and
attribute names, IDs, IDREFs, etc. What about enumerated types? It looks
like we'll have restrictions there, too.
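
To make the restriction concrete, here is a rough sketch that checks only the Ideographic class of name characters, assuming the production given in Appendix B of XML 1.0 (Second Edition). The function name and sample characters are illustrative, and a full name check would also need the BaseChar, Digit, CombiningChar and Extender classes, which are left out here.

```python
# Ideographic production, XML 1.0 (Second Edition), Appendix B (assumed):
#   Ideographic ::= [#x4E00-#x9FA5] | #x3007 | [#x3021-#x3029]
XML10_IDEOGRAPHIC_RANGES = [
    (0x4E00, 0x9FA5),
    (0x3007, 0x3007),
    (0x3021, 0x3029),
]


def is_xml10_ideographic(ch):
    """Return True if ch falls inside the Ideographic production above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML10_IDEOGRAPHIC_RANGES)


# Hypothetical sample characters, one per block of interest.
samples = {
    "U+6D77 (original unified block)": "\u6d77",
    "U+3400 (Extension A)": "\u3400",
    "U+20000 (Extension B)": "\U00020000",
}

for label, ch in samples.items():
    verdict = "allowed" if is_xml10_ideographic(ch) else "not allowed"
    print(label, "->", verdict, "as an ideographic name character")
```

Under these ranges an extension character can still appear in a CDATA attribute value or in element content, but not in a tag name, attribute name, ID, IDREF, NMTOKEN or enumerated value.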

I don't know how much of a loss it incurs, and our Japanese friends tend to
beat around the bush and gloss over details in the assumption that
underlings will work out the details. Oh, Americans do that too, but only
when we don't have much to lose. Hopefully, Mr. Murata can provide more

> > Oh. I thought of another way around this issue. It is not presently a
> > satisfying solution, but may be the ultimate solution, if it would work:
> > ideographic sequences allowed in markup (tags and attributes)?
> No. As another poster noted, they are not unambiguous, and really are
> *descriptions* of ideographs rather than *constructors*, more analogous
> to a phrase like "'e' written backward with a dot underneath".

Sloppy thinking on my part. The idea of having two standard encodings for
the ideographs, a standard ideographic sequence for each code point,
intrigues me. It doesn't solve the rendering problem for rare and
non-standard characters, but I think it might help solve the parsing
problem, and therefore help with effective use of the private-use area.
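
The "description, not constructor" point quoted above can be seen in a small sketch using Python's standard unicodedata module. The particular sequence chosen for U+6D77 is only an illustrative assumption, not a standardized mapping, and the helper name is made up for this example.

```python
import unicodedata

# U+2FF0 is the "left to right" ideographic description character.
IDC_LEFT_TO_RIGHT = "\u2ff0"

# One plausible description of U+6D77: water radical on the left,
# U+6BCF on the right. (Illustrative choice, not a standard table entry.)
ids = IDC_LEFT_TO_RIGHT + "\u6c35" + "\u6bcf"
target = "\u6d77"


def composes_to(sequence, char):
    """Return True if any composing normalization form turns sequence into char."""
    return any(
        unicodedata.normalize(form, sequence) == char
        for form in ("NFC", "NFKC")
    )


# A real combining sequence composes: 'e' + combining acute becomes U+00E9.
print("e + acute composes:", composes_to("e\u0301", "\u00e9"))

# The description sequence stays three separate code points.
print("IDS composes:", composes_to(ids, target))
print("IDS length in code points:", len(ids))
```

So a parser that wanted to treat such a sequence as equivalent to a code point would need its own agreed table of one standard sequence per character, which is the missing piece in the idea above.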

> > To be a tree supporting all information,
> >   giving root to the chaos
> >     and branches to the trivia,
> >       information breathing anew --
> >         This is the aim of Yggdrasill.
> Fine, fine.  But is there a one-eyed man hanging from the branches?

Yggdrasill is the name here in Japan of our brand new native XML database
product. It is not yet officially announced internationally. We'll be
demonstrating it at Extreme Markup.

I suppose I could claim to be Odin, but no one would believe me.

> --
> There is / one art             || John Cowan <jcowan@reutershealth.com>
> no more / no less              || http://www.reutershealth.com
> to do / all things             || http://www.ccil.org/~cowan
> with art- / lessness           \\ -- Piet Hein

Joel Rees
programmer -- rees@mediafusion.co.jp
============================XML as Best Solution===
Media Fusion Co. ,Ltd.  株式会社メディアフュージョン
Amagasaki  TEL 81-6-6415-2560    FAX 81-6-6415-2556
    Tokyo TEL 81-3-3516-2566   FAX 81-3-3516-2567