
Re: XML Blueberry (non-ASCII name characters in Japan)

From: Rick Jelliffe <ricko@allette.com.au>
To: xml-dev@lists.xml.org
Date: Tue, 10 Jul 2001 19:16:00 +0800

From: "Thomas B. Passin" <tpassin@home.com>
> So, you CJK-obscure-coding unicode experts out there, what's the betting on
> how the characters will get into people's text-producing programs?  Will
> people be typing these new characters into documents with abandon?

Same as now. If someone writes a Spanish n with a tilde in their DTD, and
your editor is an ASCII editor, it cannot edit it. If you are lucky it will
preserve it. If you are unlucky it will corrupt it.
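
To make the lucky and unlucky cases concrete, here is a minimal Python sketch.
The name "año" and both helper functions are hypothetical; they only mimic
what a tool with the wrong idea of the encoding might do, not any particular
editor.

```python
# Hypothetical element name containing a Spanish n with a tilde (U+00F1).
name = "año"


def save_as_ascii(text: str) -> bytes:
    """Mimic an ASCII-only tool that substitutes '?' for what it cannot store."""
    return text.encode("ascii", errors="replace")


def read_utf8_as_latin1(text: str) -> str:
    """Mimic a tool that is handed UTF-8 bytes but decodes them as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


corrupted = save_as_ascii(name)                   # unlucky: the letter is lost
mojibake = read_utf8_as_latin1(name)              # unlucky: the letter is garbled
preserved = name.encode("utf-8").decode("utf-8")  # lucky: the round trip is clean
```

In both unlucky cases the name silently becomes a different name, so the
document no longer matches its DTD.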

There are no numeric character references in names.  So a name is always
readable in a text editor which accepts the encoding; there are never any
references which need to be dereferenced.
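
A quick way to see this is to hand a parser a name written with a numeric
character reference. The sketch below uses Python's standard
xml.etree.ElementTree; the element names and the parses() helper are made up
for illustration.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if doc is accepted as well-formed XML by the parser."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The literal character is allowed in a name.
literal_name = parses("<año/>")

# The same character written as a numeric character reference is not
# allowed inside a name, so the document is rejected.
referenced_name = parses("<a&#241;o/>")

# In character data the reference is fine.
reference_in_content = parses("<a>&#241;</a>")
```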

Of course, if I wanted to make an obscure DTD, I could use Greek (if you
cannot read Greek) or some cartoonish mix of characters.  But then it is
obvious.

Restricting names to letters and other symbols that are typically used for
pronounceable, readable words in each language is good for catching
transcoding errors (important in some places), and it allows easier use of the
names as object names in scripts (where you don't want them to start with a
digit). But very importantly, it acts against people making random (i.e.
private/proprietary) names in their DTDs as a way to capture users.  They
can still do it, of course, but they cannot pretend "oh, we didn't know a
name should be readable so we just used UUIDs for all our names", batting
their eyelids.
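
As a rough illustration of the scripting point, the sketch below uses Python's
own identifier rule as a stand-in for "usable as an object name in a script".
That stand-in is an assumption: it is not the XML name grammar, and the
candidate names and the helper function are invented for the example.

```python
def usable_as_script_name(xml_name: str) -> bool:
    """Approximate 'can be used as an object name' with Python's identifier rule."""
    return xml_name.isidentifier()


# Hypothetical candidate names: two readable words, one digit-leading name,
# and one UUID-style name.
candidates = [
    "año",
    "名前",
    "2fast",
    "550e8400-e29b-41d4-a716-446655440000",
]

usable = {name: usable_as_script_name(name) for name in candidates}
```

Readable words in any script pass, while the digit-leading and UUID-style
names do not. Note that the stand-in is stricter than XML in one respect: a
hyphenated name such as first-name is a legal XML name but not a Python
identifier.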

Cheers
Rick Jelliffe
