- From: Jonathan Borden <email@example.com>
- To: James Clark <firstname.lastname@example.org>, Tim Bray <email@example.com>, firstname.lastname@example.org
- Date: Tue, 10 Jul 2001 23:40:42 -0400
> > Realistically, there are 3 options:
> > 1. Leave it the way it is.
> > 2. Do Blueberry and then repeat the process for Unicode 3.2
> > and 4.0 and so on every couple of years forever.
> > 3. Bite the bullet, write the rules in terms of Unicode
> > metadata and go to a pure use-by-reference architecture,
> > probably adding a syntactic signal to reference the
> > Unicode version number.
> I don't find any of these options very appealing.
> Another bullet one could bite is to no longer make checking of name
> characters (beyond what is needed to prevent ambiguity) a part of
> well-formedness. Whilst it's nice to have some sanity checking of names,
> using inappropriate characters in names doesn't cause problems for further
> processing layers to the same extent as other things that are part of
> well-formedness do, such as unbalanced tags or duplicate attributes.
> At least I think one should consider easing draconian error handling for bad
> name characters to reduce deployment problems with option 2.
Perhaps I might paraphrase this by suggesting that we define what is not
allowed in a name rather than what is. At a minimum, the characters not
allowed in a name would be those needed to prevent ambiguity
(whitespace,">,)|=*+"). Consider the element:
<O'Hara> shrug </O'Hara>
Well, all my current documents would remain well-formed so I don't see how
allowing this would adversely affect me. I could always refuse to read such
newfangled documents just as I refuse to read HTML email :-)
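
To make the "define what is not allowed" idea concrete, here is a rough sketch in Python. It is only an illustration, not a proposed rule: the function name is_permissive_name and the constant FORBIDDEN are made up for this example, and the excluded set is just whitespace plus the punctuation listed above. A real rule would have to be worked out against the full grammar and would likely exclude more.

```python
# Hypothetical sketch: validate a name by exclusion rather than inclusion.
# FORBIDDEN holds only the punctuation listed in the message above; it is
# an illustrative assumption, not a complete or proposed rule.
FORBIDDEN = set('>,)|=*+')


def is_permissive_name(name: str) -> bool:
    """Return True if name is non-empty and contains no whitespace
    and no character from FORBIDDEN."""
    if not name:
        return False
    return not any(ch.isspace() or ch in FORBIDDEN for ch in name)


if __name__ == "__main__":
    for candidate in ["O'Hara", "para", "a=b", "two words"]:
        print(candidate, is_permissive_name(candidate))
```

Under this check "O'Hara" passes, since the apostrophe is not in the excluded set, while "a=b" and "two words" are rejected. Any name that is legal today and avoids the excluded characters still passes, which is the sense in which current documents are unaffected.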