OASIS Mailing List Archives



RE: Blueberry/Unicode/XML

> > Realistically, there are 3 options:
> >
> > 1. Leave it the way it is.
> > 2. Do Blueberry and then repeat the process for Unicode 3.2
> >    and 4.0 and so on every couple of years forever.
> > 3. Bite the bullet, write the rules in terms of Unicode
> >    metadata and go to a pure use-by-reference architecture,
> >    probably adding a syntactic signal to reference the
> >    Unicode version number.
> I don't find any of these options very appealing.
> Another bullet one could bite is to no longer make checking of name
> characters (beyond what is needed to prevent ambiguity) a part of
> well-formedness.  Whilst it's nice to have some sanity checking of names,
> using inappropriate characters in names doesn't cause problems for further
> processing layers to the same extent as other things that are part of
> well-formedness do, such as unbalanced tags or duplicate attributes.
> At least I think one should consider easing draconian error handling for
> bad name characters to reduce deployment problems with option 2.

Perhaps I might paraphrase this by suggesting that we define what is not
allowed in a name rather than what is. At the very least, the characters
not allowed in a name are those needed to prevent ambiguity
(whitespace,">,)|=*+"). Consider the element:

<O'Hara> shrug </O'Hara>

Well, all my current documents would remain well-formed, so I don't see how
allowing this would adversely affect me. I could always refuse to read such
newfangled documents, just as I refuse to read HTML email :-)
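
For what it's worth, the "define what is not allowed" rule can be sketched in
a few lines of Python. This is only an illustration, not a proposal for the
actual production: the function name is_permissive_name is hypothetical, the
excluded set is the one listed above, and I have assumed that "<" and "/"
would also have to be excluded since they delimit tags.

```python
# Characters excluded from names, taken from the list in this message.
LISTED_IN_MESSAGE = set(' \t\r\n') | set('">,)|=*+')

# Assumption (not in the list above): tag delimiters must be excluded too.
ASSUMED_EXTRA = set('</')

FORBIDDEN = LISTED_IN_MESSAGE | ASSUMED_EXTRA


def is_permissive_name(name):
    """Return True if name is non-empty and has no excluded character.

    Everything not explicitly excluded is accepted, so no table of
    Unicode name characters is needed.
    """
    return bool(name) and not any(ch in FORBIDDEN for ch in name)


if __name__ == "__main__":
    for candidate in ["para", "O'Hara", "a b", "a=b"]:
        print(repr(candidate), is_permissive_name(candidate))
```

Under this rule "para" and "O'Hara" are both accepted while "a b" and "a=b"
are rejected, and nothing in the check depends on a Unicode version.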