OASIS Mailing List Archives



RE: Blueberry/Unicode/XML

John Cowan wrote:
> Jonathan Borden scripsit:
> > Aside from perhaps arbitrary (perhaps not :-) decisions about what
> > characters ought or ought not be used to name things, what are these
> > "good reasons"?
> >
> > I specifically include in "good reasons":
> >
> > 1) useful pieces of code that would break
> > 2) hindrances to the development of useful pieces of code
>
> The main point is that it wouldn't be plain text any more.  If XML is
> just a binary format, something that no human being ever looks at, then
> ASCII markup is plenty: you can tag everything x1, x2, x3, ....

Hmmm... I thought that Unicode _was_ plain text, at least it says that it
is. I am not suggesting that we not represent XML as a sequence of Unicode
characters, nor am I suggesting that we allow characters in element names
that are not allowed in text content.
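
To make the name/content distinction concrete, here is a minimal sketch
in Python using the standard library's expat-based parser. The helper
name is_well_formed and the sample documents are my own illustrations,
not part of any proposal; the sketch only shows that, under XML 1.0 as it
stands, an apostrophe is accepted in text content but rejected in an
element name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts doc as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The apostrophe is ordinary character data inside an element...
in_content = is_well_formed("<name>O'Hara</name>")

# ...but it is not a name character, so it cannot appear in a tag name.
in_name = is_well_formed("<O'Hara>name</O'Hara>")

print(in_content, in_name)
```

So the set of characters legal in names is already a strict subset of
those legal in content; the question is only how large that subset
should be.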

> But there are many Unicode characters that are very similar to others,
> such as the halfwidth-fullwidth case that's been talked about already,
> or the 127 (:-)) kinds of stars, or the various kinds of whitespace
> that aren't, and so on.
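
For anyone who wants to see the look-alike cases concretely, here is a
small Python sketch using the standard unicodedata module. The variable
names and the choice of "star" as the sample name are just my
illustration. It shows a fullwidth spelling that compares unequal to its
ASCII twin until compatibility normalization (NFKC) is applied, and a
space-like character that is not whitespace as far as XML is concerned.

```python
import unicodedata

ascii_name = "star"
# The same word spelled with FULLWIDTH LATIN SMALL LETTERs (U+FF41 block).
fullwidth_name = "\uff53\uff54\uff41\uff52"

# Code point by code point, the two names are different strings.
same_raw = ascii_name == fullwidth_name

# NFKC folds the fullwidth compatibility forms onto the ASCII letters.
same_nfkc = unicodedata.normalize("NFKC", fullwidth_name) == ascii_name

# XML 1.0 treats only these four characters as whitespace (production S).
XML_WHITESPACE = {"\x20", "\x09", "\x0d", "\x0a"}

nbsp = "\u00a0"  # NO-BREAK SPACE: looks like a space on screen
nbsp_is_xml_space = nbsp in XML_WHITESPACE

print(same_raw, same_nfkc, nbsp_is_xml_space)
```

Note that XML processors compare names code point by code point and do
not normalize, so the two spellings above would be two different
element names.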

I don't see the big difference between:

<shrug> O'Hara </shrug>

and

<O'Hara> shrug </O'Hara>

... if 127 kinds of stars pose a problem for humans reading element names,
surely they will pose the same problem for humans reading element content,