OASIS Mailing List Archives

RE: Blueberry/Unicode/XML

Elliotte Rusty Harold wrote:
> At 7:27 PM -0400 7/11/01, HUGHES,MARK (Non-HP-FtCollins,ex1) wrote:
> >   Right.  If XML is to be modified for this, it should be done once
> >and only once.  Allowing all but syntax-specific characters is the
> >simplest and best way to do that, with the minimum amount of argument
> >over what is and is not a "reasonable" character to put in a name.
> >
> There's a lot of nasty stuff hiding in the bowels of Unicode that
> will cause problems if it isn't specifically considered and
> rejected. For example, would you really want to allow an em space as
> an element name? How about the zero-width non-joiner? or the
> right-to-left mark? There are good reasons to limit XML names to
> Unicode name characters.

Aside from perhaps arbitrary (perhaps not :-) decisions about which characters
ought or ought not to be used to name things, what are these "good reasons"?

I specifically include in "good reasons":

1) useful pieces of code that would break
2) hindrances to the development of useful pieces of code

I am not limiting the list to these two, but I would like to develop a
practical way of deciding these very important issues. Clearly, however this
is decided, tradeoffs will have to be made, and I want to give strong weight
to practical consequences -- just to be clear, I place a high value on the
ability of humans to read XML, including its markup.

But honestly, I am hardly a Unicode expert. It is just my perhaps naive
impression that whatever nasty, confusing problems might occur when using
weird Unicode characters in names could just as easily be replicated using
nasty, confusing -- yet well-formed -- names in XML as it stands today.
Please educate me otherwise (i.e. this is just my impression).
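
To make that impression concrete, here is a minimal sketch in Python, using
only the standard library (unicodedata and xml.etree.ElementTree, which is
backed by expat). The names suspects, is_well_formed, latin and mixed are
made up for illustration. It assumes, as an example of my own choosing, that
a Latin "a" swapped for a Cyrillic "a" counts as a "nasty confusing" name.
It is not a claim about how any particular tool behaves beyond this one
parser.

```python
import unicodedata
import xml.etree.ElementTree as ET

# The three characters named earlier in the thread (hypothetical dict name).
suspects = {
    "EM SPACE": "\u2003",
    "ZERO WIDTH NON-JOINER": "\u200c",
    "RIGHT-TO-LEFT MARK": "\u200f",
}
# Unicode general category of each: Zs = space separator, Cf = format control.
categories = {name: unicodedata.category(ch) for name, ch in suspects.items()}


def is_well_formed(text: str) -> bool:
    """Return True if this parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# An em space inside an element name is rejected under the current rules.
em_space_name_ok = is_well_formed("<a\u2003b/>")

# Two names that look alike on screen but differ in one code point.
# Both are legal XML 1.0 names today, since Cyrillic letters are name characters.
latin = "data"
mixed = "d\u0430ta"  # second letter is CYRILLIC SMALL LETTER A
doc = f"<root><{latin}>1</{latin}><{mixed}>2</{mixed}></root>"

root = ET.fromstring(doc)
tags = [child.tag for child in root]

print(categories)
print("em space allowed in a name:", em_space_name_ok)
print("look-alike names are distinct:", tags[0] != tags[1])
```

The second half is the point: the document parses without complaint, yet a
human reading the markup would likely take the two child elements to have
the same name.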