On Thursday 20 December 2001 11:04 am, Champion, Mike wrote:
> > Rick's point is that these are not text. The fact that they
> > are characters in Unicode is more an accident of history than
> > anything else, I think.
> Perhaps, but the job of determining what is and is not a "character,"
> sorting out the numerous issues surrounding platforms and human languages,
> and explaining decisions in an authoritative manner is the Unicode
> consortium's role in the world, not the W3C's.
While I agree that the Unicode consortium should be in charge of defining
what is, and what is not, a character, I think it *certainly* is the prerogative
of the W3C, or of any other standards organization, to define the use thereof.
That work has to be done knowledgeably, and in recognition of the fact that
the Unicode consortium, like everyone else, is subject to compromise (take a
look at Korean, for example).
> I don't mean to disrespect the excellent work that went into XML 1.0,
> but in 20:20 hindsight it appears that the WG went a bit too far in
> second-guessing Unicode.
I don't think so. We had these debates in the WG knowing (speaking for myself)
that at some point these things would have to be revisited. I should
note that the WG included a number of experts in the field (in fact, a
significant portion of the XML WG were I18N/Unicode experts of one form or
another... certainly many were far better versed in I18N/Unicode than is
common). If there is any fault there, it is that we erred on the side of
interoperability (that is, we were overly conservative). Almost all of the WG members were
also experts in XML/SGML markup of *documents*. A few people had some desire
to use XML for more than that.
I think the guiding principles behind most of the decisions made hold true
*so long as the desire for XML to be a text/document markup language* stands.
> It seems to me that XML 1.1 will be made simpler, more modular, and more
> widely useful outside SGML's traditional domain by layering the XML data
> model and serialization rules on top of the Unicode character model.
I don't think so... though I *will* say that I think a lot of the checking
(like name validation etc.) could/should be a validation constraint.
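To make that concrete, here is a rough sketch of what I mean by moving name
checking into a separate validation pass. Everything in it is hypothetical:
the function name is made up, a lenient first pass is assumed to have already
pulled out the element names, and the pattern is a simplified ASCII-only
stand-in, not the actual XML 1.0 Name production.

```python
import re

# Simplified, ASCII-only stand-in for the XML 1.0 Name production.
# This is an assumption for illustration; the real production admits
# many more Unicode characters.
ASCII_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")


def name_violations(names):
    """Return the names that fail the simplified name rule, in input order."""
    return [n for n in names if ASCII_NAME.fullmatch(n) is None]


if __name__ == "__main__":
    # Names assumed to come from a lenient first pass over the document.
    print(name_violations(["para", "2col", "xml:lang", "a b"]))
```

The point is only that a processor could accept the document first and report
bad names afterwards as validation errors, rather than refusing to parse it.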
I think it's important to stick with the guidelines that produced XML, which
emphasize human legibility and interoperability.