   Re: [xml-dev] UTF-8+names


Mike Champion wrote:

> Tim's approach is taking a real, widespread problem
> and offering a clean, layered solution -- essentially
> a character encoding preprocessor -- rather than
> changing XML itself.

Exactly. It's a syntax macro (just like namespaces). So the 
*problem* of resolving entities hasn't gone away; specifically, that 
problem is one of getting your entities from one place to another 
and back again. Unless, of course, we all upgrade to use +names.
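
To make the "one place to another and back again" concern concrete, 
here is a minimal Python sketch of such a preprocessing layer. It is 
an illustration under assumptions, not the draft's algorithm: the 
names come from Python's html.entities tables (HTML4 names only, no 
MathML), the helper names decode_names and encode_names are 
hypothetical, and I am assuming the XML predefined entities are left 
alone for the XML parser to handle.

```python
import re
from html.entities import name2codepoint, codepoint2name

# Assumption: these names stay escaped so the XML parser still sees them.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_NAME_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_names(data: bytes) -> str:
    """Decode UTF-8 bytes, then expand known &name; sequences."""
    text = data.decode("utf-8")

    def expand(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return _NAME_REF.sub(expand, text)


def encode_names(text: str) -> bytes:
    """Write non-ASCII characters that have a name as &name;, then encode."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name is not None:
            out.append("&" + name + ";")
        else:
            out.append(ch)
    return "".join(out).encode("utf-8")
```

A sender and receiver that both have this layer round-trip cleanly; 
a receiver without it sees the literal text &Oacute; instead of the 
character, which is the transport problem I mean.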

I'd love to know what SOAP-oriented folks think about it.

> Actually, a similar idea came up at the Binary Infoset
> workshop, to leverage/exploit the fact that the XML
> spec allows an open-ended set of encodings. This
> allows experimentation WITHOUT "corrupting" the core
> spec with support for local languages, stuff of
> interest mainly to mainframes, or more efficiently
> transmittable and/or lexable serializations.  

It also potentially hurts interop - there are costs and benefits to 
be weighed up.

> IMHO, it extends the Unicode encoding layer upwards to
> remove a wart in XML, not vice-versa.

I disagree :) It seems to be fixing a wart (or a gap in the market) 
of Unicode transformations, not of XML. Otherwise why would it be 
useful outside XML?

> Anyway, I think this is a great idea, and I
> congratulate Tim for working it out and moving it
> forward.  

I like it at first glance, but the current draft is too vague. I 
suspect the impact of this encoding is greater than Tim is giving 
it credit for - so I'm not buying arguments from idiotic simplicity 
just yet.

I'd like to:

  o see the encoding name changed to "UTF-8+entities"; the current 
name is rather vague.

  o see examples of escaped whitespace.

  o know whether <w&oacute;oops/> is a legal element name in this 

  o hear a rationale, other than use outside XML, for choosing a new 
encoding to solve this problem, ie why not xml:entities="yes" or 
some other approach?

  o know whether the current MathML/HTML4 sets are sufficient; ie 
are we going to need to reversion this in a couple of years to cater 
for Ogham?

  o like elharo and Alessandro, I'm unconvinced about the treatment 
of &; specifically, that it isn't being overloaded in some 
clever/sneaky way. Indeed, I'll claim that & /is/ being overloaded 
in some clever/sneaky way until the next draft shows me otherwise.
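
One way to probe the element-name question in the list above is to 
compare what an ordinary XML parser does with and without a 
name-expanding step in front of it. The sketch below assumes 
expansion happens before the parser sees any markup, which the draft 
may or may not intend. It uses Python's html.unescape as a crude 
stand-in for that step (it also expands &amp; and &lt;, which a real 
layer presumably would not), and parses() is a hypothetical helper.

```python
import html
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if ElementTree's XML parser accepts the string."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


raw = "<w&oacute;oops/>"

# Stand-in for the encoding layer: expand names before parsing.
expanded = html.unescape(raw)

print(parses(raw))       # False: "&" cannot appear inside an element name
print(parses(expanded))  # True: the parser only ever sees the expanded name
```

So under that assumption the answer is "legal, because the parser 
never sees the &", which is exactly the sort of thing I'd like the 
draft to say out loud.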

Bill de hÓra
Technical Architect



Copyright 2001 XML.org. This site is hosted by OASIS