OASIS Mailing List Archives

   Re: [xml-dev] UTF-8+names


Bill de hÓra wrote:

> Even so, I'd like some words around whitespace - it's great for
> thrashing out details. And I was thinking of cases like
>   <foo& ;bar="baz"& ;/>

This is clearly not well-formed XML, and that is the same in UTF-8 and +names. 
The issue is whether +names needs to declare any additional rules about 
what can come between the & and ;.  At the moment I think not; since 
there's an exhaustive list of replacements, and a statement that 
anything else is just passed through, I think that covers it.
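To make the "exhaustive list plus pass-through" rule concrete, here is a minimal Python sketch of what the decoding step might look like. The NAMES table is a hypothetical two-entry stand-in for the draft's real list, and both decode_names and the pattern for what counts as a name are my assumptions, not text from the draft.

```python
import re

# Hypothetical two-entry subset; the draft's actual replacement list is
# much longer and is not reproduced here.
NAMES = {"oacute": "\u00f3", "eacute": "\u00e9"}

# Assumption: a candidate reference is '&', a run of ASCII letters/digits, ';'.
_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def decode_names(text: str) -> str:
    """Replace &name; when name is in the table; pass anything else through."""
    return _REF.sub(lambda m: NAMES.get(m.group(1), m.group(0)), text)
```

With this sketch, "<w&oacute;oops/>" has its reference replaced by the single character ó, while a string such as "& ;" or a name missing from the table comes out exactly as it went in, so no extra rule about what sits between the & and ; is needed.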

>>> know whether <w&oacute;oops/> is a legal element name in this proposal. 

>> Yes.  To XML, it's an empty-tag whose name contains six characters,
>> all of them legal in names.
> *Definitely* mention this in the next draft.

Yes, I guess there needs to be a "considerations for use in XML" section.
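As a rough illustration of the element-name point, the Python sketch below checks what an XML parser sees before and after the replacement. The single str.replace call is only a stand-in for a real +names decoder, and the helper names element_name and is_well_formed are made up for this example.

```python
import xml.etree.ElementTree as ET

raw = "<w&oacute;oops/>"

# Stand-in for the +names decoding step: one hard-coded replacement,
# not the draft's full table.
decoded = raw.replace("&oacute;", "\u00f3")

def element_name(doc: str) -> str:
    """Return the name of the document element as the XML parser sees it."""
    return ET.fromstring(doc).tag

def is_well_formed(doc: str) -> bool:
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False
```

Here element_name(decoded) gives a six-character name, while is_well_formed(raw) is False, since a receiver that treats the bytes as plain UTF-8 hits the & inside the tag.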

> Then I see an issue with +names. Once the prolog is lost I'm
> hosed. More and more XML is being shipped around inside envelopes,
> which are probably UTF-8 or iso-latin. Have we lost some
> self-description in this encoding, by it doubling up as a macro and
> an encoding (although that's a somewhat fuzzy distinction)?

There is a well-known problem with XML, namely XML documents tend to not 
nest nicely inside other XML documents.  I think we're stuck with that 
one.  It's totally the case that if you're going to use something like 
+names, you own the responsibility for ensuring that anyone to whom you 
send gets told firmly what it is.  The encoding declaration helps a lot, 
but there are going to be cases where breakage will occur unless you put 
in some extra effort.

> Ok. You could start by addressing Elliote's, Alessandro's and Mike's
> posts on the matter. Then use the answers to make the draft clearer

I'll rev the draft one more time, but that's about it.  As I've said, 
this is a trial balloon. There are substantial communities - remarkably 
unrepresented here on xml-dev - who have been complaining vociferously 
because neither W3C XML Schemas nor RelaxNG shows any sign of addressing 
the entity problem, and they claim they really need entities, mostly for 
this problem of naming characters.  If nothing else, +names is a thought 
experiment which should clarify these people's issues.  So far we've 
heard a few voices saying "This might be useful" and a few saying "This 
is an abomination". So far we haven't really heard very much from the 
constituencies whose issues this is designed to address.  If this 
silence continues, the conclusion will be obvious.

Cheers, Tim Bray (http://www.tbray.org/ongoing/)



Copyright 2001 XML.org. This site is hosted by OASIS