> -----Original Message-----
> From: Seairth Jacobs [mailto:seairth@seairth.com]
> Sent: Saturday, October 18, 2003 15:09
> To: xml-dev
> Subject: Re: [xml-dev] UTF-8+names
>
>
> From: "Tim Bray" <tbray@textuality.com>
> >
> > Check out http://tbray.org/tag/utf-8+names.html
>
> Instead of throwing an error for a missing semicolon, why not
> let that case fall under section 4? Since the set of
> pre-defined reference names is known, you also know the
> maximum length possible. If you reach length+1 characters,
> you know it's not a reference and leave it as-is. It also
> means that a lone ampersand won't be an error. I suspect the
> motivation to require the semicolon is for consistency with
> XML's same requirements. However, this disallows some valid
> UTF-8 from being valid UTF-8+names as well. While this may
> be fine if used solely within the context of XML (which would
> throw an error as soon as it hit the invalid reference
> anyhow), I suspect people will try using this outside of XML as well.
>
> Also, I suspect that using the same format as XML/SGML's
> references is going to confuse people. Maybe use a similar
> format such as #name; or @name;. This way, at least the two
> references (UTF-8+name and XML) would not be confused.
I think there are other problems.
As I understand it, in UTF-8+name an ampersand is represented as &&;, which
means that, if UTF-8+name is used for XML, "normal" entity references will
look like:
&&;myentity;
and numeric character references will look like:
&&;#12345;
which is ugly.
In addition, "UTF-8+name entities" would have the usual syntax:
&lt;
but this can be confusing because it would denote a **literal** < character,
not one obtained by including the entity. As a consequence, I would *not*
be allowed to use &lt; in the value of an attribute, for example, but
would have to use &&;lt; for the same purpose.
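A minimal sketch of the escaping described above, assuming (as I understand the proposal) that a literal ampersand is written &&; in UTF-8+name. The function name is hypothetical, and this only illustrates how XML references would look after escaping; it is not an implementation of the proposal:

```python
def escape_ampersands(xml_text: str) -> str:
    """Write every '&' as '&&;' (the assumed escape for a literal ampersand)."""
    return xml_text.replace("&", "&&;")


# XML references as they would have to be written inside UTF-8+name text.
for ref in ("&myentity;", "&#12345;", "&lt;"):
    print(ref, "->", escape_ampersands(ref))
```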
I think that these problems can be overcome by using something different
from the ampersand, as you suggest.
As for the final ; being mandatory, I believe it should be, for
robustness.
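To make the trade-off concrete, here is a rough sketch of the lenient rule from the quoted message: look for a semicolon within the maximum name length after an ampersand, and leave the text alone if no known name turns up. The NAMES table and the function name are hypothetical stand-ins, not the actual name set of the proposal:

```python
# Hypothetical subset of predefined names; the real set is defined by the proposal.
NAMES = {"lt": "<", "gt": ">", "eacute": "\u00e9"}
MAX_NAME_LEN = max(len(name) for name in NAMES)


def decode_lenient(text: str) -> str:
    """Replace &name; for known names; pass anything else through unchanged."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "&":
            # Search for ';' no further than MAX_NAME_LEN characters past the '&'.
            end = text.find(";", i + 1, i + 2 + MAX_NAME_LEN)
            if end != -1 and text[i + 1:end] in NAMES:
                out.append(NAMES[text[i + 1:end]])
                i = end + 1
                continue
        # Not a recognized reference: keep the character as-is.
        out.append(text[i])
        i += 1
    return "".join(out)
```

Under this rule a mistyped reference with no semicolon is never reported; it simply passes through as literal text, which is the kind of silent failure a mandatory ; would catch.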
It is not very clear to me where UTF-8+name would be useful, as I don't
think it is useful in XML. Is it being proposed for use in areas where, for
some reason, XML cannot be used?
Alessandro Triglia
OSS Nokalva
>
> ---
> Seairth Jacobs (seairth@seairth.com)
> Looking: http://www.seairth.com/blog/65
>
>