


   Re: [xml-dev] UTF-8+names


Asking for requirements is always a good idea.

I think users want not to lose the following when they use non-DTD

- Internal entities for common well-known entity sets, like those of XHTML,
MathML, etc.
- Internal entities for user-defined shorthand
- External parsed entities (includes)

If internal entities with simple definitions (i.e., no use of parameter
entities) were the sole requirement and users were willing to have all the
entities used in a document defined in the internal subset of every
document, then you are right. An editor could easily insert those

Bob Foster

From: "Alessandro Triglia" <sandro@mclink.it>
To: "'Bob Foster'" <bob@objfac.com>; "'Tim Bray'" <tbray@textuality.com>;
"'Miles Sabin'" <miles@milessabin.com>
Cc: <xml-dev@lists.xml.org>
Sent: Sunday, October 19, 2003 6:17 PM
Subject: RE: [xml-dev] UTF-8+names

I have another comment.

What is it that those users have actually been asking for?  What is their
actual need?  Do they want to be able to display and/or enter a rare
character when using a user interface that doesn't support that character?

If so, isn't this entirely a software issue?  Can't existing XML browsers
and editors just be extended so as to support *names* for characters, and we
leave the encodings alone?

For example, if I want to enter a Щ (the Cyrillic character) using a
keyboard that does not support cyrillic, I can currently use some
OS-specific means (say, the character map applet in Windows plus a
copy/paste).    If an XML editor had the inherent ability to accept any
Unicode character by opening a dialog box showing a list of Unicode names,
that would be sufficient for many purposes.  Likewise, if an XML viewer had
the ability to display the Unicode name of a rare Unicode character when the
cursor is above a character, that could be sufficient for many purposes.
If some program needs to cope with display hardware that doesn't know how to
display a Щ, the software itself can be written so as to show the Unicode
name of the Щ (CYRILLIC CAPITAL LETTER SHCHA) or some shorter local
designation, instead of a small square.
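
As a rough illustration of the name-based entry and display described above, here is a minimal Python sketch built on the standard unicodedata module. The function names describe and from_name are made up for this example and are not taken from any particular editor or viewer.

```python
import unicodedata


def describe(ch: str) -> str:
    """Label a viewer could show (e.g. in a tooltip) instead of a small square."""
    # unicodedata.name raises ValueError for unnamed code points
    # unless a default is supplied, so a placeholder is passed here.
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name}"


def from_name(name: str) -> str:
    """Character an editor could insert after the user picks a Unicode name."""
    # unicodedata.lookup raises KeyError if the name is unknown.
    return unicodedata.lookup(name)


if __name__ == "__main__":
    shcha = from_name("CYRILLIC CAPITAL LETTER SHCHA")
    print(describe(shcha))
```

Nothing in this sketch touches the document encoding: the name is used only at the user interface, and the file still stores the character itself.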

Recalling one of the cases mentioned, can't an XML editor (as a
product-specific feature) allow the user to enter  something like   &nbsp;
and change it on the fly to the Unicode character   NON-BREAK SPACE
depending on the context?  Can't this XML editor subsequently display a tool
tip over the character?   Do we really need to  *encode*  the NON-BREAK
SPACE  as a byte sequence  & n b s p ;   ?
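
A product-specific feature of this kind might look roughly like the Python sketch below. It assumes the editor recognizes the HTML entity names listed in the standard html.entities module; the function name expand_typed_entities and the choice to leave the XML predefined entities alone are assumptions made for the example, not a description of any existing editor.

```python
import re
from html.entities import name2codepoint

# Matches a complete named reference such as "&nbsp;" as the user typed it.
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

# The five entities predefined by XML stay as typed, so that markup
# characters such as "<" and "&" are not introduced into the text.
_XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}


def expand_typed_entities(text: str) -> str:
    """Replace known named references with the characters they stand for."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in _XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # unknown or reserved: leave untouched
        return chr(name2codepoint[name])

    return _ENTITY.sub(replace, text)


if __name__ == "__main__":
    typed = "10&nbsp;km"
    stored = expand_typed_entities(typed)
    print([f"U+{ord(c):04X}" for c in stored])
```

The stored text then contains U+00A0 itself, so what is written to disk is the ordinary encoding of that character rather than the six-character sequence the user typed.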

What fraction of those use cases would be left out, if the issue were
regarded as a software issue?




Copyright 2001 XML.org. This site is hosted by OASIS