   Re: [xml-dev] Specifying a Unicode subset


No, my point is that XML should have nothing at all to say about 
character encodings other than to specify an already standard character 
encoding (preferably one of the Unicode representations).

Character encoding is a huge hassle for programmers to deal with.  And 
if history is any teacher, we can see that they simply don't deal with 
it at all unless forced.

Why do something different from the base Unicode standard wrt encodings, 
and why impose additional burdens of filtering out certain characters?
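
To make the filtering burden concrete, here is a minimal Python sketch of
the per-character check, assuming the Char production from the XML 1.0
Recommendation. The names XML10_CHAR_RANGES, is_xml10_char and
strip_illegal_xml10 are hypothetical, chosen only for illustration; they
do not come from any library.

```python
# Code point ranges permitted by the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML10_CHAR_RANGES = (
    (0x9, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml10_char(ch: str) -> bool:
    """Return True if the single character ch is legal in XML 1.0 content."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML10_CHAR_RANGES)


def strip_illegal_xml10(text: str) -> str:
    """Drop every character that XML 1.0 forbids (hypothetical helper)."""
    return "".join(ch for ch in text if is_xml10_char(ch))
```

Every program that writes XML from arbitrary Unicode text needs some
equivalent of this pass, and any scheme that lets each document pick its
own subset would turn the fixed table above into per-document state.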

I'll ask again - what exactly is this meant to achieve?  Because it's 
hard enough right now to cope with all the various text representations 
out there.  There's no reason to make more.

On Tuesday, October 22, 2002, at 12:12 PM, Gustaf Liljegren wrote:

> tblanchard@mac.com wrote:
>> Unicode has arrived to kill off all of the short-sighted legacy
>> character encodings, and while Unicode has a *lot* of problems for
>> Asian languages (Han unification was *NOT* a good idea), it remains
>> infinitely better than the Tower of Babel we had before.
> I think your definition of the word 'encoding' is a little too broad.
> This is not about encodings. It's about giving users the freedom to
> choose which parts of Unicode they want. How to specify the method to
> convert character numbers to binary form is another problem, and I'm
> happy with the solution XML provides.
>
> This way, those who need to use characters in the intervals forbidden
> in XML 1.0 would have the freedom to use them, while the rest of us
> are left unaffected.
>
> If it were up to me, there would be no change in XML. But if a new
> version is unavoidable and I need to pick one, I'd rather go for a
> more flexible solution, because I fear that 1.1 won't be the last
> version of its kind.
>
> I admit there may be better ways than using a PI. Maybe the
> information about legal characters is better specified at the schema
> level rather than for each document, but these things are just
> details. The main point is that specifying which characters are legal
> in the specification itself may be too limiting. Perhaps the only real
> requirement should be that the character numbers have the meaning
> specified in Unicode?
>
> Gustaf

