OASIS Mailing List Archives



   Re: [xml-dev] Specifying a Unicode subset

Gustaf Liljegren scripsit:

> This way, those who need to use characters in the intervals forbidden in
> XML 1.0 would have the freedom to use them, while the rest of us are left
> unaffected.

The 65 ISO controls that are forbidden are (all but one) representable
as character references.  The point of forbidding them is to improve
character-encoding detection in a world where most documents are
not Unicode-encoded.  For example, because U+0080 is forbidden, a
Windows-1252 document mislabeled as Latin-1 will cough on the Euro
sign: its byte 0x80 will incorrectly be mapped to U+0080 instead of
the correct U+20AC.  This is a Good Thing.
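
To make the Euro example concrete, here is a rough Python sketch of
the mislabeling case.  The helper names (is_forbidden_control,
find_forbidden_controls) are hypothetical, and the check assumes only
the rule discussed above: C1 controls (and DEL) may not appear
literally, with U+0085 (NEL) treated as the one exception.  It is an
illustration, not a conforming XML character checker.

```python
# Byte 0x80 is the Euro sign in Windows-1252, but decodes to the
# C1 control character U+0080 when the same bytes are read as Latin-1.
raw = b"price: 5 \x80"

as_cp1252 = raw.decode("cp1252")    # correct label: yields U+20AC
as_latin1 = raw.decode("latin-1")   # wrong label: yields U+0080


def is_forbidden_control(ch):
    """Hypothetical helper: True for DEL and the C1 controls
    (U+007F-U+009F), except U+0085 (NEL), which is assumed to be
    allowed literally."""
    cp = ord(ch)
    return 0x7F <= cp <= 0x9F and cp != 0x85


def find_forbidden_controls(text):
    """Return (index, 'U+XXXX') pairs for each forbidden control found."""
    return [(i, "U+%04X" % ord(ch))
            for i, ch in enumerate(text)
            if is_forbidden_control(ch)]


if __name__ == "__main__":
    print(find_forbidden_controls(as_cp1252))
    print(find_forbidden_controls(as_latin1))
```

Decoded with the right label, the text contains nothing forbidden;
decoded as Latin-1, the stray U+0080 is caught at once, which is the
signal that the encoding declaration is wrong.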

> If I'd decide, there would be no change in XML. But if a new version is
> unavoidable and I need to pick one, I'd rather go for a more flexible
> solution, because I fear that 1.1 won't be the last version of its kind.

I believe that 1.1 will be the last release related purely to characters.
If there is an XML 2.0, it will be about entirely different issues.

Some people open all the Windows;       John Cowan
wise wives welcome the spring           jcowan@reutershealth.com
by moving the Unix.                     http://www.reutershealth.com
  --ad for Unix Book Units (U.K.)       http://www.ccil.org/~cowan
        (see http://cm.bell-labs.com/cm/cs/who/dmr/unix3image.gif)



Copyright 2001 XML.org. This site is hosted by OASIS