   Re: [xml-dev] Question about UTF-8

At 07:18 2003-08-28 -0700, Tim Bray wrote:

>> Is there a safe way for a non-XML-aware text editor to find out that this
>> file is using UTF-8?

>It is not ambiguous at all what the encoding must be; there's no 
>encoding declaration, so it must be UTF-8.

In an XML-aware editor, yes. But the question is about general
('non-XML-aware') text editors. A general editor has no idea of the
encoding detection mechanism in XML, so I wonder how it knows that the
octets C3 A4 should be written 'ä' and not 'Ã¤' (or something else).
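
To make the ambiguity concrete, here is a minimal Python sketch (the variable names are only illustrative) that decodes the same two octets under both encodings. Nothing in the bytes themselves says which reading is intended.

```python
# The two octets from the message, as raw bytes.
octets = bytes([0xC3, 0xA4])

# Read as UTF-8, the pair is a single character, U+00E4.
as_utf8 = octets.decode("utf-8")

# Read as ISO 8859-1, each octet is its own character: U+00C3, U+00A4.
as_latin1 = octets.decode("iso-8859-1")

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```

Both decodes succeed without error, which is why an editor with an ISO 8859-1 default shows two characters instead of one.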

Many users who see 'Ã¤' when they open a UTF-8 encoded XML document in a
text editor prefer to use ISO 8859-1 to avoid this effect.

Maybe the answer is to stay in ISO 8859-1 (or whatever default encoding the
editor has), but I was hoping it was possible to recommend using UTF-8 all
the time (for European scripts).
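
One heuristic a general editor could apply is sketched below in Python. It is only a guess under stated assumptions, not a safe detection method: `guess_encoding` and its `fallback` parameter are hypothetical names, and the idea is simply to accept UTF-8 when the file starts with a UTF-8 byte order mark or when all bytes form valid UTF-8 sequences, and otherwise to fall back to the editor's default.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Hypothetical helper: guess 'utf-8' or the fallback for raw file bytes."""
    # A UTF-8 byte order mark (EF BB BF) at the start is a strong hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    try:
        # Strict decoding rejects any byte sequence that is not valid UTF-8.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return fallback
    # Valid UTF-8 (this includes pure ASCII, which is a subset of UTF-8).
    return "utf-8"


print(guess_encoding(b"<name>J\xc3\xa4rvi</name>"))  # valid UTF-8 sequence
print(guess_encoding(b"<name>J\xe4rvi</name>"))      # lone E4 is not valid UTF-8
```

The guess can be wrong: an ISO 8859-1 file that really contains the two characters 'Ã¤' has exactly the octets C3 A4 and would be reported as UTF-8. Such sequences are rare in ordinary European text, but the check remains a heuristic.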


