xml-dev - RE: [xml-dev] Question about UTF-8

To: "'Tim Bray'" <tbray@textuality.com>, Gustaf Liljegren <gustaf.liljegren@xml.se>
Subject: RE: [xml-dev] Question about UTF-8
From: Rob McDougall <Rob.McDougall@adobe.com>
Date: Fri, 29 Aug 2003 11:24:59 -0400
Cc: xml-dev@lists.xml.org

I think Tim is being unintentionally misleading.  If you stick to ISO 8859-1
then all the European characters will look fine in an editor and you can
still include non-European characters using numeric character references.

You don't need to count on never seeing non-European characters in the data.
You just won't be able to see them as glyphs in an ISO 8859-1 encoding, and
they will take up a lot more space if they do appear.  So before choosing
ISO 8859-1 as your encoding, you should feel comfortable that you won't see
many non-European characters in your data.
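
A minimal sketch of the trade-off in Python, assuming the standard codecs and
the built-in "xmlcharrefreplace" error handler; the function name and the
sample string are made up for illustration. Characters outside ISO 8859-1
come out as numeric character references, which cost more bytes than the
same characters in UTF-8.

```python
def to_latin1_xml(text: str) -> bytes:
    """Encode text as ISO 8859-1, writing any character the encoding
    cannot represent as a decimal numeric character reference."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# Hypothetical sample: a European word plus two CJK characters
# (U+65E5 and U+672C).
sample = "Malm\u00f6 \u65e5\u672c"

latin1_bytes = to_latin1_xml(sample)
utf8_bytes = sample.encode("utf-8")

print(latin1_bytes)       # b'Malm\xf6 &#26085;&#26412;'
print(len(latin1_bytes))  # 22
print(len(utf8_bytes))    # 13
```

The "ö" stays a single byte in ISO 8859-1 (two in UTF-8), while each CJK
character takes eight bytes as a reference against three in UTF-8. Note that
this handler only produces correct XML for character data; markup and the
encoding declaration still have to be written separately.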

Regards,
Rob
________________________________________ 
Rob McDougall
Sr. Computer Scientist
Adobe Systems Incorporated
Phone: +1 613.940.3708
Fax: +1 613.594.8886


-----Original Message-----
From: Tim Bray [mailto:tbray@textuality.com] 
Sent: Thursday, August 28, 2003 1:02 PM
To: Gustaf Liljegren
Cc: xml-dev@lists.xml.org
Subject: Re: [xml-dev] Question about UTF-8

[snip]

> Many users who see 'Ã¤' when they open a UTF-8 encoded XML document in a
> text editor, prefer to use ISO 8859-1 to avoid this effect.

That only works until you need to use a character that isn't in 8859-1, 
such as those used by about two thirds of the world's population.
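
Both effects in this exchange can be reproduced with a short Python sketch,
assuming only the standard codecs; the helper names are hypothetical. The
first function mimics an editor that reads UTF-8 bytes as ISO 8859-1, the
second checks whether a string can be written directly in ISO 8859-1 at all.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as ISO 8859-1, the way an
    editor with the wrong default encoding would display them."""
    return text.encode("utf-8").decode("iso-8859-1")


def fits_in_latin1(text: str) -> bool:
    """Return True if every character exists in ISO 8859-1."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


print(misread_as_latin1("\u00e4"))  # U+00E4 (a with diaeresis) shows as 'Ã¤'
print(fits_in_latin1("\u00e4"))     # True
print(fits_in_latin1("\u65e5"))     # False: U+65E5 is not in ISO 8859-1
```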

> Maybe the answer is to stay in ISO 8859-1 (or whatever default encoding the
> editor has), but I was hoping it was possible to recommend using UTF-8 all
> the time (for European scripts).

The notion that you can count on never seeing non-European characters is 
a recipe for disaster in today's world.  Good solutions are: (a) as you 
suggest, use UTF-8 all the time, or (b) use XML for interchange.

-- 
Cheers, Tim Bray
         (ongoing fragmented essay: http://www.tbray.org/ongoing/)
