
RE: Language declaration question



Thanks for the reply.

> -----Original Message-----
> From: Christopher R. Maden [mailto:crism@maden.org]
> At 04:19 31-08-2001, Hewko, Doug wrote:
> >Out of curiosity, when you specify an encoding value in an XML document,
> >where does the XML processor obtain the encoding values from? Would the
> >processor return an error, automatically download the encoding value, or
> >does it come already with every known encoding value? (I have difficulty
> >believing the latter.)
> >
> >For example:
> >
> >1) I am programming XML for your MS IE 5.5 browser (assume it supports the
> >ISO-8859-6 values). I happen to specify an Arabic language coding,
> >ISO-8859-6. What would someone get if they have the basic American version
> >of Windows?
> 
> You're conflating two issues: encoding and language.
> 
> To support ISO-8859-6, MSXML needs to know how the encoded characters in
> 8859-6 map into Unicode.  This is easily done with a mapping table.
> *Displaying* Arabic characters is something different, and requires
> that the necessary fonts and support be installed.  The parser has it
> pretty easy here.

When I looked at a chart of the available encoding values (e.g. "UTF-8:
Compressed Unicode", "ISO-8859-2: Latin-2; Eastern European", "EUC-JP:
Japanese, Unix", etc.), they all implied some language. (UTF-8 is primarily
the English characters.) I thought the two were synonymous, with "encoding"
just being the language that the document was typed in. That is why I got
confused.

Just to make sure I understand: all an encoding does is translate the
machine-coded values, using a table, into a standard "master" language that
the processor can understand? Do all processors use Unicode? (i.e. would a
Chinese version of MS IE 5.5 use the same Unicode that I would use?)
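
To make the "mapping table" idea concrete, here is a minimal sketch in
Python. It is not anything MSXML or IE actually runs, and the sample bytes
are my own illustration; it only uses the ISO-8859-6 and ISO-8859-2 tables
that ship with Python to show bytes being looked up and turned into Unicode
code points.

```python
# Four bytes as they might sit in a file declared encoding="ISO-8859-6".
# (Illustrative sample data, not taken from the thread.)
raw = bytes([0xD3, 0xE4, 0xC7, 0xE5])

# Decoding is the table lookup: each byte maps to one Unicode code point.
text = raw.decode("iso-8859-6")
code_points = [f"U+{ord(ch):04X}" for ch in text]

# The same bytes under a different declared encoding map to different
# characters, which is why the declaration matters to the parser.
as_latin2 = raw.decode("iso-8859-2")

# Once decoded, the text is plain Unicode and can be written back out in
# any encoding that covers those characters, for example UTF-8.
utf8_bytes = text.encode("utf-8")

print(code_points)               # Arabic-block code points
print(as_latin2 == text)         # different table, different characters
print(len(raw), len(utf8_bytes))
```

Nothing in the sketch depends on fonts or on the language version of the
operating system: the lookup produces the same code points everywhere, and
only displaying them needs Arabic font support.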