Re: [xml-dev] Correct xml:lang value for Pinyin Chinese vs Simplified Chinese
- From: Rick Jelliffe <rjelliffe@allette.com.au>
- To: Lech Rzedzicki <xchaotic@gmail.com>
- Date: Tue, 28 Feb 2012 03:25:34 +1100
Are you sure you have the right terms here? Pinyin is not pidgin. And
it usually has no accents. (If it has accents, in particular macrons,
it may not be standard Pinyin, which is not to say that it might not
be an old or extended Pinyin.)
Language codes are in flux: the three letter codes and the two letter
codes have different approaches. The two letter codes plus regional
variant may still be safest. So first you need to determine the
region: is your simplified text from PRC or Singapore?
Assuming it is from PRC, then the language code zh-CN should be enough AFAIK.
Note that there is (or should be) no need to specify anything about
the script if you are just marking up existing text. @xml:lang
specifies the language, and the script only indirectly because a
language+region often has a standard or characteristic orthography:
the general script being used is obvious from the characters
themselves.
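
To illustrate the point that the script is recoverable from the characters themselves, here is a rough Python sketch. The function name guess_script and the labels "Hani"/"Latn" are my own choices for illustration, and the heuristic (looking at Unicode character names) is an assumption, not a standard algorithm:

```python
import unicodedata

def guess_script(text):
    """Return a rough script label ("Hani", "Latn" or "unknown").

    Heuristic only: counts letters whose Unicode name starts with
    "CJK" or "LATIN" and returns whichever is more frequent.
    """
    counts = {"Hani": 0, "Latn": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces, combining marks
        name = unicodedata.name(ch, "")
        if name.startswith("CJK"):
            counts["Hani"] += 1
        elif name.startswith("LATIN"):
            counts["Latn"] += 1
    if counts["Hani"] == 0 and counts["Latn"] == 0:
        return "unknown"
    return "Hani" if counts["Hani"] >= counts["Latn"] else "Latn"
```

A consumer that cares about the script can do something like this without any help from @xml:lang, which is why the plain language+region tag is usually sufficient.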
So you could use xml:lang="zh-CN" for all the three cases you
mention. If you wanted to give more of a hint, you could try
xml:lang="zh-CN-pinyin" or "zh-Latn-CN-pinyin" for the standard
pinyin, and xml:lang="zh-CN-pinyin-adhoc" or "zh-Latn-CN-adhoc" for
the non-standard one (where "adhoc" is some phrase you pick to
indicate an extended pinyin or mystery format.)
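
If you want to sanity-check candidate values like these, the following Python sketch tests whether a tag is syntactically well-formed. It is a simplified reading of the langtag grammar (language, optional script, optional region, variants, private use) and deliberately ignores extlang, extensions and grandfathered tags. It says nothing about whether a subtag is actually registered with IANA, so a made-up variant such as "adhoc" can pass the syntax check and still not be valid. The names LANGTAG and parse_tag are hypothetical:

```python
import re

# Simplified langtag pattern: assumption, not the full grammar.
LANGTAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"                 # e.g. zh
    r"(?:-(?P<script>[a-z]{4}))?"                # e.g. Latn
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"       # e.g. CN
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)"
    r"(?:-x(?P<private>(?:-[a-z0-9]{1,8})+))?$"  # e.g. x-adhoc
    , re.IGNORECASE,
)

def parse_tag(tag):
    """Split a tag into subtags, or return None if it is not well-formed."""
    m = LANGTAG.match(tag)
    if m is None:
        return None
    parts = m.groupdict()
    parts["variants"] = [v for v in (parts["variants"] or "").split("-") if v]
    parts["private"] = [p for p in (parts["private"] or "").split("-") if p]
    return parts

for candidate in ["zh-CN", "zh-Latn-CN-pinyin", "zh-Latn-CN-x-adhoc", "zh_CN"]:
    print(candidate, parse_tag(candidate))
```

The third candidate shows one possible alternative for the "adhoc" label: putting it after an "x" singleton makes it a private-use subtag, which needs no registration.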
(I suspect the transliterated Chinese with accented roman characters
would not be a legitimate zh-Latn-CN (I'd expect John Cowan to be on
top of this) but if it were, then that would probably be the best for
the non-standard transliteration.)
If you want to mark up your text so that screen readers can read it,
then find the website for the screen reader, contact the developers,
and ask them. I doubt if the non-standard pinyin would have specialist
readers that can understand it in any case (though IIRC there was a
reader that understood 1,2,3,... tone digits in with pinyin or
bopomofo.)
For more info, see
http://www.alvestrand.no/pipermail/ietf-languages/2008-September/008322.html
You could track down the current IANA registrations for
http://www.ietf.org/rfc/rfc4646.txt too, I guess.
Cheers
Rick Jelliffe