Re: [xml-dev] Did you know that the lowercase of the Kelvin Sign(K) is the Latin small letter k? Do you know the impact of that?
- From: Michael Kay <mike@saxonica.com>
- To: xml-dev@lists.xml.org
- Date: Mon, 28 Jan 2013 22:45:10 +0000
On 28/01/2013 19:21, Costello, Roger L. wrote:
> Hi Folks,
>
> The Kelvin Sign (K) is high up in the Unicode code space, it is codepoint U+212A. That's way up there.
>
> Compare with the Latin capital letter K, its codepoint is U+004B. That's way down there.
>
> Interestingly, the lowercase of the Kelvin Sign is the Latin small letter k:
>
> lower-case(K) = 'k'
>
> "So what's the big deal?" you ask. Actually, it's a really big deal. Let me explain.
>
> Suppose you want to enforce this rule in your XML instance documents:
>
> The value of the <Name> element must
> be 'Lockhart' (lowercase, uppercase, any
> case).
I don't know what you mean by your rule. What are your rules for
case-blind equivalence, if they aren't the Unicode rule? Do you know
better than the collective wisdom of the Unicode consortium what
lowercase and uppercase mean? If you don't like the Unicode rules for
this, you must propose and justify an alternative.
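
To make the choice concrete, here is a rough sketch in Python rather than XPath. The function names and the test string are invented for illustration, and Python's str.lower() is only assumed to behave like lower-case() for this one character. It checks the 'Lockhart' rule under two candidate definitions of "any case": the Unicode mapping, and a mapping restricted to A-Z.

```python
import string

# Hypothetical ASCII-only lowercase table: maps A-Z to a-z and nothing else.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def matches_unicode(value, target="lockhart"):
    """Case-blind match using Python's Unicode-aware str.lower()."""
    return value.lower() == target


def matches_ascii_only(value, target="lockhart"):
    """Case-blind match that only folds the 26 ASCII capital letters."""
    return value.translate(_ASCII_LOWER) == target


if __name__ == "__main__":
    name = "LOC\u212AHART"  # the fourth letter is KELVIN SIGN, not Latin K
    print(matches_unicode(name))     # accepted under the Unicode rule
    print(matches_ascii_only(name))  # rejected under the ASCII-only rule
```

Neither answer is wrong in itself; the two functions simply implement different rules, which is why the rule has to be stated before it can be enforced.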
> Question: Are there other characters similar to the Kelvin Sign? That is, are there other characters that are outside [A-Za-z] but when lower-case() or upper-case() is applied to them they are inside [A-Za-z]?
>
Not very many, as it happens. The only examples I found in Unicode 4.0.0
(I haven't checked later versions) are:
<char code="0130" name="LATIN CAPITAL LETTER I WITH DOT ABOVE"/>
<char code="0131" name="LATIN SMALL LETTER DOTLESS I"/>
<char code="017F" name="LATIN SMALL LETTER LONG S"/>
<char code="212A" name="KELVIN SIGN"/>
The dotted/dotless I problem affects Turkish in particular, where
dotless small i is the normal lower-case counterpart to dotless capital
I. The "long S" is of course simply an archaic variant of the modern "s".
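
For anyone who wants to repeat the check against a newer Unicode version, a scan along the following lines should do. This is a sketch in Python, not the method used to produce the list above; the function name is invented, and the result depends on the Unicode data bundled with the interpreter. One caveat: Python applies the full case mappings, so U+0130 lowercases to 'i' followed by U+0307 COMBINING DOT ABOVE and is not reported by this single-letter test.

```python
import string
import unicodedata

ASCII_LETTERS = set(string.ascii_letters)


def find_ascii_case_neighbours():
    """Return {code point: [(operation, ASCII letter), ...]} for characters
    outside [A-Za-z] whose lower() or upper() is exactly one ASCII letter."""
    found = {}
    for cp in range(0x110000):
        ch = chr(cp)
        if ch in ASCII_LETTERS:
            continue
        # A multi-character mapping is never a member of ASCII_LETTERS,
        # so only single-letter results are kept.
        hits = [(op, mapped)
                for op, mapped in (("lower", ch.lower()), ("upper", ch.upper()))
                if mapped in ASCII_LETTERS]
        if hits:
            found[cp] = hits
    return found


if __name__ == "__main__":
    for cp, hits in sorted(find_ascii_case_neighbours().items()):
        name = unicodedata.name(chr(cp), "<unnamed>")
        for op, letter in hits:
            print(f"U+{cp:04X} {name}: {op}() gives {letter!r}")
```

Comparing the first character of the mapped string instead of the whole string would also catch U+0130, at the cost of reporting ligatures and other characters that expand to several letters.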
Michael Kay
Saxonica