Re: [xml-dev] I understand "codepoints" ... hurrah!
- From: David Carlisle <davidc@nag.co.uk>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Tue, 27 Nov 2012 14:00:23 +0000
On 27/11/2012 13:42, Costello, Roger L. wrote:
> Recall that an encoding is a sequence of zeroes and ones.
> ...
> Through simple arithmetic a binary number can be converted to a
> decimal or a hexadecimal value. So, rather than referring to an
> encoding by its sequence of bits or by a binary number, it can be
> referred to by a decimal or hexadecimal value
I think you mixed (several) levels there:
Unicode code points are _numbers_ like 1 or 917999; they are not bit
patterns in computer memory.
The Unicode character VARIATION SELECTOR-256 has code point 917999
irrespective of how the character or that number is stored on a computer.
There are several ways of storing numbers on digital computers, and in
particular there are several encodings of Unicode code points as byte
sequences (UTF-8, UTF-16LE, UTF-16BE, etc.). The in-memory bit patterns
depend on the encoding being used; the code point does not.
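To make the distinction concrete, here is a minimal Python sketch (not part of the original message; the helper name encoded_hex is made up for illustration). It takes the single code point 917999 and shows that the stored bytes differ under each encoding while the number itself stays the same.

```python
import unicodedata

# The code point is just a number; U+E01EF is the same value in hexadecimal.
code_point = 917999
char = chr(code_point)

def encoded_hex(text, encoding):
    """Hypothetical helper: bytes of `text` in `encoding`, as spaced hex."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))

print(unicodedata.name(char), ord(char))
for enc in ("utf-8", "utf-16-le", "utf-16-be"):
    # Different byte sequences, same code point.
    print(enc, encoded_hex(char, enc))
```

UTF-8 needs four bytes for this character, and both UTF-16 forms use a surrogate pair that differs only in byte order.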
Also, and only slightly related, there are of course alternative character
sets and encodings, such as Latin-1 and ISO 8859-1 respectively, but the
mapping between ISO 8859-1 and Unicode is conceptually a rather different
beast from the UTF encodings of Unicode as byte sequences, even though
both use the word "encoding". It's a general feature of this area that the
terms "encoding", "character", and "character set" are incompatibly
defined at almost every level, so you usually end up having to define
terms afresh in each document; there is no globally applicable definition.
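One way to see the two senses of "encoding" side by side is the following small Python sketch (an illustration added here, using only the standard codecs; the variable names are arbitrary). Decoding an ISO 8859-1 byte maps it to a character, which has a Unicode code point; a UTF then maps that code point back to a possibly different byte sequence.

```python
# One byte of ISO 8859-1 text: 0xE9 is LATIN SMALL LETTER E WITH ACUTE.
raw = b"\xe9"

# ISO 8859-1 -> Unicode: a mapping from a legacy byte to a character.
text = raw.decode("iso-8859-1")
code_point = ord(text)

# Unicode -> bytes: a UTF encoding of the same code point.
utf8 = text.encode("utf-8")

print(code_point, raw.hex(), utf8.hex())
```

For ISO 8859-1 the byte value and the code point happen to coincide (both 233 here), which is part of why the two levels are easy to confuse; the UTF-8 form of the same character is two bytes.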
David