XML.org
OASIS Mailing List Archives

 


RE: There is a serious amount of character encoding conversions occurring inside our computers and on the Web

-----------

Some mighty smart fellows figured this character encoding stuff out long ago and now it is buried so deep in the fabric of our computers and the Web that we are completely oblivious to all the encoding conversions that are happening.

 

/Roger

----------

 

We are only completely oblivious when it works.

Which is really rare. I am amazed you had such success.

 

 

Try something interesting in your tests. Try a Unicode character outside the Basic Multilingual Plane, that is, one with a code point above 0xFFFF.

Like this one: dec 110593, hex 1B001, HTML &#110593; (renders as 𛀁)

http://rishida.net/rishida/c/Kana%20Supplement/1B001.png 

http://rishida.net/scripts/uniview/?codepoints=1B001
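To make the suggestion concrete, here is a minimal Python sketch that prints the usual representations of U+1B001. The helper name `describe` is made up for illustration and is not from the original message. The point it shows is that this character needs four bytes in UTF-8 and two 16-bit code units (a surrogate pair) in UTF-16, which is exactly where code that assumes one character fits in 16 bits tends to break.

```python
def describe(ch: str) -> dict:
    """Return common representations of a single Unicode character."""
    cp = ord(ch)
    utf16 = ch.encode("utf-16-be")
    # Split the UTF-16 bytes into 16-bit code units.
    # Two units for one character means a surrogate pair.
    units = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]
    return {
        "decimal": cp,
        "hex": f"{cp:X}",
        "html": f"&#{cp};",
        "utf8_bytes": ch.encode("utf-8").hex().upper(),
        "utf16_units": units,
    }


info = describe(chr(0x1B001))
for name, value in info.items():
    print(f"{name}: {value}")
```

If a test document containing this character survives every stage of your pipeline with the same code point, that is a reasonable first sign that nothing along the way is silently assuming 16-bit characters.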

 

 

Tracking the path of conversions end to end, and validating that your application is completely correct from authoring through storage through search and retrieval, is amazingly difficult. IMHO it's a skill that far too few programmers have, and many do not even recognize that they lack it.
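As a rough illustration of what such a check might look like at a single stage, the Python sketch below round-trips a sample string through several encodings and reports which ones lose data. The function name `failed_round_trips`, the sample string, and the list of encodings are assumptions chosen for the example; a real pipeline would need a check like this at every hop (editor, transport, database, index), not just one.

```python
def failed_round_trips(text: str, encodings: list) -> list:
    """Return the encodings that cannot carry `text` unchanged."""
    failures = []
    for enc in encodings:
        try:
            survived = text.encode(enc).decode(enc) == text
        except UnicodeError:
            # The encoding has no representation for some character.
            survived = False
        if not survived:
            failures.append(enc)
    return failures


sample = "kana: " + chr(0x1B001)
failed = failed_round_trips(
    sample, ["utf-8", "utf-16", "utf-32", "latin-1", "ascii"]
)
print("lossy encodings:", failed)

# A mismatched decode raises no error at all: the UTF-8 bytes are read
# as Latin-1 and the one supplementary character becomes four characters.
garbled = sample.encode("utf-8").decode("latin-1")
print("silently garbled:", garbled != sample, len(sample), len(garbled))
```

The second half is the more worrying case: a wrong decoder often succeeds without complaint, so only a comparison against the original text reveals the damage.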

 

 

----------------------------------------

David A. Lee

dlee@calldei.com

http://www.xmlsh.org

 

 

 




Copyright 1993-2007 XML.org. This site is hosted by OASIS