RE: There is a serious amount of character encoding conversions occurring inside our computers and on the Web

David Lee wrote:

 

    To be able to track end to end the path of conversions
    and validate that your application from authoring
    through to storage through to search and retrieval is
    completely correct is amazingly difficult. IMHO it’s a
    skill far too few programmers have, or even recognize
    that they do not have.

 

Fascinating!

 

You are writing about character encoding conversions as text moves from point to point to point.

 

Is there a parallel with markup? Are there markup conversions as XML moves from point to point to point?

 

Are there lessons learned in the character encoding community that could be applied to the XML community?

 

/Roger

 

From: David Lee [mailto:dlee@calldei.com]
Sent: Friday, December 28, 2012 9:36 AM
To: Costello, Roger L.; xml-dev@lists.xml.org
Subject: RE: There is a serious amount of character encoding conversions occurring inside our computers and on the Web

 

-----------

Some mighty smart fellows figured this character encoding stuff out long ago and now it is buried so deep in the fabric of our computers and the Web that we are completely oblivious to all the encoding conversions that are happening.

 

/Roger

----------

 

We are only completely oblivious when it works.

Which is really rare. I am amazed you had such success.

 

 

Try something interesting in your tests. Try a Unicode character with a code point above 0xFFFF, that is, outside the Basic Multilingual Plane.

Like this one:    dec 110593    hex  1B001  HTML 𛀁 

http://rishida.net/rishida/c/Kana%20Supplement/1B001.png 

http://rishida.net/scripts/uniview/?codepoints=1B001
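As a rough illustration of why this character makes a good test case, here is a minimal Python sketch. It shows that U+1B001 needs four bytes in UTF-8 and a surrogate pair in UTF-16, and it checks whether the character survives a chain of encode/decode hops. The helper names (utf16_code_units, survives) and the example hops are assumptions made up for this sketch; they do not describe any particular application's pipeline.

```python
# Sketch only: helper names and the hop lists are illustrative assumptions.

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def survives(text, hops):
    """Check whether text comes back unchanged after a chain of conversions.

    Each hop is a pair (encode_as, decode_as): the text is encoded with the
    first codec and decoded with the second, simulating one system writing
    bytes and the next system reading them.
    """
    current = text
    for encode_as, decode_as in hops:
        try:
            current = current.encode(encode_as).decode(decode_as)
        except UnicodeError:
            # The character cannot be represented, or the bytes are invalid.
            return False
    return current == text


ch = chr(0x1B001)                 # decimal 110593, outside the BMP

utf8_bytes = list(ch.encode("utf-8"))
# [0xF0, 0x9B, 0x80, 0x81]: four bytes in UTF-8

units = utf16_code_units(ch)
# [0xD82C, 0xDC01]: a surrogate pair in UTF-16

ok = survives(ch, [("utf-8", "utf-8"), ("utf-16", "utf-16")])
# True: every hop reads the bytes with the codec that wrote them

mismatch = survives(ch, [("utf-8", "latin-1")])
# False: bytes written as UTF-8 but read as Latin-1 become four junk characters

lossy = survives(ch, [("ascii", "ascii")])
# False: the character cannot be encoded in ASCII at all
```

A test along these lines, run over each hop of a real pipeline (authoring, storage, search, retrieval), tends to expose code that assumes one character is one 16-bit unit.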

 

 

To be able to track end to end the path of conversions and validate that your application from authoring through to storage through to search and retrieval is completely correct is amazingly difficult. IMHO it's a skill far too few programmers have, or even recognize that they do not have.

 

 

----------------------------------------

David A. Lee

dlee@calldei.com

http://www.xmlsh.org

 

 

 


