RE: There is a serious amount of character encoding conversions occurring

-----------

Some mighty smart fellows figured this character encoding stuff out long ago and now it is buried so deep in the fabric of our computers and the Web that we are completely oblivious to all the encoding conversions that are happening.

/Roger

----------

We are only completely oblivious when it works.

Which is really rare. I am amazed you had such success.

Try something interesting in your tests. Try a Unicode character above code point 0xFFFF, i.e. outside the Basic Multilingual Plane.

Like this one: dec 110593, hex 1B001, HTML 𛀁

http://rishida.net/scripts/uniview/?codepoints=1B001
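
To make it concrete why this character is a good test case, here is a minimal Python 3 sketch of how U+1B001 looks in a few common encodings. The variable names are just for illustration; the point is that it takes four bytes in UTF-8 and a surrogate pair in UTF-16, so any step that assumes one 16-bit unit per character will mangle it.

```python
# U+1B001 lies above U+FFFF, so it is outside the Basic Multilingual Plane.
ch = chr(0x1B001)

decimal = ord(ch)                    # the code point as a decimal number
html_ref = f"&#{decimal};"           # decimal HTML numeric character reference
utf8 = ch.encode("utf-8")            # four bytes in UTF-8
utf16 = ch.encode("utf-16-be")       # two 16-bit code units (a surrogate pair)

print(decimal)        # 110593
print(html_ref)       # &#110593;
print(utf8.hex())     # f09b8081
print(utf16.hex())    # d82cdc01
print(len(ch))        # 1, because Python 3 strings count code points
```

Languages whose strings are sequences of UTF-16 code units (Java and JavaScript, for example) report a length of 2 for the same character, which is one place where things quietly go wrong.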

Tracking the path of conversions end to end, and validating that your application is completely correct from authoring through storage to search and retrieval, is amazingly difficult. IMHO it's a skill that far too few programmers have, or even recognize that they lack.
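
As a very rough sketch of what such a check could look like, the Python 3 snippet below pushes a test string through a list of encodings and reports the first one that cannot round-trip it. The helper name first_lossy_stage and the encoding lists are hypothetical stand-ins; a real pipeline has editors, databases, HTTP headers and search indexes in the way, and each of those needs its own test.

```python
def first_lossy_stage(text, encodings):
    """Return the first encoding that cannot round-trip text, or None."""
    for enc in encodings:
        try:
            if text.encode(enc).decode(enc) != text:
                return enc
        except UnicodeError:
            # The encoding cannot represent some character in the text.
            return enc
    return None


# Test data: an accented Latin letter plus the non-BMP character U+1B001.
sample = "caf\u00e9 " + chr(0x1B001)

all_unicode = first_lossy_stage(sample, ["utf-8", "utf-16", "utf-32"])
with_legacy = first_lossy_stage(sample, ["utf-8", "latin-1"])

print(all_unicode)   # None
print(with_legacy)   # latin-1
```

Note that this only catches conversions that fail or change the text outright. It says nothing about a stage that stores the right bytes under the wrong label, which is a separate class of bug.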

----------------------------------------

David A. Lee

dlee@calldei.com

http://www.xmlsh.org