>I had a recent argument/discussion with a co-worker about if it
>is accurate (or useful) to consider UTF8 "Text" ...
>
>From the perspective of the archaic (but still implemented)
>"text mode" file open modifier ... valid UTF8 "Text" was not
>readable because the control-Z was interpreted as EOF and data
>would be truncated.
>
>Yet Notepad could read these "Text Files" just fine.
Of course it can, because there is no such problem. Whoever
thought there was never looked at the UTF-8 encoding rules.
*All* ASCII characters (those below 0x80) are encoded as
themselves; this includes all control characters. Every byte
of a multi-byte UTF-8 sequence has the top bit set, so it is
in the range 0x80-0xFF, *never* below 0x80. A 0x1A byte
(control-Z) can therefore turn up in UTF-8 data only if the
text really contains a control-Z character.
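
If you would rather check this than take my word for it, here
is a rough Python sketch. The helper name low_bytes_in_multibyte
is made up for this post, and it relies on Python's built-in
"utf-8" codec following the standard encoding rules. It encodes
every code point in a range and collects any whose multi-byte
encoding contains a byte below 0x80.

```python
def low_bytes_in_multibyte(start=0x80, stop=0x110000):
    """Return code points in [start, stop) whose multi-byte UTF-8
    encoding contains a byte below 0x80 (hypothetical helper)."""
    offenders = []
    for cp in range(start, stop):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not encodable in UTF-8
        encoded = chr(cp).encode("utf-8")
        # Only multi-byte sequences matter; ASCII encodes as itself.
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            offenders.append(cp)
    return offenders


# Control-Z (0x1A) is plain ASCII, so it encodes as the single byte 0x1A.
ctrl_z = "\x1a".encode("utf-8")

# A sample of non-ASCII text; none of its bytes can be 0x1A.
sample = "na\u00efve caf\u00e9 \u20ac".encode("utf-8")
```

With these definitions, low_bytes_in_multibyte() should come
back empty for the whole range, ctrl_z should be the one byte
0x1A, and 0x1A should not occur anywhere in sample.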
--
Jeremy H. Griffith <jhgriffith@gmail.com>
http://udoc2go.com