OASIS Mailing List Archives

Re: [xml-dev] Binary versus Text

On Sun, 24 Nov 2013 15:49:24 +0000, David Lee <dlee@calldei.com> wrote:

>I had a recent argument/discussion with a co-worker about if it 
>is accurate (or useful) to consider UTF8 "Text" ...
>From the perspective of the archaic (but still implemented) 
>"text mode" file open modifier ... valid UTF8 "Text" was not 
>readable because the control-Z was interpreted as EOF and data 
>would be truncated.
>Yet Notepad could read these "Text Files" just fine.

Of course it can, because there is no such problem.  Whoever
thought there was never looked at the UTF-8 encoding rules.
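*All* ASCII characters (those below 0x80) are encoded as
themselves; this includes all control characters.  Every byte
of a multi-byte UTF-8 sequence has the top bit set, so read
as single bytes they fall in the range 0x80-0xFF, *never*
below 0x80.  A 0x1A byte can therefore appear only where the
text really contains a control-Z character.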
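
A minimal sketch of that claim in Python, for anyone who wants
to check it.  The helper name and the sample string are
illustrative assumptions, not anything from the original
discussion:

```python
def utf8_bytes_below_0x80(text: str) -> list[int]:
    """Collect bytes below 0x80 produced by non-ASCII characters.

    If the encoding rules hold, this list is always empty.
    """
    stray = []
    for ch in text:
        if ord(ch) >= 0x80:  # only look at non-ASCII characters
            encoded = ch.encode("utf-8")
            stray.extend(b for b in encoded if b < 0x80)
    return stray


sample = "naïve café, 日本語, €"

# No non-ASCII character contributes a byte in the ASCII range.
assert utf8_bytes_below_0x80(sample) == []

# So no control-Z (0x1A) byte appears in the encoded sample.
assert 0x1A not in sample.encode("utf-8")

# An actual control-Z character is encoded as itself.
assert "\x1a".encode("utf-8") == b"\x1a"
```

A 0x1A byte in UTF-8 data therefore means the text itself
contains U+001A, not that some other character happened to
encode to it.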

Jeremy H. Griffith <jhgriffith@gmail.com>

Copyright 1993-2007 XML.org. This site is hosted by OASIS