XML.org
OASIS Mailing List Archives

 


RE: [xml-dev] Binary versus Text

> 

>Yet Notepad could read these "Text Files" just fine.

 

Of course it can, because there is no such problem.  

-----------

 

 

The problem is not with UTF8.

The problem is with the Windows "text mode" option to the POSIX-emulated file open functions, such as fopen(), _open(), etc.

This mode tries to help people write "POSIX compatible" code by filtering out Windows-isms and turning them into Unix-isms.

One nice thing this does is convert CR/LF to LF ... (oh so HARD ...)

One *horrid* thing it does is assume you're running on something like a DOS 1.0 filesystem, where Control-Z was actually used as EOF,

so when a read call such as fgetc(), used with this "text mode" on Windows, encounters a Control-Z, it returns -1 ... EOF.
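To make the two behaviours concrete, here is a rough sketch in Python that models what a text-mode read hands back to the caller. It is only an illustration of the semantics described above, not the CRT's actual implementation; the function name emulate_text_mode_read is made up for this example.

```python
CTRL_Z = b"\x1a"  # Control-Z (SUB, 0x1A), the old DOS end-of-file marker


def emulate_text_mode_read(raw: bytes) -> bytes:
    """Simplified model of what a Windows text-mode read returns.

    Hypothetical helper for illustration only; the real CRT works on
    buffers and has more edge cases than this.
    """
    # Everything from the first Control-Z onward is treated as end of file.
    visible, _, _ = raw.partition(CTRL_Z)
    # CR/LF pairs are collapsed to a bare LF.
    return visible.replace(b"\r\n", b"\n")


sample = b"line one\r\nline two\x1aline three\r\n"
print(emulate_text_mode_read(sample))  # b'line one\nline two'
```

Note that "line three" never reaches the caller, even though it is sitting right there in the file.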

 

The filesystem was upgraded some decades ago (I don't know when exactly ...), but by then there was a convention to actually

write Control-Z literally to text files to indicate the end of file. Then there was a transitional period where the filesystem itself didn't care about the Control-Z

but many text-oriented programs started looking for Control-Z and assuming it was EOF.

This made it into the CRT library for Windows for POSIX compatibility and exists today ...

 

And so yes, if you try to read a UTF-8 encoded "Text File" using Windows "Text Mode" in the POSIX-emulated system calls

you will not be able to read a Control-Z or any characters after it.
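A small sketch of why this has nothing to do with UTF-8 itself: U+001A encodes as the single byte 0x1A, and reading the file in binary mode and decoding afterwards gets every character back. Using binary mode here is my assumption about the obvious workaround, and the file name and helper name are made up for the example.

```python
import os
import tempfile

text = "before\u001aafter"       # U+001A is Control-Z
encoded = text.encode("utf-8")   # Control-Z becomes the single byte 0x1A


def read_utf8_binary(path: str) -> str:
    """Read raw bytes in binary mode, then decode them as UTF-8."""
    # Hypothetical helper: binary mode applies no CR/LF or Control-Z handling.
    with open(path, "rb") as f:
        return f.read().decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # made-up file name
    with open(path, "wb") as f:
        f.write(encoded)
    round_trip = read_utf8_binary(path)

print(round_trip == text)  # True
```

The bytes on disk are perfectly good UTF-8 either way; it is only the text-mode filter sitting between the file and the caller that decides to stop at 0x1A.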

 

-David

 

----------------------------------------

David A. Lee

dlee@calldei.com

http://www.xmlsh.org

 

 

 



