> >Yet Notepad could read these "Text Files" just fine.

Of course it can, because there is no such problem.

-----------

The problem is not with UTF8. The problem is with the Windows "text mode" option to the POSIX-emulating file open functions, such as fopen(), _open() etc. This mode tries to help people write "POSIX compatible" code by filtering out Windows-isms and turning them into Unix-isms. One nice thing it does is convert CR/LF to LF (oh so HARD ...). One *horrid* thing it does is assume you're running on something like a DOS 1.0 filesystem, where Control-Z was actually used as EOF. So when read() is used with this "text mode" on Windows and encounters a Control-Z, it reports end of file (EOF) right there.
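To make the effect concrete, here is a minimal, portable sketch of what that translation does to the raw bytes of a file. It is only an emulation under my own assumptions: text_mode_filter() is a hypothetical helper I made up for illustration, not a CRT function, and the real library surely differs in its details.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of the Windows CRT): emulates the two
 * text-mode translations discussed above on an in-memory buffer.
 * 'out' must have room for at least n bytes. Returns the number of
 * bytes the caller would get to see. */
size_t text_mode_filter(const unsigned char *raw, size_t n, unsigned char *out)
{
    size_t written = 0;

    assert(raw != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (raw[i] == 0x1A) {
            /* Control-Z: treated as end of file, the rest is never seen. */
            break;
        }
        if (raw[i] == '\r' && i + 1 < n && raw[i + 1] == '\n') {
            /* Drop the CR of a CR/LF pair so only the LF is delivered. */
            continue;
        }
        out[written++] = raw[i];
    }
    return written;
}
```

Feeding it the seven bytes 'a', CR, LF, 0xC3, 0xA9 (a UTF8 "é"), 0x1A, 'z' yields the four bytes 'a', LF, 0xC3, 0xA9. The UTF8 sequence passes through untouched; the only things lost are the Control-Z and everything after it. Opening the file in binary mode instead (for example fopen() with "rb") skips this translation.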
When the filesystem was upgraded some decades ago (I don't know exactly when), there was by then a convention of actually writing a literal Control-Z to text files to mark the end of the file. Then there was a transitional period where the filesystem itself didn't care about Control-Z, but many text-oriented programs started looking for Control-Z and assuming it was EOF. This made it into the Windows CRT library for POSIX compatibility and still exists today ...

And so yes, if you try to read a UTF8 encoded "Text File" using Windows "Text Mode" in the POSIX-emulating system calls, you will not be able to read a Control-Z or any characters after it.

-David

----------------------------------------
David A. Lee |