At 3:03 PM +1000 7/26/02, Rick Jelliffe wrote:
>In http://www.w3.org/TR/newline the three use cases are:
>
Looking at this document, I note that its author(s) had some serious
misconceptions about XML. For example, they state:
Well-formed but invalid - because the [NEL] character appears in
element content:
<a>[NEL]<b/>[NEL]</a>
where the corresponding DTD contains
<!ELEMENT b EMPTY> <!ELEMENT a (b)>
In fact, however, the second example is invalid with or without
allowing NEL. Valid elements declared empty may not contain white
space.
A similar misconception is seen later when the authors state:
\n printf output: OS/390 C or Java program [NEL]
This may be true in C. It is not true in Java. In Java \n always
results in a linefeed. If it's producing a NEL on OS/390, then the
OS/390 JVM is not conformant to the Java spec either.
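The C half of that claim is easy to check on any given platform. Here is a minimal sketch, assuming nothing beyond a hosted C compiler; the function names are made up for illustration. It reports the numeric value the compiler assigns to '\n' in its execution character set. On an ASCII-based system that value is 0x0A. An EBCDIC compiler is free to choose a different one, which is how the same C source can end up writing NEL.

```c
#include <stdio.h>

/* Hypothetical helper: the code the compiler assigns to '\n'
   in the execution character set. C leaves this value to the
   implementation. */
int newline_code(void)
{
    return (unsigned char)'\n';
}

/* Prints the value in hex so it can be compared across platforms. */
void report_newline(void)
{
    printf("'\\n' has code 0x%02X on this platform\n",
           (unsigned)newline_code());
}
```

Java gives no such latitude: the language spec defines \n as the linefeed character, U+000A, on every platform.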
>> Using native system string functions, such as atoi and atof, to
>>convert XML strings, documents, or fragments, to other data types
This really goes to the heart of the problem: atoi and atof are ASCII
functions that are simply not suitable for Unicode-based XML
regardless of what we do with NEL. The atof() signature is:
double atof(const char *nptr);
It's been a while since I've written C, but my recollection is that
the char type is always one byte wide. Processing XML in C requires
using different kinds of wide chars and wide string types. You can't
use native system string functions to work with XML data because XML
data is Unicode, not ASCII. For instance, in the Apache Xerces-C DOM
"String is represented by 'XMLCh*' which is a pointer to unsigned 16
bit type holding utf-16 values, null terminated." Other schemes are
possible. However, you simply cannot use C's traditional 1-byte
strings and characters and their associated functions. This is not an
OS/390 issue. It is a C issue. The same is true on Windows, Mac OS,
Unix, and every other platform that uses C.
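To make the mismatch concrete, here is a rough sketch of the extra work needed before atoi() can be pointed at XML character data. It assumes the data is held as null-terminated UTF-16 code units (as in the Xerces-C description quoted above) and that the C execution character set is ASCII-based. The names utf16_to_ascii and utf16_atoi are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: copy a null-terminated UTF-16 string into a
   one-byte-per-character buffer. Returns 0 on success, -1 if any code
   unit is outside ASCII or the buffer is too small.
   Assumption: the execution character set is ASCII-based, so the cast
   below maps each code unit to the matching char. On an EBCDIC system
   a real translation table would be needed instead. */
int utf16_to_ascii(const uint16_t *src, char *dst, size_t dst_size)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; src[i] != 0; i++) {
        if (src[i] > 0x7F || i + 1 >= dst_size)
            return -1;
        dst[i] = (char)src[i];
    }
    if (i >= dst_size)
        return -1;
    dst[i] = '\0';
    return 0;
}

/* Hypothetical wrapper: parse a decimal integer held as UTF-16 code
   units. atoi() itself only ever sees the narrowed copy. */
int utf16_atoi(const uint16_t *src, int *out)
{
    char narrow[32];

    if (utf16_to_ascii(src, narrow, sizeof narrow) != 0)
        return -1;
    *out = atoi(narrow);
    return 0;
}
```

Handing the UTF-16 buffer to atoi() directly would not work at all: the argument has the wrong type, and read as bytes the string appears to end at the first zero byte. None of this involves NEL.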
All of the other functions we're talking about are similar. Even with
NEL, you still shouldn't be using these to process XML. OS/390 needs
to get some modern libraries. XML does not need to change. If
mainframe programmers think that NEL is the only problem they have,
they are sorely mistaken. IBM is asking us to break XML for many
thousands of users for something that won't even fix their own
problems. Short of moving XML to ASCII (a solution we all rightly
abhor), the only way to solve the OS/390 problem is to fix OS/390.
XML *cannot* be fixed enough to make XML usable on OS/390 in the way
IBM wants.
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML in a Nutshell, 2nd Edition (O'Reilly, 2002) |
| http://www.cafeconleche.org/books/xian2/ |
| http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
+----------------------------------+---------------------------------+