At 6:53 PM +1100 12/19/01, Alan Kent wrote:
>Is there a good reason why PCDATA should not be allowed to contain the
>bell (BEL) or escape (ESC) character? Why is it always evil to allow
>terminal escape sequences to be embedded in an XML document? Why is it
>important for XML to *not* permit it to be used for things that it was
>not originally intended for?
>
Among other reasons, because these characters can have unexpected
consequences when presented on dumb terminals and terminal emulators,
passed through gateways and routers, or handled by other systems. Few
of the systems that have these problems are common today, but they're
out there, and they are used. (I just used a terminal emulator to
check the version of emacs for a different thread.)
>Hmmm. I must admit I just confused myself. Looking at the Unicode 3.0
>book we have, it defines C0 controls (0x00-0x1f) and C1 controls
>(0x80-0x9f). The XML 1.0 spec says it allows "any Unicode character
>excluding the surrogate blocks, FFFE, and FFFF". It then lists #x9, #xA,
>#xD, #x20-#xD7FF etc. So why are *some* C0 control Unicode chars
>"not Unicode", but *all* C1 control Unicode chars "are Unicode"?
>It seems totally self-contradictory and bizarre to me.
It is, and this is an admitted mistake in XML 1.0. The C1 controls
should have been banned like the C0 controls. It was an oversight
that they weren't.
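
To make the asymmetry concrete, here is a rough Python sketch of the
Char ranges quoted above, as I read them from the XML 1.0 spec. The
function names (is_xml10_char, find_illegal) are my own for
illustration, not part of any parser's API.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp falls in the XML 1.0 Char ranges."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # stops before the surrogate blocks
        or 0xE000 <= cp <= 0xFFFD      # excludes FFFE and FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def find_illegal(text: str) -> list[tuple[int, int]]:
    """List (index, code point) for each character XML 1.0 would reject."""
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if not is_xml10_char(ord(ch))]


# C0 controls are 0x00-0x1F; C1 controls are 0x80-0x9F.
allowed_c0 = [cp for cp in range(0x00, 0x20) if is_xml10_char(cp)]
allowed_c1 = [cp for cp in range(0x80, 0xA0) if is_xml10_char(cp)]

print([hex(cp) for cp in allowed_c0])   # only tab, LF, and CR get through
print(len(allowed_c1))                  # every one of the 32 C1 controls
print(find_illegal("a\x07b\x1bc"))      # BEL and ESC are both rejected
```

Only three of the 32 C0 controls pass the check, while all 32 C1
controls do, which is exactly the inconsistency described above.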
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.ibiblio.org/xml/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/ |
+----------------------------------+---------------------------------+