----- Original Message -----
From: "Mike Brown" <mike@skew.org>
To: <xml-dev@lists.xml.org>
Cc: <NBEYER@cerner.com>
Sent: Wednesday, March 20, 2002 7:58 PM
Subject: Re: [xml-dev] CDATA vs. Escaped characters
> 4. Simple byte counts might be an issue. Using a CDATA section can cut down
> the space required to store or transmit those portions of a document that
> would otherwise be riddled with numerous escaped characters. On the other
> hand, many small, unnecessary CDATA sections can add unnecessary bulk to the
> size of the document.
>
> - Mike
There is also a good ol' technology that would work as an alternative here
if a large percentage of the content would otherwise have to be escaped:
Base64 encoding. Yes, the content would inflate by about 33% (4 characters
to store every 3 bytes). No, your parser wouldn't decode it for you
automatically (unless you wrote the parser). However, the resulting byte
count can be considerably lower than the inflation from escaped characters,
since each escape such as &lt; or &amp; takes 4 or 5 bytes in place of 1.
Also, processing might be a bit easier here, since the encoded content
would normally arrive as one DOM text node or one run of SAX character
events, with no entity references or CDATA boundaries to break it up, and
the block can be quickly decoded. Of course, it becomes necessary to denote
in the XML whether the content is or is not encoded, or to have it
understood that the content is always to be encoded. As with any handling
of character data in XML, there are trade-offs...
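
To put rough numbers on that trade-off, here is a minimal Python sketch
(not part of any XML spec or existing tool). It compares the escaped size
of a string against its Base64 size, and shows one possible way to flag
encoded content. The element name "payload" and the attribute
encoding="base64" are made up purely for illustration; any convention the
sender and receiver agree on would do.

```python
import base64
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escaped_size(text: str) -> int:
    """Bytes needed when &, < and > are written as entity references."""
    return len(escape(text).encode("utf-8"))


def base64_size(text: str) -> int:
    """Bytes needed when the UTF-8 form of the text is Base64-encoded."""
    return len(base64.b64encode(text.encode("utf-8")))


def wrap(text: str, use_base64: bool) -> ET.Element:
    """Build a <payload> element, flagging Base64 content with an attribute.

    Both the element name and the attribute are hypothetical conventions.
    """
    elem = ET.Element("payload")
    if use_base64:
        elem.set("encoding", "base64")
        elem.text = base64.b64encode(text.encode("utf-8")).decode("ascii")
    else:
        # ElementTree escapes &, < and > itself when serializing.
        elem.text = text
    return elem


def unwrap(elem: ET.Element) -> str:
    """Recover the original text, decoding only if the flag is present."""
    if elem.get("encoding") == "base64":
        return base64.b64decode(elem.text).decode("utf-8")
    return elem.text or ""


if __name__ == "__main__":
    for sample in ("<<&&>>" * 100, "hello world"):
        print(len(sample), escaped_size(sample), base64_size(sample))
```

With content that is nothing but markup characters, Base64 comes out well
ahead; with ordinary text that needs no escaping, it only adds bulk. So
the break-even point depends entirely on how densely the content is
"riddled".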
My suggestion is to add <![BASE64[ ]]> to XML. :)
---
Seairth Jacobs
seairth@seairth.com