On Sat, 2002-03-23 at 12:17, Tim Bray wrote:
> Amelia A Lewis wrote:
> > In short, the C0 characters have no universal interpretation;
> > interpretation depends upon the application. It seems reasonable, then,
> > that the application can encode the bloody things too. Can't use XML
> > mechanisms. Base64, the usual suggestion, incurs an immense overhead.
> I agree with the leading sentences. As for the last, Base64 encodes
> 3 bytes as 4, thus incurring exactly 33% overhead. Whether that
> is considered "immense" depends on your application scenario. -Tim
A little more than that, actually, in a correct base64 implementation.
Each 57 bytes become 76 bytes. Add two more for CRLF. Plus the final
padding, which is generally but not always negligible. Lessee ... I'd
do the math, but I'm not working today, so it's lazy time: original +
1/3 + 2/57. For decoding, 1 + 1 + 1/3 + 2/57, most likely, as you
prolly can't discard as you decode.
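
For anyone who does want the math done: counting the CRLF as two bytes, 57 input bytes become 78 output bytes, an overhead of 21/57, or roughly 36.8%. Below is a rough Python sketch of that arithmetic, assuming MIME-style line wrapping (76 characters per line, each line including the last ended by CRLF). The function names mime_base64_size and measured_size are made up for illustration; only base64.encodebytes is a real library call.

```python
import base64


def mime_base64_size(n: int) -> int:
    """Predicted output size for n input bytes, assuming 76-char lines
    with a CRLF after every line (including the last one)."""
    encoded = 4 * ((n + 2) // 3)   # every 3 bytes become 4 chars; last group is padded
    lines = (encoded + 75) // 76   # number of 76-char lines, rounding up
    return encoded + 2 * lines     # two extra bytes (CR, LF) per line


def measured_size(n: int) -> int:
    """Same figure, measured by actually encoding n zero bytes."""
    # base64.encodebytes wraps at 76 characters but ends lines with LF,
    # so swap in CRLF to match the MIME convention assumed above.
    return len(base64.encodebytes(bytes(n)).replace(b"\n", b"\r\n"))


if __name__ == "__main__":
    for n in (57, 100, 5700):
        predicted = mime_base64_size(n)
        print(n, predicted, measured_size(n), predicted / n)
```

For inputs that are a multiple of 57 bytes the ratio is exactly 78/57; for other sizes the final padding and the short last line push it slightly higher.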
If "immense" is overwrought, could we agree on "significant"? Tricks
like quoted-printable and encoded-word (and XML unicode numeric
entities) are attractive largely because the characters they encode are
*rarely* encountered, meaning that the cost is significantly less than
(who's spent the last two weeks writing MIME-related code, and is
probably being hideously pedantic)
Amelia A. Lewis email@example.com firstname.lastname@example.org
Yankees are compelled by some mysterious force to imitate Southern
accents and they're so damn dumb they don't know the difference between
a Tennessee drawl and a Charleston clip.
-- Rita Mae Brown, "Rubyfruit Jungle"