At 2:39 PM +0000 2/27/03, Alaric B. Snell wrote:
>No, no, relax; in the scheme I suggested, it uses the type information as
>*advisory* data on how to encode stuff. It could ignore it all and just put
>everything in as strings. This is what people refer to as a "Schema-aware
>compressor".
I'm not sure what you're suggesting is what Robin's suggesting. Right
now I'm very confused (not that unusual or even unproductive a state
to be in, though :-)
Type-awareness strikes me as fundamentally dangerous, especially in
this sort of scheme. Yes, it could encode everything as strings, but
given that it doesn't, how do I decode? Do I have to have the schema
handy? Or is the type information bundled into the binary file?
Also, how are elements like these encoded?
<quantity type="xsd:int">000017</quantity>
<quantity type="xsd:int">17.00000</quantity>
<quantity type="xsd:int">17.5</quantity>
<quantity type="xsd:int">two</quantity>
<quantity type="xsd:int">2 and not a fnord</quantity>
<quantity type="xsd:int">Cheesy Poofs</quantity>
I expect any plausible binary compression scheme to be lossless with
respect to the infoset: not the PSVI, mind you, just the plain infoset.
I don't expect to lose any significant data just because:
1. The data is invalid
2. I happen to use a different schema for decoding than you used for encoding
If the binary compression fails these tests, I cry shenanigans on you. :-)
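To make those two tests concrete, here is a minimal sketch of one way a type-aware encoder could stay lossless. It is not the scheme Alaric or Robin proposed; the tag bytes, the function names and the 32-bit payload are all assumptions made up for illustration. The idea is that the xsd:int hint is only advisory: the compact form is used only when printing the number back reproduces the original text exactly, and each value carries its own tag so the decoder never needs a schema.

```python
import struct

# Hypothetical tag bytes, invented for this sketch.
TAG_INT = 0x01     # payload: 4-byte big-endian signed integer
TAG_STRING = 0x02  # payload: UTF-8 bytes of the original text

SAMPLES = ["000017", "17.00000", "17.5", "two",
           "2 and not a fnord", "Cheesy Poofs", "17"]


def encode_value(text: str) -> bytes:
    """Encode element content declared as xsd:int without losing characters."""
    try:
        number = int(text)
    except ValueError:
        number = None
    # Use the compact form only if it round-trips to the exact same text.
    if (number is not None
            and str(number) == text
            and -2**31 <= number < 2**31):
        return bytes([TAG_INT]) + struct.pack(">i", number)
    # Invalid or non-canonical content falls back to the literal string.
    return bytes([TAG_STRING]) + text.encode("utf-8")


def decode_value(data: bytes) -> str:
    """Decode using only the tag in the data; no schema is consulted."""
    tag, payload = data[0], data[1:]
    if tag == TAG_INT:
        return str(struct.unpack(">i", payload)[0])
    if tag == TAG_STRING:
        return payload.decode("utf-8")
    raise ValueError(f"unknown tag byte: {tag:#x}")


if __name__ == "__main__":
    for sample in SAMPLES:
        encoded = encode_value(sample)
        kind = "int" if encoded[0] == TAG_INT else "string"
        print(f"{sample!r}: {kind}, round-trips: {decode_value(encoded) == sample}")
```

Under these assumptions only the plain "17" gets the compact integer form. "000017" is stored as a string because its leading zeros would otherwise be lost, and the invalid values are stored as strings as well. A scheme that instead needs the schema at decode time, or that normalizes "000017" to "17", fails the tests above.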
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| Processing XML with Java (Addison-Wesley, 2002) |
| http://www.cafeconleche.org/books/xmljava |
| http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/ |
+----------------------------------+---------------------------------+