Re: [xml-dev] An efficient, safe, extensible XML data design ...mimicking in XML a binary data format
- From: Thomas Passin <list1@tompassin.net>
- To: xml-dev@lists.xml.org
- Date: Thu, 26 Mar 2015 18:47:57 -0400
On 3/26/2015 4:12 PM, Costello, Roger L. wrote:
> Consider this scenario: you have installed a device that monitors the
> data that flows through a router. With that device you record
> information about the flow, e.g.,
>
> <Flow-data>
>   <Number-of-bytes>500</Number-of-bytes>
>   <Source-IPv4-address>129.87.74.0</Source-IPv4-address>
>   <Destination-IPv4-address>129.87.75.0</Destination-IPv4-address>
> </Flow-data>

... [more complex examples snipped]
I think this is an awful misuse of XML. You're creating an extremely
verbose format that will be prone to parsing errors if anything goes
wrong. The codes and data values have to be kept in their correct
sequence, so if any code or data value accidentally gets omitted (it
could happen), every subsequent value will be misread.
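
To make the sequencing problem concrete, here is a minimal Python sketch. The snipped examples aren't reproduced in this message, so the layout below (a list of codes and a list of values matched purely by position) and the code numbers 1, 8 and 12 are hypothetical stand-ins, not the actual proposal.

```python
# Hypothetical positional layout: codes and values are matched by order only.
# The code numbers and field names are invented for illustration.
FIELD_NAMES = {1: "number_of_bytes", 8: "source_ipv4", 12: "destination_ipv4"}


def decode(codes, values):
    """Pair each code with the value found at the same position."""
    return {FIELD_NAMES[code]: value for code, value in zip(codes, values)}


codes = [1, 8, 12]

# A complete record decodes as intended.
good = decode(codes, ["500", "129.87.74.0", "129.87.75.0"])

# The same record with the first value accidentally dropped: every later
# value shifts one slot to the left and lands under the wrong field.
bad = decode(codes, ["129.87.74.0", "129.87.75.0"])

print(good)
print(bad)
```

In the second case nothing fails to parse. The source address is simply reported as the byte count, and the destination field goes missing without any error.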
The format is also far too verbose for the kinds of data values it conveys.
You say "It uses integer codes (very efficient) rather than string
element descriptors (very inefficient ..." but the length of the code
values is trivial compared with the number of characters in the element
start and end tags, so that's pretty well irrelevant. The rest of the
format is very inefficient, at least so far as the ratio of data to
boilerplate is concerned.
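
A rough way to put a number on the ratio of data to boilerplate, sketched in Python. It measures the simple <Flow-data> example quoted above, with whitespace removed, since the code-based examples are snipped here. Counting characters this way is just one possible yardstick.

```python
import xml.etree.ElementTree as ET

# The quoted example record, written without whitespace between elements.
record = (
    "<Flow-data>"
    "<Number-of-bytes>500</Number-of-bytes>"
    "<Source-IPv4-address>129.87.74.0</Source-IPv4-address>"
    "<Destination-IPv4-address>129.87.75.0</Destination-IPv4-address>"
    "</Flow-data>"
)

root = ET.fromstring(record)

# Payload = characters of actual data; everything else is tag markup.
payload = sum(len(child.text or "") for child in root)
total = len(record)

print(f"{payload} data characters out of {total} total ({payload / total:.0%})")
```

Shortening or replacing the element names changes the total somewhat, but the start and end tags around every value remain the dominant cost.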
"Flow data" sounds like it should be able to stream as long as one
wants, but this format won't be able to do that.
Seems to me that this data would do better using JSON. Or if there is a
*lot* of data, some binary encoding of ASN.1, perhaps. And it takes a
lot to get an old XML hand like me to say something like that!
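
For comparison, a sketch of the same flow record as JSON, using Python's standard json module. The key names "bytes", "src" and "dst" are my own shorthand, not anything from the original proposal.

```python
import json

# The quoted flow record as a plain mapping; the key names are hypothetical.
flow = {
    "bytes": 500,
    "src": "129.87.74.0",
    "dst": "129.87.75.0",
}

# Compact separators drop the optional spaces after ',' and ':'.
line = json.dumps(flow, separators=(",", ":"))
print(line)

# The record round-trips, and each value is found by name, not by position.
decoded = json.loads(line)
print(decoded["src"])
```

Because every value travels with its key, a missing field affects only that field and not the ones after it.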
Sticking with XML anyway: the way you are showing
<data><value>...</value><value>...</value></data>, you might as well omit
the <value> element and just use one <data> element per value. That would
simplify it a little. Of course, if you need to specify the units for the
values, you're back to needing nested <value> elements (or else you
could use attributes, as in <data unit='m'>).
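
A small sketch of the attribute variant, read with Python's ElementTree. The <record> wrapper element and the sample values are made up for the example. Only <data unit='m'> comes from the suggestion above.

```python
import xml.etree.ElementTree as ET

# One <data> element per value, with the unit carried as an optional attribute.
# The <record> wrapper and the sample numbers are hypothetical.
doc = (
    "<record>"
    "<data unit='m'>12.5</data>"
    "<data unit='s'>3</data>"
    "<data>500</data>"
    "</record>"
)

root = ET.fromstring(doc)

# Element.get() returns None when the attribute is absent.
readings = [(float(d.text), d.get("unit")) for d in root.iter("data")]
print(readings)
```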
It would be helpful if you explained what "flow data" means in this
context, what data sizes you expect, whether it needs to be parsed as a
long stream of indeterminate length, how many data elements there will
likely be ... In other words, some basic practical requirements. After
all, if this is a one-off for parsing 100 elements, this or most anything
will do. If a message will contain 100,000,000 data values month after
month, it's a whole different thing.
TomP