RE: Images embedded in XML
- From: Danny Ayers <firstname.lastname@example.org>
- To: "Al B. Snell" <email@example.com>
- Date: Sun, 08 Apr 2001 12:22:47 +0600
I'll happily agree with you that XML does seem to have been a real success
in documentation. I can't deny either that you could put together a
self-describing binary format as you suggest. I would suggest that for such
a format to be really useful, a tree-based model would be preferable to a
flat/relational model. You will need to convert between the serialized form
and other forms (e.g. an in-memory tree), which, if the format is not to be
too rigid, would in effect be a kind of parsing - though admittedly you could
do it n times faster than e.g. SAX. Ok, so if you put all this together,
what would you be gaining? Say an order of magnitude or two in speed (and
the same kind of gains for data storage)? What would you be losing?
Human-readability - I for one wouldn't lose any sleep over that.
Compatibility with visual representation systems (XML/XSL/XSLT/XHTML etc.) -
this is hugely useful for a not inconsiderable range of applications, but
could be replaced by a standard set of conversion tools XML <-> XDF.
A huge range of interfaces & systems... but we could live with that.
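As a rough illustration of the tree-based, self-describing binary format
discussed above, here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the type codes, the header layout (type,
tag length, payload length) and the names encode_node and decode_node are
hypothetical, and none of it comes from an actual "XDF" specification. It
only shows that a tagged, length-prefixed tree in network byte order can
carry raw bytes (such as an image) without escaping, and that reading it
back is still a kind of parsing.

```python
import struct

# Hypothetical type codes for this sketch; not part of any real "XDF" spec.
T_INT, T_TEXT, T_BYTES, T_TREE = 1, 2, 3, 4

# Header: type code (1 byte), tag length (2 bytes), payload length (4 bytes),
# all in network byte order ("!").
HEADER = "!BHI"


def encode_node(tag, value):
    """Serialise one (tag, value) node; child lists are encoded recursively."""
    if isinstance(value, bool):
        raise TypeError("booleans are not supported in this sketch")
    if isinstance(value, int):
        kind, payload = T_INT, struct.pack("!q", value)  # 64-bit signed
    elif isinstance(value, str):
        kind, payload = T_TEXT, value.encode("utf-8")
    elif isinstance(value, (bytes, bytearray)):
        kind, payload = T_BYTES, bytes(value)  # raw bytes, no escaping needed
    elif isinstance(value, list):
        kind, payload = T_TREE, b"".join(encode_node(t, v) for t, v in value)
    else:
        raise TypeError(f"unsupported value type: {type(value).__name__}")
    name = tag.encode("utf-8")
    return struct.pack(HEADER, kind, len(name), len(payload)) + name + payload


def decode_node(buf, offset=0):
    """Decode one node at offset; return ((tag, value), next_offset)."""
    kind, name_len, payload_len = struct.unpack_from(HEADER, buf, offset)
    offset += struct.calcsize(HEADER)
    tag = buf[offset:offset + name_len].decode("utf-8")
    offset += name_len
    end = offset + payload_len
    if end > len(buf):
        raise ValueError("truncated node")
    payload = buf[offset:end]
    if kind == T_INT:
        value = struct.unpack("!q", payload)[0]
    elif kind == T_TEXT:
        value = payload.decode("utf-8")
    elif kind == T_BYTES:
        value = bytes(payload)
    elif kind == T_TREE:
        value, pos = [], 0
        while pos < len(payload):  # children are laid end to end
            child, pos = decode_node(payload, pos)
            value.append(child)
    else:
        raise ValueError(f"unknown type code {kind}")
    return (tag, value), end


# Round trip: a small tree holding an integer, text, and raw image bytes.
doc = ("message", [("id", 42), ("caption", "a cat"),
                   ("image", b"\xff\xd8\xff\xe0")])
wire = encode_node(*doc)
decoded, consumed = decode_node(wire)
```

Because every node carries its own type and length, a reader can skip
subtrees it does not understand, which is roughly the flexibility that
makes decoding "a kind of parsing" rather than a fixed record layout.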
So why not? One big reason - there isn't a commonly accepted standard. Ok,
XML has major faults, SOAP is downright ugly etc. etc. but at least XML is
spoken everywhere. A standard that can be built on top of and worked around.
We can solve the real-world problems, ok in a sub-optimal way, but surely
that's all we really need. Do we want systems that will be 1000x more
efficient tomorrow, or ones that may be slow and clunky but actually work
with each other *today*?
Maybe a binary format will come along and be accepted worldwide - but given
the current climate I think it's highly unlikely in the near future. I think
we'll be looking at XML & kludges for some time to come.
<- -----Original Message-----
<- From: Al B. Snell [mailto:firstname.lastname@example.org]
<- Sent: 08 April 2001 07:55
<- To: Danny Ayers
<- Cc: The Deviants
<- Subject: RE: Images embedded in XML
<- On Fri, 6 Apr 2001, Danny Ayers wrote:
<- > <- Hmmm, it'd be nice if XML was more compact, faster to parse, and
<- > <- let you embed other data streams more easily - do we *have* to make
<- > <- it human-readable UTF-8?
<- > It wouldn't be extensible or markup - I suppose you could just call
<- > it 'L'
<- It would be extensible, since any XML document could be encoded in it and
<- vice versa; and it'd be a data representation language as opposed to a
<- markup language being used for data representation... I'd prefer it to be
<- simple enough to work with to make it a "format" rather than an entire
<- "language", so let's call it XDF :-)
<- > generation and reading another. All are done with binary formats in
<- > lots of systems, but to be able to use these completely across the
<- > board you need to go to a pretty low common denominator (e.g. plain
<- > text).
<- This is a bit of a myth... plain text is less of a common denominator
<- than two's complement or unsigned binary integers, since plain text is
<- described in *terms* of these, and XML is far from a simple "lowest common
<- denominator" data format; it would take me a few minutes to write a set
<- of routines in C to serialise and unserialise data in network byte
<- order, and perhaps an hour at most to implement this for a
<- self-describing format. Compare that to how long it takes to implement an
<- XML parser, and its size and run-time space/time requirements, and the
<- size of the XML documents compared to the "XDF" records...
<- Since parsing XML is complex, XML's adoption will be limited by the
<- development rate of XML parsers... I have heard a few people say "Ah, I
<- can write an XML parser in 10 lines of Perl", but those parsers don't
<- process entity references or handle namespaces :-)
<- > Either you do
<- > without compatibility between systems or you sacrifice a bit of speed.
<- Only a tiny smidgen... many CPUs can do network / host byte order
<- translation in a single instruction; compare that to the time taken for
<- an XML parser to locate a text node by stepping through to find
<- delimiters, then removing the whitespace and converting from UTF-8
<- decimal to an integer in host byte order...
<- > There
<- > are compromises though - you could have a reference in your markup to a
<- > binary file (e.g. a .jpg) and the processor could receive this
<- > from the markup, as in HTML browsers.
<- Yes, but this is a kludge; XML isn't good enough to realistically embed
<- binary objects inside it, so they have to go over a separate connection
<- with some complex referencing mechanism.
<- XML is posing as something suitable for forming the core of many systems
<- of communicating software modules, yet it is incredibly unwieldy compared
<- to much simpler to use and implement techniques of precisely the same
<- expressive power; I don't want to start a flamewar, but is this *really*
<- a wise application of XML? Shouldn't it stick with replacing
<- with XML/XSLT/XHTML and remain in the realm of documentation systems,
<- to which it is much more applicable, than all this XML-RPC and SOAP
<- > Cheers,
<- > Danny.
<- Alaric B. Snell
<- http://www.alaric-snell.com/ http://RFC.net/
Any sufficiently advanced technology can be emulated in software