XML 2.0/Green or XMLE
- From: Rob Cameron <robc@international-characters.com>
- To: xml-dev <xml-dev@lists.xml.org>
- Date: Mon, 13 Dec 2010 07:20:40 -0800
Following on Pete Cordell's XML Lite idea, I suggest a slight
variation with a focus on what might be the "killer app" to drive
adoption: energy-efficiency for mobile and embedded applications.
(I see only now that James Clark's MicroXML is along similar lines.)
From the energy-efficiency perspective, a primary goal is to
reduce the number of processor cycles that need to be spent
on parsing, including taking advantage of processor technology
trends (SIMD and multicore parallelism, in particular).
Given this perspective, it may be worth defining a variant
of XML (say XMLE, where E stands for Embedded, or Energy-efficient,
or Environmental) rather than a successor to XML 1.0. Whereas
an XML 2.0 may need to retain some features of XML 1.0 (e.g.,
internal DTDs) for some important subclasses of nonembedded
applications, XMLE might involve more radical simplification.
Building on many of the ideas presented on the list and in
Pete Cordell's XML Lite document, here is a list of possible
simplifications, each of which will reduce cycles spent on
parsing, as well as reducing complexity for building XMLE
tools.
Class A: Restrictions that preserve XML 1.0 well-formedness
(every XMLE document is also an XML 1.0 document).
1. Eliminate DTDs - both internal and external
(also eliminate standalone declarations)
2. Eliminate predefined entities
(use character references instead).
3. Eliminate decimal character references in favor of
hexadecimal. (Multiple representations add parsing cost
for no representational benefit; hexadecimal references
are best for simple and efficient conversion, including
parallel conversion.)
4. Eliminate line break normalization costs, e.g. by making
CR illegal in XMLE (there are some other possibilities).
5. Require UTF-8; eliminate encoding declarations.
6. Require the use of double quotes and eliminate single
quotes for attribute values and the like. There is a
parsing cost and no real representational gain in allowing
both forms.
7. Eliminate CDATA sections.
8. Eliminate comments in favor of processing instructions,
possibly with a predefined target. (Or eliminate PIs
in favor of comments - either change has significant performance
benefits. But processing instructions are used for
PHP and other scripting technologies.)
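
To make the Class A list concrete, here is a rough sketch of a checker that flags constructs the list would eliminate. The function name `class_a_violations` and the patterns are assumptions made for illustration, not part of any proposal text. It is a regex scan, not a parser, so it can give false positives (for example, text content that happens to contain ='), and it only covers the variant of item 8 that drops comments and keeps processing instructions.

```python
import re

# Hypothetical, approximate checks for the Class A restrictions above.
# Each entry pairs a label with a pattern for a construct XMLE would drop.
CHECKS = [
    ("DTD", re.compile(r"<!DOCTYPE")),
    ("named entity reference", re.compile(r"&[A-Za-z_:][\w.:-]*;")),
    ("decimal character reference", re.compile(r"&#[0-9]+;")),
    ("carriage return", re.compile(r"\r")),
    ("encoding declaration", re.compile(r"<\?xml\s[^?]*\bencoding\s*=")),
    ("single-quoted attribute value", re.compile(r"=\s*'")),
    ("CDATA section", re.compile(r"<!\[CDATA\[")),
    ("comment", re.compile(r"<!--")),
]


def class_a_violations(text: str) -> list[str]:
    """Return the labels of all Class A restrictions that `text` appears to break."""
    return [label for label, pattern in CHECKS if pattern.search(text)]


if __name__ == "__main__":
    print(class_a_violations('<a b="1">x &#x26; y</a>'))
    print(class_a_violations("<a b='1'>x &amp; y &#38;<!-- c --></a>"))
```

Hexadecimal references such as &#x26; pass the scan, while &amp; and &#38; are both flagged.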
Class B: Changes that will require a tool for conversion
from XMLE to XML 1.0. Making any one of these changes
thus has a significant downside, because XMLE documents
will not be accepted by existing stacks. However, the
performance benefits may be worthwhile.
1. Eliminate element names in end tags; use the form "</>"
and possibly "</n>", where n is the nesting level.
(This change will considerably compress XML documents,
and may be enough to justify this whole class of changes.)
2. Allow ]]> in text.
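
The conversion tool that Class B item 1 calls for could be quite small. The following is a minimal sketch, assuming the input already obeys the Class A restrictions (no comments, no CDATA sections, double-quoted attribute values only). The function name `expand_end_tags` is hypothetical, and so is the reading of "nesting level" as the depth of the element being closed, with the root at level 1. It does not handle item 2 (escaping ]]> in text).

```python
import re

# Markup recognised by this sketch: processing instructions, anonymous or
# numbered end tags ("</>" or "</n>"), and start/empty-element tags whose
# attribute values are double-quoted (so ">" or "/" may appear inside them).
MARKUP = re.compile(
    r'(?P<pi><\?.*?\?>)'
    r'|</(?P<end>[0-9]*)>'
    r'|<(?P<name>[^\s/>?!"]+)(?:"[^"]*"|[^">/])*(?P<empty>/?)>',
    re.S,
)


def expand_end_tags(xmle: str) -> str:
    """Rewrite "</>" and "</n>" end tags as named XML 1.0 end tags."""
    stack: list[str] = []  # names of the currently open elements

    def repl(m: re.Match) -> str:
        if m.group("pi") is not None:
            return m.group(0)  # processing instructions pass through
        if m.group("end") is not None:
            if not stack:
                raise ValueError("end tag without an open element")
            level = m.group("end")
            # Assumption: n is the depth of the element being closed (root = 1).
            if level and int(level) != len(stack):
                raise ValueError(f"end tag level {level} at depth {len(stack)}")
            return f"</{stack.pop()}>"
        if not m.group("empty"):
            stack.append(m.group("name"))  # start tag: remember its name
        return m.group(0)

    result = MARKUP.sub(repl, xmle)
    if stack:
        raise ValueError(f"unclosed element: {stack[-1]}")
    return result


if __name__ == "__main__":
    print(expand_end_tags('<a><b x="1/>">t</><c/></>'))
```

A single stack of open element names is all the state the converter needs, which is the same state an XMLE parser would keep in order to report element names at end tags.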
Robert D. Cameron, Ph.D.
Chief Technical Officer, International Characters, Inc.
Professor of Computing Science, Simon Fraser University