OASIS Mailing List Archives


XML 2.0/Green or XMLE


Following on from Pete Cordell's XML Lite idea, I suggest a slight
variation with a focus on what might be the "killer app" to drive
adoption: energy efficiency for mobile and embedded applications.
(I have only just seen James Clark's MicroXML, which runs along
similar lines.)

From the energy-efficiency perspective, a primary goal is to
reduce the number of processor cycles that need to be spent
on parsing, including taking advantage of processor technology
trends (SIMD and multicore parallelism, in particular).

Given this perspective, it may be worth defining a variant
of XML (say XMLE, where E stands for Embedded, or Energy-efficient,
or Environmental) rather than a successor to XML 1.0.   Whereas
an XML 2.0 may need to retain some features of XML 1.0 (e.g.,
internal DTDs) for some important subclasses of nonembedded
applications, XMLE might involve more radical simplification.

Building on many of the ideas presented on the list and in
Pete Cordell's XML Lite document, here is a list of possible
simplifications, each of which would reduce the cycles spent on
parsing as well as the complexity of implementing XMLE.
Class A:  Restrictions that preserve XML 1.0 well-formedness
(every XMLE document is also an XML 1.0 document).

1.  Eliminate DTDs, both internal and external
    (also eliminate standalone declarations).
2.  Eliminate predefined entities
    (use character references instead).
3.  Eliminate decimal character references in favor of
    hexadecimal.  (Multiple representations add parsing cost
    for no representational benefit; hexadecimal references
    are best for simple and efficient conversion, including
    parallel conversion.)
4.  Eliminate line break normalization costs, e.g. by making
    CR illegal in XMLE (there are some other possibilities).
5.  Require UTF-8; eliminate encoding declarations.
6.  Require the use of double quotes and eliminate single
    quotes for attribute values and the like.   There is a
    parsing cost and no real representational gain in allowing
    both forms.
7.  Eliminate CDATA sections.
8.  Eliminate comments in favor of processing instructions,
    possibly with a predefined target.   (Or eliminate PIs
    in favor of comments; either change has significant
    performance benefits.   But processing instructions are
    used for PHP and other scripting technologies, so they
    are probably the better form to keep.)
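
As a rough illustration of the Class A list, here is a minimal Python sketch that flags constructs the list would rule out. The function name `class_a_violations` and the `CHECKS` table are hypothetical, invented for this example. The patterns are heuristic regular expressions rather than a real parser, and item 8 is assumed to be resolved in favor of keeping processing instructions and dropping comments.

```python
import re
from typing import List

# Heuristic patterns, one per Class A restriction. These are illustrative
# regular expressions, not a full XML grammar; for instance, the
# single-quote check can misfire on a double-quoted value containing ='.
CHECKS = [
    ("DTD", re.compile(r"<!DOCTYPE")),
    ("standalone declaration", re.compile(r"<\?xml\s[^?]*\bstandalone\s*=")),
    ("encoding declaration", re.compile(r"<\?xml\s[^?]*\bencoding\s*=")),
    ("named entity reference", re.compile(r"&[A-Za-z_:][^;\s&<]*;")),
    ("decimal character reference", re.compile(r"&#[0-9]+;")),
    ("carriage return", re.compile(r"\r")),
    ("single-quoted attribute value", re.compile(r"<[^<>!?]*=\s*'")),
    ("CDATA section", re.compile(r"<!\[CDATA\[")),
    # Assumption: comments are dropped and processing instructions are kept.
    ("comment", re.compile(r"<!--")),
]


def class_a_violations(data: bytes) -> List[str]:
    """Return the names of the Class A restrictions that `data` appears to break."""
    try:
        # Item 5: the document must be UTF-8.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return ["not valid UTF-8"]
    return [name for name, pattern in CHECKS if pattern.search(text)]
```

An empty result means none of the heuristics fired; it does not show that the input is well-formed XML 1.0, which would still need an ordinary parser to confirm.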

Class B: Changes that will require a tool for conversion
    from XMLE to XML 1.0.   Making any one of these changes
    thus has a significant downside, because XMLE documents
    will not be accepted by existing stacks.   However, the
    performance benefits may be worthwhile.

1.  Eliminate element names in end tags; use the form "</>"
    and possibly "</n>", where n is the nesting level.
    (This change will considerably compress XML documents,
     and may be enough to justify this whole class of changes.)
2.  Allow ]]> in text.
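
The conversion tool that Class B calls for could be quite small. The following Python sketch is one possible shape for it, under stated assumptions: the input already obeys the Class A restrictions (no comments, CDATA sections, DTDs, or single-quoted values), only the "</>" form is handled (the "</n>" form is left out because its exact meaning is not fixed above), and "]]>" in text is rewritten with a hexadecimal character reference. The names `xmle_to_xml`, `TOKEN`, and `START_NAME` are hypothetical.

```python
import re

# One token is a processing instruction, a tag (double-quoted attribute
# values may contain '>'), or a run of text up to the next '<'.
TOKEN = re.compile(r'<\?.*?\?>|<(?:[^<>"]|"[^"]*")*>|[^<]+', re.S)
START_NAME = re.compile(r'<([^\s/>]+)')


def xmle_to_xml(xmle: str) -> str:
    """Expand anonymous end tags and escape ']]>' so XML 1.0 tools accept the result."""
    out = []
    open_names = []  # stack of element names that are currently open
    pos = 0
    while pos < len(xmle):
        m = TOKEN.match(xmle, pos)
        if m is None:
            raise ValueError(f"malformed markup at offset {pos}")
        tok = m.group()
        pos = m.end()
        if tok.startswith("<?"):
            out.append(tok)  # processing instructions pass through unchanged
        elif tok.startswith("</"):
            if not open_names:
                raise ValueError("end tag with no open element")
            name = open_names.pop()
            # Accept "</>" and, for convenience, an already-named end tag.
            if tok[2:-1].strip() not in ("", name):
                raise ValueError(f"unexpected end tag {tok}")
            out.append(f"</{name}>")
        elif tok.startswith("<"):
            if not tok.endswith("/>"):  # empty-element tags open nothing
                started = START_NAME.match(tok)
                if started is None:
                    raise ValueError(f"start tag without a name: {tok}")
                open_names.append(started.group(1))
            out.append(tok)
        else:
            # Class B item 2: XML 1.0 forbids a literal "]]>" in text.
            out.append(tok.replace("]]>", "]]&#x3E;"))
    if open_names:
        raise ValueError(f"unclosed element: {open_names[-1]}")
    return "".join(out)
```

For example, `xmle_to_xml('<a><b x="1>0">t ]]> u</><c/></>')` returns `<a><b x="1>0">t ]]&#x3E; u</b><c/></a>`. The stack of open names is the only state the converter needs, which is also why dropping end-tag names loses no information.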

Robert D. Cameron, Ph.D.
Chief Technical Officer, International Characters, Inc.
Professor of Computing Science, Simon Fraser University

Copyright 1993-2007 XML.org. This site is hosted by OASIS