choosing sides

From: Michael Sokolov <sokolov@ifactory.com>
To: xml-dev@lists.xml.org
Date: Sun, 12 Dec 2010 07:09:58 -0500
Hearkening back to Elliotte's proposal about forming a group to discuss 
details off-list - did that already happen?  If so, and if its focus 
aligns more or less with the ideas generated over the last week or two, 
I'd be interested in participating.  If there's no cabal yet, I think 
it's time to form up (at least one) side, as Amy said.

Here's my summary of outstanding ideas, possibly filtered by my 
selective memory, with my own perspective.
I think this list may have some similarity to Pete's blend: 
http://codalogic.com/xmllite/xmllite.html, but perhaps a bit more 
concern for XML 1.0 compatibility?  I guess this would be "Mike's mix" - 
not my ideas, mostly, but my wish list.

For the moment I'll call the new XML "SXML" ("simpler" XML? "super" XML?):

1) Define a stance on compatibility

    XML 1.0 guarantee - every well-formed XML 1.0 document encoded in 
UTF-8/16 is a well-formed SXML document.  I think that's do-able, even 
with the proposed changes?

    However the converse wouldn't be true.  SXML is looser; it includes 
more documents.

    Can we support a statement like: every SXML document can be 
represented by an "equivalent" XML 1.0 document, because the data model 
is essentially the same?  This wouldn't be a perfect round-trip 
guarantee: you might lose prefix mappings, duck-typing and other new 
features.  It would be just some kind of translatability guarantee - 
details to be worked out to see if there is a meaningful guarantee that 
can be had :)

   I suppose another stance that could work is: parsers can support all 
of XML 1.0, XMLNS 1.1, AND SXML - we'd have to design SXML so there 
aren't any outright conflicts.  But it could be a reasonable thing to 
create an SXML parser that lacks support for some XML 1.0 features.  
Maybe there's a "profile" defined in the document itself, as has been 
suggested.

2) New Features

    - new (like Kay-style hierarchic) namespaces - I'm sure there will 
be all kinds of interesting discussion about how this could work out :)

    - looser handling of prolog (allow whitespace)

    - Ignore DOCTYPE (the internal DTD subset is parsed and preserved (?) 
for re-serialization purposes only) - does SAX have an event for this?  
ignorableWhitespace maybe?  Not sure how this would play out elsewhere.

    - treat the XML declaration as a PI, but also: warn about an 
incompatible character set?; assume UTF-8 and raise an error when an 
invalid UTF-8 sequence is encountered

    - duck-typing (provide additional event w/typed values) - this seems 
do-able to me for int, date, dateTime and float.  Possibly even ID for 
anything else that's unquoted?

    - built-in entity set (ISO right?)

    - allow nested comments; no requirement for well-formedness inside 
(use existing syntax) - <xml:comment> is another option.

    - I think CDATA needs to stay for compatibility, but maybe there's a 
SXML-only mode that ignores this?

    - multiple root elements or documents in a single file

    - UTF-8 and UTF-16 autodetected based on BOM (no BOM -> UTF-8)

    - looser handling of ampersand - does it really need to be an error 
to have <a>&foo & &bar;</a>

    - also: undefined entities could be allowed and left unprocessed

    - all whitespace preserved by default (even CRLF, but parsers can be 
configured to normalize it)

    - end-tag minimization, using </>? possibly only on leaves? <//> for 
close-all-elements?  I don't actually like this last one much, but 
someone did mention Lisp's close-all bracket: ], and this syntax just 
sprang into my head...
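
Two of the items above (BOM-based autodetection and duck-typing) are 
mechanical enough to sketch.  What follows is only a rough Python 
illustration under stated assumptions: the names detect_encoding and 
duck_type are hypothetical, and the lexical rules for what counts as an 
int, float, date or dateTime are guesses, not anything agreed on the 
list.

```python
import codecs
import re
from datetime import date, datetime

# Assumed lexical forms; a real SXML spec would have to pin these down.
_INT = re.compile(r"[+-]?\d+")
_FLOAT = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def detect_encoding(raw: bytes) -> str:
    """Choose a Python codec name from the BOM; no BOM means UTF-8."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec reads the BOM to pick byte order
    return "utf-8"


def duck_type(text: str):
    """Return a typed value for int/float/date/dateTime, else the text."""
    s = text.strip()
    if _INT.fullmatch(s):
        return int(s)
    if _FLOAT.fullmatch(s):
        return float(s)
    # Try the narrower type (date) before the wider one (dateTime).
    for parse in (date.fromisoformat, datetime.fromisoformat):
        try:
            return parse(s)
        except ValueError:
            pass
    return text  # not recognisably typed: hand back the original string
```

A parser built this way would decode its input with 
raw.decode(detect_encoding(raw)); Python's default strict error handling 
then raises UnicodeDecodeError on an invalid byte sequence, which 
matches the "error on invalid sequence" idea.  duck_type would be called 
on character data to produce the value for the additional typed event, 
with the untyped text still reported as usual.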

Note: I don't really intend to start a whole new round of discussion on 
this list, although that may be inevitable.  I'm really hoping a few 
folks want to work out a small manifesto, figure out the implications 
for users, tools and documents, and build some proof-of-concept 
software - I'm going to go away and work on my SXML parser now :)

-Mike
Follow-Ups:
- Re: [xml-dev] choosing sides
  - From: Dave Pawson <davep@dpawson.co.uk>