Re: [xml-dev] choosing sides

In a message dated 12/12/2010 7:23:18 A.M. Eastern Standard Time, sokolov@ifactory.com writes:

Hearkening back to Elliotte's proposal about forming a group to discuss
details off-list - did that already happen? If so, and its focus aligns
more-or-less with the ideas that got generated over the last week or
two, I'd be interested in participating. If there's no cabal yet, I
think it's time to form up (at least one) side, like Amy said.

Here's my summary of outstanding ideas, possibly filtered by my
selective memory, with my own perspective.
I think this list may have some similarity to Pete's blend:
http://codalogic.com/xmllite/xmllite.html, but perhaps a bit more
concern for XML 1.0 compatibility? I guess this would be "Mike's mix" -
not my ideas, mostly, but my wish list.

For the moment I'll call the new XML "SXML" ("simpler" XML? "super" XML?):

1) Define a stance on compatibility

XML 1.0 guarantee - every well-formed XML 1.0 document encoded in
UTF-8/16 is a well-formed SXML document. I think that's do-able,
even with the proposed changes?

However, the converse wouldn't be true. SXML is looser; it includes
more documents.

Can we support a statement like: every SXML document can be
represented by an "equivalent" XML 1.0 document - the data model is
essentially the same? This wouldn't be a perfect round-trip guarantee:
you might lose prefix mappings, duck-typing and other new features; just
some kind of translatability guarantee - details to be worked out to see
if there is a meaningful guarantee that can be had :)

I suppose another stance that could work is: parsers can support all
of XML 1.0, XMLNS 1.1, AND SXML - we'd have to design SXML so there
aren't any outright conflicts. But it could be a reasonable thing to
create an SXML parser that lacks support for some XML 1.0 features.
Maybe there's a "profile" defined in the document itself, as has been
suggested.

2) New Features

- new (like Kay-style hierarchic) namespaces - I'm sure there will
be all kinds of interesting discussion about how this could work out :)

- looser handling of prolog (allow whitespace)

- Ignore DOCTYPE (internal DTD subset is parsed and preserved (?) for
re-serialization purposes only) - does SAX have an event for this??
ignorableWhitespace maybe? Not sure how this would play out
elsewhere?

- treat XML decl as a PI, but also:
warn about incompatible character set?; assume UTF-8 and
error when invalid UTF-8 sequence encountered

- duck-typing (provide additional event w/typed values) - this seems
do-able to me for int, date, dateTime and float. Possibly even ID for
anything else that's unquoted?

- built-in entity set (ISO right?)

- allow nested comments; no requirement for well-formedness inside
(use existing syntax) - <xml:comment> is another option.

- I think CDATA needs to stay for compatibility, but maybe there's a
SXML-only mode that ignores this?

- multiple root elements or documents in a single file

- UTF-8 and UTF-16 autodetected based on BOM (no BOM -> UTF-8)

- looser handling of ampersand - does it really need to be an error
to have &foo & &bar; ?

- also: undefined entities could be allowed and left unprocessed

- all whitespace preserved by default (even CRLF, though parsers can be
configured to normalize it)

- end-tag minimization; using </>? possibly only on leaves? <//> for
close-all-elements? I don't actually like this last one much, but
someone did mention Lisp's close-all bracket: ], and this syntax just
sprang into my head...
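
To make three of the items above a bit more concrete (BOM autodetection,
looser ampersands, duck-typing), here is a rough Python sketch. Nothing
here comes from an existing SXML parser: the function names
(detect_encoding, escape_stray_ampersands, duck_type) are made up for
illustration, the entity-name pattern is only an approximation of the
XML 1.0 Name production, and the set of recognised types is just the
int/float/date/dateTime wish list from above.

```python
import codecs
import re
from datetime import date, datetime


def detect_encoding(data: bytes) -> str:
    """Pick a Python codec from the BOM; no BOM is taken to mean UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec reads the BOM to pick the byte order
    return "utf-8"


# An '&' that does NOT start something shaped like &name; &#123; or &#x7B;
# (assumption: a simplified stand-in for the XML 1.0 Name production).
_STRAY_AMP = re.compile(
    r"&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_stray_ampersands(text: str) -> str:
    """Rewrite loose '&' as '&amp;'; leave reference-shaped text untouched,
    including references to entities that were never defined."""
    return _STRAY_AMP.sub("&amp;", text)


_INT = re.compile(r"[+-]?[0-9]+")
_FLOAT = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")


def duck_type(value: str):
    """Guess int, float, date or dateTime; otherwise return the string."""
    if _INT.fullmatch(value):
        return int(value)
    if _FLOAT.fullmatch(value):
        return float(value)
    # Try the date-only form first so "2010-12-12" does not become a dateTime.
    for convert in (date.fromisoformat, datetime.fromisoformat):
        try:
            return convert(value)
        except ValueError:
            pass
    return value
```

With this, escape_stray_ampersands("&foo & &bar;") gives
"&amp;foo &amp; &bar;", which is also one way the "equivalent XML 1.0
document" idea from point 1 could work for loose ampersands: the looser
input is mapped to text that an XML 1.0 parser would accept, provided
bar is declared somewhere.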

Note: I don't really intend to start a whole new round of discussion on
this list, although that may be inevitable, but I'm really hoping a few
folks want to work out a small manifesto, figure out the implications
for users and tools and documents, and build some proof-of-concept
software - I'm going to go away and work on my SXML parser now :)

-Mike

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php