RE: Syntax Sugar and XML information models


> -----Original Message-----
> From: John Aldridge [
> Sent: Thursday, March 29, 2001 5:35 AM
> To: xml-dev
> Subject: RE: Syntax Sugar and XML information models

> (b) Editors all write a _standard_ normal form (i.e. not just
> a normal form  of their own choosing)

This is more or less what I was hoping we could collectively define, and
"standard normal form" sounds a lot better than "Syntax Sugar Information
Set."  And to answer Rick Jelliffe's question, I agree that the W3C InfoSet is
a reasonable model for what people care about when navigating or transforming
a document, but we need a richer model for editors and databases. (These are two
sides of the same coin, since a database must round-trip whatever is significant
to an editor, and an editor must preserve whatever is significant to a database.)
But I'm not sure I agree that "that means you are *not* interested in the
information set of the document, but the actual text of the document's entities.
That is a fine thing. Let there be element-based (infoset) editors and
entity-based (tag-aware) editors".  Databases (and arguably editors) *should* be
interested in the information set of a document rather than just the bytes that
make it up, but they need a richer information set than the W3C InfoSet.

I'm hoping to find a middle ground between "editors and databases must simply
round-trip the (core) infoset" and "editors and databases must round-trip every
single character".  My first cut at this is that the "standard normal form" is
Canonical XML + external entity references + CDATA sections ... 

I'm sure there is more.
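
To make the gap concrete, here is a rough sketch (not a proposal for the normal form itself) of what plain canonicalization discards. It assumes Python 3.8 or later, whose standard library provides `xml.etree.ElementTree.canonicalize`; that function implements C14N 2.0 rather than the original Canonical XML, but it treats CDATA sections and attribute order the same way. The sample document and variable names are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical sample: attributes in author order, text inside a CDATA section.
source = '<doc b="2" a="1"><![CDATA[x < y]]></doc>'

# Canonicalization keeps the infoset-level content but drops the
# "syntax sugar": the CDATA section becomes ordinary escaped text and
# the attributes are emitted in sorted order.
canonical = canonicalize(source)

print("source:   ", source)
print("canonical:", canonical)

# What an editor or database would have to record on top of the
# canonical form if CDATA sections are deemed significant.
cdata_lost = "<![CDATA[" in source and "<![CDATA[" not in canonical
print("CDATA section lost:", cdata_lost)
```

An editor that writes only the canonical form therefore cannot give the author back the CDATA section, which is why the "standard normal form" sketched above has to add such items explicitly.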

As for the order of attributes, doesn't XML 1.0 specifically declare this to be