OASIS Mailing List Archives

   Re: [xml-dev] The subsetting has begun


Elliotte Rusty Harold scripsit:

> If you think any one data model is going to suffice, 
> you're kidding yourself. All they have in common is XML syntax (and 
> not always that since the infoset and DOM can both create 
> non-well-formed documents)

Distinguo.  The Infoset can't "create" anything; rather, infosets are created
from documents.  There is no notion of creating or modifying anything in the
infoset.  It is not a data model in the sense of DOM/JDOM/XOM, despite the
superficial similarity:  it is a minimally abstract representation of
UnicodeWithAngleBrackets syntax, where only silly distinctions are thrown away
(along with most DTD information, omitted from the list below):

   4. White space outside the document element.
   5. White space immediately following the target name of a PI.
   6. Whether characters are represented by character references.
   7. The difference between the two forms of an empty element: <foo/>
      and <foo></foo>.
   8. White space within start-tags (other than significant white space
      in attribute values) and end-tags.
   9. The difference between CR, CR-LF, and LF line termination.
  10. The order of attributes within a start-tag.
  17. The kind of quotation marks (single or double) used to quote
      attribute values.
  18. The boundaries of general parsed entities.
  19. The boundaries of CDATA marked sections.

If your XML application, Walter, depends on any of these facts (as a decent
XML editor does, for example), then an Infoset representation of the XML
document will not serve you.  But otherwise, the syntax and the Infoset are
indeed twins.
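
A rough illustration of the point, not a definition of the Infoset: the sketch
below uses Canonical XML 2.0 as a stand-in, via canonicalize() from Python's
xml.etree.ElementTree (Python 3.8 or later).  Canonicalization is a different
spec from the Infoset, so treat this only as a proxy.  The two sample documents
are made up for the example and differ only in some of the distinctions listed
above (6, 7, 8, 10, 17, 19, plus trailing white space per 4).

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical sample documents: same content, different surface syntax.
# doc_a uses single quotes, b-before-a attribute order, <empty/>, a character
# reference, a CDATA section, and a trailing newline after the root element.
doc_a = "<doc b='2' a=\"1\"><empty/><t>&#65;<![CDATA[<x>]]></t></doc>\n"

# doc_b uses double quotes, a-before-b order, <empty></empty>, literal and
# escaped characters, and extra white space inside the tags.
doc_b = '<doc  a="1"   b="2" ><empty></empty><t>A&lt;x&gt;</t ></doc>'

# Canonicalization discards these lexical differences.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

print(canon_a)
print(canon_a == canon_b)
```

If the two canonical forms compare equal, nothing your application can see
through such a representation distinguishes the documents; an editor that must
write the file back as the user typed it needs more than this.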

-- 
First known example of political correctness:   John Cowan
"After Nurhachi had united all the other        http://www.reutershealth.com
Jurchen tribes under the leadership of the      http://www.ccil.org/~cowan
Manchus, his successor Abahai (1592-1643)       jcowan@reutershealth.com
issued an order that the name Jurchen should       --S. Robert Ramsey,
be banned, and from then on, they were all         _The Languages of China_
to be called Manchus."




 


Copyright 2001 XML.org. This site is hosted by OASIS