
> -----Original Message-----
> From: Bob Foster [mailto:bob@objfac.com] 
> Sent: Sunday, April 10, 2005 16:45
> To: Amelia A Lewis
> Cc: xml-dev@lists.xml.org
> Subject: Re: [xml-dev] Non-infoset
> Amelia A Lewis wrote:
> > On 2005-04-10 15:25:51 -0400 Alessandro Triglia <sandro@mclink.it> wrote:
> >> - the nature and amount of whitespace inside tags;
> > 
> > What?
> > 
> > <p>This sentence contains <em>emphasized</em> text.</p>
> > 
> > Has important characteristics in the whitespace.
> Yes, but there isn't anything important about the whitespace 
> within tags:
> <p   >These tags contain <em>superfluous</em> whitespace.</p  >
> I agree with the rest of your comments.
> Even whitespace within tags has its uses, e.g., to break up very long lines where adding whitespace outside tags would change the, uh, infoset.
> I personally think the distinction comes down to hand-authored XML vs. program-generated XML. The latter would mostly be satisfied with exchanging infosets. The former, mostly not.
> Here are the things human authors _need_ that aren't in the Infoset:
> 1) A way to include other documents.
> 2) A way to specify characters that aren't directly supported by the 
> author's editor.
> 3) Comments. Documents are read by other humans, too.

Comments are part of the infoset.

> 4) Validation. (There's no way one could ever produce a DocBook document if there weren't some automatic way to check it.)
> I'm kind of amused by Alessandro's harping on the need for a choice of attribute value delimiters. Ok, you're about to write a value that contains the " character. There are two ways to do it: Delimit the value with ' characters (that choice we don't need) or escape the value with &quot; (those entities we don't need).

I have never said that you can create an XML 1.0 document without using predefined entities, apostrophes, or numeric character references.

My question was: How relevant is that choice to the creator (does it convey any information you intend to convey)? In other words, would you be happy if that choice were taken out of your control (say, if it were made automatically and rigidly by some tool)?

(Maybe you would not be happy, but how many XML creators would be? -- especially those that are **programs**?)
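To make the delimiter question concrete, here is a minimal sketch. It assumes Python's standard xml.etree.ElementTree as a stand-in for an infoset-level consumer (it is not a complete Infoset implementation), and the element name "a" and attribute name "title" are made up for the illustration. The point is only that both spellings hand the application the same attribute value:

```python
import xml.etree.ElementTree as ET

# Two serializations of the same attribute value: say "hi"
# (the element and attribute names are hypothetical)
double_quoted = '<a title="say &quot;hi&quot;"/>'  # " delimiters, &quot; escapes
single_quoted = "<a title='say \"hi\"'/>"          # ' delimiters, no escapes

value_a = ET.fromstring(double_quoted).get("title")
value_b = ET.fromstring(single_quoted).get("title")

# The parser reports no difference between the two choices
print(value_a == value_b)
```

Whichever choice the creator (or a tool) made, it is gone by the time the application sees the value.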

I'll reword my question about general entities as follows: Suppose you included a general entity reference somewhere (say, as shorthand for a long phrase defined as an entity). Does the presence of that general entity reference in the document (as opposed to the expansion of the entity reference) convey any relevant information that you intend to convey? In other words, would you be happy if the expansion of a general entity were taken out of your control (say, if it were done automatically and rigidly by some tool)?
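The entity case can be sketched the same way. Again this assumes Python's xml.etree.ElementTree, whose underlying parser expands entities declared in the internal DTD subset; the entity name "phrase" and the document element "doc" are hypothetical:

```python
import xml.etree.ElementTree as ET

# A document that uses a general entity as shorthand for a long phrase
with_entity = (
    '<!DOCTYPE doc [<!ENTITY phrase "a rather long phrase">]>'
    "<doc>&phrase;</doc>"
)
# The same document with the expansion written out by hand
expanded = "<doc>a rather long phrase</doc>"

text_a = ET.fromstring(with_entity).text
text_b = ET.fromstring(expanded).text

# The application cannot tell which form the creator wrote
print(text_a == text_b)
```

A creator who needs the reference itself preserved loses something here; a creator who only cares about the resulting text does not.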

What I am trying to determine is to what degree those "choices" of XML 1.0 are relevant to a creator **in that they convey information the creator intends to convey**.  (I am not suggesting in any way that XML 1.0 does not need those choices!)  If they are irrelevant to a sufficient number of creators, in a sufficient variety of situations, then the infoset is important.

The existence of a large number of people who need to see entity references unexpanded, need to see CDATA sections as such, need to play with whitespace inside tags, etc., does not prove that the infoset is not important -- unless they were the vast majority of creators, which I seriously doubt.
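For completeness, a rough sketch of the other two cases mentioned above, whitespace inside tags and CDATA sections, under the same assumption (Python's xml.etree.ElementTree standing in for an infoset-level consumer). The helper shape() and the element name "d" are hypothetical; the two <p> examples are the ones quoted earlier in this thread:

```python
import xml.etree.ElementTree as ET

def shape(xml_text):
    """Return what the parser reports for each element, in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.attrib, el.text, el.tail) for el in root.iter()]

# Whitespace inside tags
plain = "<p>These tags contain <em>superfluous</em> whitespace.</p>"
padded = "<p   >These tags contain <em>superfluous</em> whitespace.</p  >"
same_tags = shape(plain) == shape(padded)

# CDATA section versus escaped character data
cdata = "<d><![CDATA[a < b]]></d>"
escaped = "<d>a &lt; b</d>"
same_text = ET.fromstring(cdata).text == ET.fromstring(escaped).text

print(same_tags, same_text)
```

People who need those distinctions preserved need something below the level this API exposes; everyone else gets identical data from either form.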


> Not a problem for programs, but there are still a few of us humans out there, scribbling away.
> Bob Foster

