xml-dev - RE: [xml-dev] What is the rule for parsing XML in a namespace inside HTML?

To: "Joshua Allen" <joshuaa@microsoft.com>
Subject: RE: [xml-dev] What is the rule for parsing XML in a namespace inside HTML?
From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Wed, 14 Jul 2004 16:41:18 -0400
Cc: "XML Developers List" <xml-dev@lists.xml.org>
In-reply-to: <0E36FD96D96FCA4AA8E8F2D199320E52025FA5EB@RED-MSG-43.redmond.corp.microsoft.com>
References: <0E36FD96D96FCA4AA8E8F2D199320E52025FA5EB@RED-MSG-43.redmond.corp.microsoft.com>

At 10:06 PM -0700 7/13/04, Joshua Allen wrote:

>Well, the "enforced by the browser" is what I'm having trouble with.
>Most XML is not intended to be processed by a web browser.  HTML is for
>rendering in user-agents, XML is for processing by some data interchange
>program without even a UI, importing into a contacts database, consuming
>in a news aggregator, etc.  I think it would be overkill to expect a web
>browser to enforce my PurchaseOrder schema just as it would be overkill
>to expect the Biztalk app to enforce HTML rules on a payload.

You persist in seeing this as two different things, which is twice as
much work. People want to read web pages in their browsers. They
also want to be able to process them with off-the-shelf and custom
tools. It's very useful to import web pages into contacts databases,
consume web pages in a news aggregator, and more. There's no reason
web pages should be limited to human browsing exclusively. Machine
processing is greatly facilitated if the web pages are well-formed.
Validity is not required, though. I agree there's no reason for the
browser to enforce some purchase order schema. That doesn't mean the
purchase order document, in either XML or XHTML, shouldn't be
well-formed.
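
As a rough illustration of the well-formedness point, here is a
minimal Python sketch using the standard library's
xml.etree.ElementTree. The page content, the "contact" class name,
and the helper names contact_names and is_well_formed are all
hypothetical, made up for this example. No schema or DTD is
consulted anywhere; the parser only checks well-formedness.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page: well-formed XHTML that a browser can render
# and an ordinary XML parser can read.
WELL_FORMED = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Contacts</title></head>
  <body>
    <ul>
      <li class="contact">Alice Example</li>
      <li class="contact">Bob Example</li>
    </ul>
  </body>
</html>"""

# The same page as tag soup: the <li> elements are never closed.
TAG_SOUP = WELL_FORMED.replace("</li>", "")


def contact_names(page):
    """Return the text of every <li class="contact"> in the page."""
    root = ET.fromstring(page)  # checks well-formedness only, not validity
    return [
        li.text
        for li in root.iter("{%s}li" % XHTML_NS)
        if li.get("class") == "contact"
    ]


def is_well_formed(page):
    """Return True if an XML parser accepts the page."""
    try:
        ET.fromstring(page)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(contact_names(WELL_FORMED))
    print(is_well_formed(TAG_SOUP))
```

The well-formed page yields its contact names directly, while the
tag-soup version is rejected by the parser before any data can be
pulled out of it.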

>In cases where you have something like VML or SVG which actually *is*
>intended to be rendered in a user agent, then I agree.  (And I realize
>this is your specific case -- I am just arguing against the general
>case)  But I don't see why an actual HTML *envelope* would have to be
><?xml...?> in order to embed payloads that were intended to be pulled
>out and parsed with an XML processor.  I think something like the <xml>
>tag hack that IE uses is just fine, and it would be ideal if all XML
>payloads such as SVG and VML are embedded in HTML 4.x using this
>convention.  You could additionally stipulate that the browser should
>enforce wellformedness inside the <xml> tags.  That would be a good
>convention and would deserve support, IMO.

The problem here is you're assuming only some of the content should 
be made accessible to XML parsers and machines. It's far more useful 
to make all of it accessible. Don't place arbitrary limitations on 
what the machines can consume. Don't require page authors to provide 
the same information twice, once for humans and once for machines. 
Let both the humans and machines consume the same data.
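
To make that concrete, here is a small sketch, again in Python with
xml.etree.ElementTree, assuming a hypothetical XHTML page with an
SVG fragment embedded by namespace. The page content and the "h" and
"svg" prefixes in the lookup table are invented for the example. A
single parse of the one document reaches both the text meant for
human readers and the embedded payload.

```python
import xml.etree.ElementTree as ET

# Prefixes used only for lookups in this sketch; they need not match
# the prefixes used in the document itself.
NS = {
    "h": "http://www.w3.org/1999/xhtml",
    "svg": "http://www.w3.org/2000/svg",
}

# Hypothetical page: one well-formed document holding both XHTML
# content and an SVG payload in its own namespace.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Order status</title></head>
  <body>
    <p>Your order has shipped.</p>
    <svg:svg xmlns:svg="http://www.w3.org/2000/svg" width="20" height="20">
      <svg:circle cx="10" cy="10" r="8"/>
    </svg:svg>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Content written for human readers...
title = root.findtext("h:head/h:title", namespaces=NS)
paragraphs = [p.text for p in root.findall(".//h:p", NS)]

# ...and the embedded payload, both taken from the same parse.
circles = [dict(c.attrib) for c in root.findall(".//svg:circle", NS)]

if __name__ == "__main__":
    print(title, paragraphs, circles)
```

Nothing in the page is duplicated or fenced off for machines; the
same markup serves both audiences.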

-- 

   Elliotte Rusty Harold
   elharo@metalab.unc.edu
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
