
   Re: [xml-dev] rss regularis(z)ation


bryan wrote:

> One of the things I would want to use namespaces for is to return
> namespaced html instead of as you pointed out " the bizarre practice of
> CDATA-escaping random HTML-ish text " but this is only starting to be
> done now, why was it not done in earlier versions? What were the excuses
> for the bizarre practice

I agree that it's bizarre and offensive, but these people are not 
completely nuts.  Think of it from the point of view of the aggregator 
writer.  They want to parse an RSS feed as XML, and they want to parse 
each entry to get the <title> and <author> and <link> and so on.  Then 
they get to the content.  They have an HTML renderer which will render 
this prettily.  So they want to take all the bytes between <content> and
</content> (those are Atom tags, not RSS tags, but same difference), and 
hand them to the HTML renderer.  They don't want to parse them, because 
they'd just be doing a no-op and putting them back together again to 
hand them to the renderer.
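
To make the aggregator writer's view concrete, here is a rough sketch in Python. The feed is a made-up, simplified one (no namespaces, hypothetical element layout), and read_entries is a name invented for illustration, not any real aggregator's API. The point is only that the XML parser hands back the escaped content as one opaque string, ready to pass to a renderer unparsed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified feed: the content is escaped HTML, and is not
# even well-formed as markup (note the unclosed <br>).
FEED = """<feed>
  <entry>
    <title>Hello</title>
    <link>http://example.org/1</link>
    <content>&lt;p&gt;Some &lt;b&gt;bold&lt;/b&gt; text&lt;br&gt;</content>
  </entry>
</feed>"""

def read_entries(feed_xml):
    """Parse only the fields the aggregator cares about as XML."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iter("entry"):
        entries.append({
            "title": entry.findtext("title"),
            "link": entry.findtext("link"),
            # The parser unescapes the text; the HTML inside is never parsed.
            "html": entry.findtext("content"),
        })
    return entries

entries = read_entries(FEED)
```

The "html" value is what would go straight to the HTML renderer, tag soup and all.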

On the producer's side, a lot of the authoring tools give authors a lot 
of freedom in whatever editing tool they like, and to enforce that this 
be XHTML is a lot of extra work that's not done yet.

So both the producers *and* the consumers are happier using this 
horrible escaped-HTML stuff.  I and several others have told them that 
they shouldn't want to do this, but it doesn't seem to work.

As several others have pointed out, if the content were well-formed they 
could do XPath magic, and filter out dangerous things like <script>, and 
bask in the glow of karmic goodness.  In response they say "I don't want 
to do XPath magic, and my HTML renderer has a safe-sandbox mode, and I 
just want the stuff I care about (<title>, <link>, remember) in XML and 
the rest is a bag of bits, so extend me no markup."
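
For comparison, a minimal sketch of the filtering that well-formed content would allow, assuming the content is an XHTML div inside a hypothetical content element. It uses Python's ElementTree, which supports only a limited XPath subset; strip_elements is a helper invented for this example, not a library function.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical well-formed content carrying an unwanted <script>.
CONTENT = (
    '<content>'
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<p>Hello</p><script>alert(1)</script> and <p>World</p>'
    '</div></content>'
)

def strip_elements(root, tag):
    """Remove every element named `tag`, keeping the text that follows it."""
    # ElementTree has no parent pointers, so walk each potential parent.
    for parent in list(root.iter()):
        prev = None
        for child in list(parent):
            if child.tag == tag:
                if child.tail:
                    # Re-attach trailing text to the previous sibling,
                    # or to the parent if there is no previous sibling.
                    if prev is None:
                        parent.text = (parent.text or "") + child.tail
                    else:
                        prev.tail = (prev.tail or "") + child.tail
                parent.remove(child)
            else:
                prev = child

root = ET.fromstring(CONTENT)
strip_elements(root, "{%s}script" % XHTML)
remaining_scripts = root.findall(".//{%s}script" % XHTML)
paragraphs = root.findall(".//{%s}p" % XHTML)
```

None of this works on the escaped-HTML form, where the parser sees only a string.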

Realistically, I think we're stuck with it.  At least Atom will *let* 
you make the content well-formed.  Then evolution takes over.
-- 
Cheers, Tim Bray
         (ongoing fragmented essay: http://www.tbray.org/ongoing/)






 


Copyright 2001 XML.org. This site is hosted by OASIS