
   RE: [xml-dev] best practice for providing newsfeeds ?



>-----Original Message-----
>From: Bob Wyman [mailto:bob@wyman.us] 
>Sent: Monday, February 02, 2004 2:01 PM
>To: Joshua Allen; 'Michael Champion'; 'XML DEV'
>Subject: RE: [xml-dev] best practice for providing newsfeeds ?
>
>Joshua Allen wrote:
>> OK, is it fair to paraphrase this argument as "RSS works fine today, 
>> but Atom will enable you to be more flexible in seizing new 
>> opportunities tomorrow?"
>	No. That's not fair. RSS does *not* work fine today.

Interesting, it works fine for me and most of the folks I know at work
that use news aggregators. 

> It 
>is a mess and we pay for that mess in every tool that reads RSS.

That I agree with. Unfortunately, adding yet another format to the mix,
with its own set of arbitrary complexity, doesn't seem like the best way
to solve the problem.

>	1. Virtually every RSS reader actually reads at least 
>three different flavors of RSS and often as many as seven. All 
>of these various formats are underdefined in one way or 
>another. Agreement on what their elements mean is only 
>approximated through a process of voluminous and expensive 
>back-channel communications. (Note: The RDF guys *think* that 
>their formats are well defined, however, the reality is that 
>any format that relies on random namespaced extensions can 
>hardly be called "defined"...)

A.) Every aggregator will now have to support 3 to 7 flavors of RSS plus
ATOM. Doesn't seem like a step forward to me. 

B.) The ATOM folks rely on arbitrary link tags to support extensibility,
I'd hardly call that "defined" either.

C.) As for being underdefined, that tradition continues in the various
draft ATOM specs I've seen, which is why I haven't implemented it in the
application I work on in my free time.  

>	2. Common forms of RSS are missing the ability to 
>express important concepts. For instance, RSS only has a 
>"pubDate" field to say when an item was created. However, when 
>people update items, they typically keep the old pubDate value 
>in order to maintain the item's order in their blogs. The 
>problem, of course, is that this means that "date" can't be 
>used to imply "age" or even "sequence" in any useful way since 
>entries being modified today still carry the date of their 
>original creation. (Atom defines both "created" date and "issued"
>date. This allows the distinction to be made.)

Most aggregators have certain heuristics that they use to determine if
an item has changed. Having a well-defined way to specify whether a post
has been updated is useful, although how that translates into attaching
three dates to an entry instead of two is beyond me. 
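
A minimal sketch of one such heuristic, to make the idea concrete. It
assumes items are keyed by guid (falling back to link) and that a change
in title or description counts as an update; the function names are
hypothetical and this is not how any particular aggregator is known to
do it.

```python
import hashlib
import xml.etree.ElementTree as ET


def item_fingerprints(rss_xml):
    """Map each item's key (guid, else link) to a hash of its title and description."""
    root = ET.fromstring(rss_xml)
    prints = {}
    for item in root.iter("item"):
        key = item.findtext("guid") or item.findtext("link")
        if key is None:
            continue  # nothing stable to key on; skip the item
        body = (item.findtext("title") or "") + "\n" + (item.findtext("description") or "")
        prints[key] = hashlib.sha1(body.encode("utf-8")).hexdigest()
    return prints


def changed_items(old, new):
    """Return keys present in both snapshots whose content hash differs."""
    return sorted(k for k, h in new.items() if k in old and old[k] != h)
```

Because the comparison ignores pubDate entirely, an entry edited today
that still carries its original date is flagged as changed.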

>	3. There are very few commonly shared conventions for 
>encapsulating HTML content in RSS feeds. Sometimes you get a 
>CDATA, sometimes the tag soup is just inserted into an 
>element, sometimes, it comes with enclosing <html> tags, 
>sometimes, it doesn't. Sometimes, it is split up into two 
>elements (a <description> and a <content> element...).
>	I could go on... but that would be depressing.
>	On my site, we constantly process data from over 1 
>million RSS feeds (Yes, we've grown a bit in the last 
>week...) and what we see is a cesspool of badly formed data. It 
>is "just" barely clean enough to do minimally useful work with 
>it, but that isn't saying very much. 

If I (whose primary day job is more paper pusher than developer) can
figure out the two primary ways HTML content is encapsulated in RSS
feeds [escaped in description or content:encoded elements, or unescaped
in xhtml:body elements], I don't see why others complain about it as if
it were rocket science. On the other hand, the claim that the
combination of specifying MIME types and escaping modes in ATOM somehow
makes this easier is dubious at best.
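
The two cases above can be sketched roughly as follows. This is a
minimal illustration, not a complete reader: the helper name item_html
is hypothetical, the order of preference (xhtml:body, then
content:encoded, then description) is an assumption, and real feeds
will need more defensive handling.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified tag names in ElementTree's {uri}local notation.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"
XHTML_BODY = "{http://www.w3.org/1999/xhtml}body"


def item_html(item):
    """Return the HTML carried by an RSS <item> element, or "" if none is found."""
    # Case 1: unescaped markup, carried inline as an xhtml:body element.
    body = item.find(XHTML_BODY)
    if body is not None:
        return ET.tostring(body, encoding="unicode")
    # Case 2: escaped markup (entity-escaped or CDATA); the XML parser
    # has already undone the escaping, so the element text is the HTML.
    for tag in (CONTENT_ENCODED, "description"):
        text = item.findtext(tag)
        if text:
            return text
    return ""
```

Entity escaping and CDATA need no separate branches here, since both
reach the caller as plain element text once the feed has been parsed.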

>	RSS does *not* work today. The best you can say is that 
>RSS gives us a clear hint of what a "working" system might 
>look like. Atom is intended to turn that hint into a reality.

Technorati, Feedster, and the thousands of people using news aggregators
today belie your claim.

--
PITHY WORDS OF WISDOM 
The heaviest burden a man can carry is a chip on his shoulder.


This posting is provided "AS IS" with no warranties, and confers no
rights.  




 
