xml-dev - Re: [xml-dev] RE: evolvable formats

To: Mike Champion <mc@xegesis.org>, xml-dev@lists.xml.org, "'Mark Pilgrim'" <f8dy@diveintomark.org>
Subject: Re: [xml-dev] RE: evolvable formats
From: Joe Gregorio <joe@bitworking.org>
Date: Thu, 10 Oct 2002 13:07:05 -0400
References: <MOLVUVTC7MGELIF0RM3YVTXSHG725Z.3da583c2@MChamp>
User-agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.2a) Gecko/20020910

Mike Champion wrote:
> Referring generally to http://lists.xml.org/archives/xml-dev/200210/msg00583.html
> but not quoting at length.
> 
> I am not an RSS weenie, but I am very interested in it as a case study
> in "evolvable formats", i.e., how real users deal with the cruel reality
> that application-level XML standards are produced slowly, by political
> and interpersonal processes that seldom yield fully satisfactory
> results, and which are obsolete the day they are cast in a schema 
> and/or given a namespace :-).  XML doesn't fix human frailty, it just
> reduces the technological overhead that hid human issues behind 
> syntax and interoperability problems (e.g., EDI standards AFAIK).
> 

> FWIW, I take away the following lessons from the RSS 0.9x/1.0/2.0/3.0/etc.
> experience (which again I'm happy not to have lived through), and
> would appreciate some responses from people who lived through it
> firsthand.

I am not an old-timer at RSS myself. I missed the first great RSS
war, but I did see the second, and I have published an RSS reader:
http://bitworking.org/Aggie.html
Working on it has been quite an eye-opening experience.

> 
> 1 - Politics happens, Evolution is continuous, deal with it.  
>   With technology, as best you can.  Don't make technology choices
>   that are fragile in the face of human nature.

I would add: "Don't make technology choices
that are fragile in the face of the currently available
toolsets." For example, using RDF in RSS 1.0.

> 
> 2 - Namespaces - work best for mixing instances of well-defined
>   vocabularies/schemas together, they don't work so well to support
>   evolution or un-typed XML. Schema evolution using namespaces is
>   a Known to Be Hard, TAG-level problem.

I'd generalize your observation here to encompass not just
schemas but all types of validation. Validation seems
anathema to evolvability.

> 
>   If you want to leverage commonly deployed code that understands
>   a specific namespace (XHTML, SVG, etc.), the full-blown Namespaces
>   in XML is your friend, well Real Soon Now anyway.  If you just
>   want to disambiguate tags, it has lots of little gotchas
>   (that "RSS 2.0" seems to have been gotten by!) that make it a 
>   challenge for people who don't grok its subtleties. (MOST OF
>   THE REAL WORLD!!!)

I wouldn't put too much weight on the 'problems' with namespaces
in the RSS 2.0 rollout. Only two home-grown aggregators were known
to break. I believe that if you wanted to intentionally design a
format today that was 'evolvable', then namespaces would be the
way to go and would not cause problems.
Namespaces aren't overkill for the tools available today,
but I do think they were back when RSS 1.0 was published.
This gets back to the 'tools' issue.

> 
> 3 - If you don't know exactly what you're dealing with, heuristics
>   beat logic.  If the tag is  <table>  and it has
>   HTML table elements inside it, it's probably an HTML table!  Don't
>   throw it away because it's in the wrong namespace.

I'd say that "heuristics beat validation".
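
To make that concrete, here is a minimal sketch of the heuristic
approach: match elements by local name and ignore whatever namespace
they arrive in. It is an illustration only, not code from Aggie or any
other aggregator; the function names local_name and item_titles are
hypothetical, and it assumes the feed is at least well-formed XML.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix, leaving the bare tag name."""
    return tag.rsplit("}", 1)[-1]


def item_titles(feed_xml):
    """Collect item titles by local name, whatever namespace they are in."""
    titles = []
    for elem in ET.fromstring(feed_xml).iter():
        if local_name(elem.tag) != "item":
            continue
        for child in elem:
            if local_name(child.tag) == "title":
                titles.append((child.text or "").strip())
    return titles
```

The same function then reads an un-namespaced RSS 0.9x-style feed and
an RSS 1.0-style feed whose elements live in a namespace, instead of
rejecting one of them.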

> 
> I guess this is more of a question:
> 
> 6 - Why on earth would one even THINK about using entity-encoded
> non-well-formed HTML in a syndication format???  Use the HTML
> tags, but close them!  Use tidy to clean up the junk you get
> from your users!  Why fool with any alternative?  Even if you're
> taking the advice in point 5, just "escape it" with an HTML: line label
> or whatever.  Someone downstream will thank you.

This gets into the social aspects of RSS as an 'evolvable' format.
Many of the feeds are produced by some home-grown CMS or are
even created by hand. This highlights the need for a
format to be as simple as possible.

The other aspect is that many people implementing RSS may not
have read the RSS spec (never mind the XML spec); they're just
using an example RSS file as boilerplate. Again, another 'tools'
issue. To paraphrase a conversation I had with another developer
about how he created his RSS feed:

"I thought to myself, I could do this the *right* way and use
the DOM API in my scripting language and have it take me an hour,
or I could just use printf and be done in 10 minutes. 
I did the printf thing, it's just a blog."
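
As a rough illustration of that trade-off (a sketch, not that
developer's actual code; the function names are hypothetical), the
snippet below builds the same item both ways. The string-formatting
version is shorter to write, but nothing escapes the text, so a title
containing a bare "&" produces a feed that is not well-formed. The
tree-based version, here using Python's xml.etree.ElementTree, escapes
it during serialization.

```python
import xml.etree.ElementTree as ET


def item_with_format(title):
    # The "printf" approach: quick, but the title is pasted in unescaped.
    return "<item><title>%s</title></item>" % title


def item_with_tree(title):
    # The tree approach: the serializer escapes &, < and > in text nodes.
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    return ET.tostring(item, encoding="unicode")


def is_well_formed(xml_text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

Calling is_well_formed on both outputs for a title such as
"Tools & Toys" shows the difference: only the tree-built item parses.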

RSS started out at Netscape, where they did validation
and used the format for business-critical services. But the
format appealed to bloggers and beyond, and as William Gibson
put it, "The street finds its own uses for things." These are
my observations from working with RSS. I am not saying that
what has happened was right or wrong, just pointing out what happens
when XML hits the street.

	-joe

--
http://bitworking.org

Follow-Ups:
- Re: [xml-dev] RE: evolvable formats
  - From: Uche Ogbuji <uche.ogbuji@fourthought.com>

References:
- Re: [xml-dev] RE: evolvable formats
  - From: Mike Champion <mc@xegesis.org>
