OASIS Mailing List Archives

 



 


 

   Re: [xml-dev] Postel's law, exceptions



On Jan 13, 2004, at 3:26 PM, Simon St.Laurent wrote:

> This has come up before, but the chant "Postel's law has no exceptions"
> seems to be coming again, in the RSS context.

Is this really about Postel's law, or is it pushback on the overly
draconian (in some opinions) XML spec?  Or is this a typical weblog/RSS
community bashathon because (ahem) one of the vocal proponents of
being conservative apparently wrote software that doesn't actually
conform to the XML spec?

> http://www.intertwingly.net/blog/1685.htm

OK, I'll bite and expose my lack of attention to the details -- why
are smart quotes illegal in XML?  (Or is it that the encoding is 
mis-specified?) Was this proposed to be fixed in XML 1.1?  Does anyone 
outside the RSS/Atom world complain about this?
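
To make the parenthetical guess concrete, here is a minimal sketch in Python of how a mis-specified encoding could turn smart quotes into a well-formedness error. It assumes (and this is an assumption, not something established in this thread) that the feed carries the quotes as Windows-1252 bytes 0x93/0x94 with no encoding declaration, so a conforming parser reads the bytes as UTF-8. The sample bytes and the helper name parse_strict are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: curly quotes stored as Windows-1252 bytes
# (0x93, 0x94), with no XML declaration naming an encoding.
raw = b"<title>\x93Postel\x94</title>"


def parse_strict(data):
    """Return the parsed element, or None if the input is not well-formed."""
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        return None


# Without a declaration the parser assumes UTF-8, and 0x93 is not a valid
# UTF-8 sequence on its own, so the strict parse is rejected.
strict_result = parse_strict(raw)

# Decoding with the encoding the author presumably meant yields U+201C and
# U+201D, which are ordinary legal characters in XML.
repaired = ET.fromstring(raw.decode("cp1252"))
```

Under that assumption the characters themselves are not the problem; the bytes just do not match the encoding the parser is entitled to assume.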

The larger issue seems to be Gresham's Law ("bad money drives out
good").  I know that as a *consumer* of aggregation tools, I don't care 
a whit whether the input is raw text, tag soup HTML, XHTML, valid 
instances of one of several flavors of RSS/Atom, or what -- I just want 
to quickly see what has changed on the set of web resources I'm
interested in that use some chronological layout convention (news, RSS 
feed, email archive, whatever).  I use the most liberal tools I can 
find.  If being liberal is inconvenient for the aggregator developer, 
I'll just find another aggregator or hack up something that does what I 
want.  The last thing on earth I want to do is whine at, for example,
some poor woman in Baghdad about the format of her weblog; I want to
hear what she has to say.  (FWIW, Bloglines apparently uses something 
called sitescooper to enable this).

That's not to say that the specs should condone violations; the whole 
point of Atom is to provide an authoritative spec that is built on real
standards such as XML and written in such a way as to allow it to be 
implemented from the spec itself rather than having to ask the 
community or a committee.  If the community of weblog software 
developers gets its act together and the number of ill-formed feeds 
becomes vanishingly small, great for everyone -- I can read anything I 
want in any product I choose. But that happy situation won't come about 
by market forces; it will come about by some sort of coercion (moral,
economic, legal, etc.). If there's real money that falls on the floor 
due to the chaos, someone will come along and clean it up, i.e. make 
Dodge City safe for the banks and railroads.

That might happen by coming up with a common standard and enforcing it; 
my guess, however, is that "text mining" technologies will make the
whole question moot by using smarter software rather than insisting on 
more rigid data.  (See IBM's immense investment in WebFountain, for
example.)  Real Soon Now we won't have to care whether text is raw
email, XHTML, valid Atom, or tag soup to consume it selectively in what 
we now call aggregators.  If software can tag data, or at least extract 
the implicit structure of data and emit it in a valid markup syntax, 
then not even the geekiest of markup or syndication geeks will care 
about missing closing tags or smart quotes anymore.  In other words, 
we'll see "Postel Machines" that liberally take tag soup and raw text, 
and emit conservatively structured data or valid markup to make life 
easier for the downstream processors.
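
For what it's worth, a toy version of such a "Postel Machine" can be sketched with nothing but the Python standard library. This is only an illustration under stated assumptions: the class name PostelMachine, the helper postel(), and the <feed> wrapper element are all made up here, and the sketch ignores real-world problems such as tag or attribute names that are not legal XML names, duplicate attributes, and character encodings.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class PostelMachine(HTMLParser):
    """Hypothetical sketch: liberal in what it reads, strict in what it emits."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the well-formed output
        self.stack = []  # elements that are currently open

    def handle_starttag(self, tag, attrs):
        # Always quote attribute values, even if the input did not.
        attr_text = "".join(
            f" {name}={quoteattr(value or '')}" for name, value in attrs
        )
        self.out.append(f"<{tag}{attr_text}>")
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray close tag with no matching open tag: drop it
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Escape bare "&", "<" and ">" so the text is legal XML content.
        self.out.append(escape(data))

    def result(self):
        self.close()  # flush any text the parser is still buffering
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")
        # Wrap in one root element (name is arbitrary) so the output is a
        # single well-formed document.
        return "<feed>" + "".join(self.out) + "</feed>"


def postel(soup):
    """Take tag soup as a string and return well-formed XML as a string."""
    machine = PostelMachine()
    machine.feed(soup)
    return machine.result()


soup = "<item><p>Fish & chips <b>today<a href=x>link</item><p>tail"
clean = postel(soup)
```

A strict XML parser rejects the soup string outright (bare ampersand, unquoted attribute, missing close tags) but accepts the value of clean, which is the point: the liberal step happens once, upstream, and everything downstream can stay draconian.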





 


Copyright 2001 XML.org. This site is hosted by OASIS