   RE: [xml-dev] Postel's "Law": A question for liberal parsers

Bob,

I post to this forum only infrequently, so please forgive me for butting 
in. I think some reactions to the entire thread bubbled over when I read 
your recent post.

Despite your protests, and my not knowing a thing about you or about most 
other participants in the discussion, I think there may be more in common 
between you and Walter Perry than you've admitted to so far. Both of you 
are mediators -- maybe even arbitrageurs, to borrow from Walter -- who seek 
to fill a niche (and perhaps make a living in return) by concentrating a 
body of knowledge at a point, leveraging what you know to provide for 
others something that is too costly, or too difficult, for them to obtain 
for themselves unaided. From what you've said, yours is knowledge about the 
feeds -- about where they are, and how they work -- which you aggregate and 
analyze for your customers. Your product is valuable to them because 
getting a combination of broad access (maintenance of which is a chore) 
with breakdowns and breakouts (the indexing you provide) would otherwise 
require both hours of labor and considerable expertise -- to say nothing 
of will or drive. Very valuable.

But difficult -- not least because of the "ownership" issue that troubles 
you. Who owns the feeds, who is responsible for them? How much 
responsibility do you take, as an intermediary? In fact, I think you have 
put your finger on the exact nub of the matter. This is a matter of 
Aristotelian warrants. Postel's Law is being invoked on the one hand, out 
of context -- as if by saying "be liberal" Postel were arguing that users 
of the protocol he is defining can go ahead and break the rules, what the 
heck, because being liberal and forgiving in what you accept is the rule. 
But Postel (as has been pointed out) doesn't fairly warrant this: his 
assumption is that users will be following the explicit rules. Only where 
the rules fail to be perfectly clear, he adds (in a metastatement that is 
not, after all, a specification of a protocol but rather -- in the 
"Philosophy" section -- a hopeful instruction to his readers about how to 
behave), they should be "conservative ... and liberal". Note *both* 
conservative and liberal. To suggest that he's therefore licensing 
rule-breaking (whatever the rule may be) is to miss how he's simultaneously 
insisting on conformity ("be conservative in what you do"). The Law is a 
paradox.

To be fair, in its larger context -- that is, *if* you assume all parties 
are scrupulously and respectfully following the rules given (to whatever 
extent those rules are clear -- maybe "Postel's Law" is a hedge to cover 
those cases where it is the rulegiver who has failed?) -- it isn't a 
paradox at all, but rather a simple restatement of the Golden Rule. Yeah -- 
be nice, guys. If you can't agree with the other party on what the rule is, 
then maybe *both* should step up, make allowances, since that's what you'd 
want from the other guy. As an ethical principle this is both ancient and 
sound. If we all followed it, the world would be a better place. "Our 
discipline is strict, so that life will be easy" (a Sufi saying, IIRC).

Yet as you know, it's a more difficult rule to practice than it is to 
pronounce; and even when everybody tries to follow it, there can be 
missteps: such is the nature of human (and hence machine) imperfection. 
(Let's not fall into the trap of imagining that machines are perfect and 
pristine, that because some routine is formalized and automated through 
code, it is therefore above failing and beyond fixing -- no, this is more 
like the Sorcerer's Apprentice. We are not talking about software 
applications as they exist in the conception of their designers' brains, 
where they can do no wrong. Software in the real world is both buggy, and 
plain wrongheaded; is expressive of learning processes -- *human* learning 
processes -- that are often young, halting or bone-headed, or choosing hard 
ways over easy ones.)

Hence the need for mediators, steppers-between, arbitrageurs. An honorable 
profession, if a shady one. But however much they know about rule-breaking 
or exploiting the in-between spaces, the good mediators know the boundaries 
are there for a reason, and the rules are a good thing.

As it applies to XML, ironically, Postel's Law would be difficult, even 
impossible, to observe if it were not exactly for the relatively stark 
clarity afforded by the definition of XML well-formedness. (That allegation 
that XML's creators "broke Postel's Law" misses the point that Postel was 
writing a *specification*, after all.) Imagine if XML were woolier and 
wafflier than it is, that its corner cases had not been fully explored and 
discoverable in the archives of this list. What would being liberal or 
conservative mean then? Among the endless debates over what was in and what 
was out, the horn would be sounded for being liberal in what you accept 
from others, and for all to play, we would have to accept more and more. 
Soon enough we would have bloatware and vendor lock-in (hey, where have we 
seen that?). Far from a wise counsel that we should work together, Postel's 
Law would be a recipe for disaster, if the hawks could not keep insisting 
that "if it's not well-formed, it's not XML". Well-formedness, however 
bizarre or arbitrary it may seem in some respects (no unescaped less-than 
signs in attribute values! no slashes in tag names!) is not just a 
religion, it is the placing of a boundary. If it parses as XML, well good, 
go ahead and move on. If it doesn't, all bets are off. Not a threat, but 
simply a statement of fact, that stuff that doesn't lie about what 
character encoding it uses (to use an example actually cited), is going to 
be more predictable and less troublesome in general than stuff that does.
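
To make the boundary concrete, here is a minimal sketch in Python of the "if it parses, move on" test, using the standard library's xml.etree.ElementTree. The helper name is_well_formed and the sample strings are my own illustrations, not anything from the thread; the samples exercise the two rules mentioned above (no unescaped less-than signs in attribute values, no slashes in tag names).

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False otherwise.

    Hypothetical helper for illustration: it only asks the parser
    whether the input is well-formed and does no validation.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative inputs (assumptions, not taken from the discussion).
samples = {
    "escaped less-than in attribute": '<item title="a &lt; b"/>',
    "raw less-than in attribute": '<item title="a < b"/>',
    "slash in tag name": '<a/b>text</a/b>',
}

for label, doc in samples.items():
    print(label, "->", is_well_formed(doc))
```

The point of the sketch is only that the answer is binary and mechanical: the parser either accepts the input or it does not, and nobody has to argue about it.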

There may well be a role for mediating even these clear boundaries, as 
Michael Champion suggests: help out by making not-quite-XML into XML. To be 
successful (assuming there is enough busted XML around that people couldn't 
be bothered to fix), this service would not have to be right or perfect 
(this is the work of human interpreters, and is necessarily tricky -- 
Hermes not Apollo), merely earnest enough, and skilled enough, to sustain a 
stream of work from those who know it to be worth their pile of 
micropayments. A kind of benign arbitrage. Most of the time, Mike (a newly 
minted millionaire) just fixes up XML declarations and replaces errant 
ampersands. It doesn't cost him much, because his software is so smart. But 
once in a while, something like Tim's anti-XML comes his way. What will he 
do then? Hopefully, follow the Golden Rule (including being conservative in 
what you do). Hopefully, go out of band and raise a question on a different 
channel, make an effort to resolve the problem properly in its real-world 
context, not just guess or flip a coin.
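
As a rough sketch of what such a fix-up service might do with the easy cases, the Python below escapes ampersands that do not begin an entity or character reference, then re-checks well-formedness. Everything here (the repair function, the STRAY_AMP pattern, the needs_human flag) is hypothetical and of my own invention. It handles only the one repair named above; anything it cannot fix mechanically is flagged for a person rather than guessed at.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that does not start something shaped like an entity
# reference (&name;) or a character reference (&#10; or &#x0A;).
STRAY_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

def repair(text):
    """Return (xml_text, needs_human).

    Hypothetical sketch: leave well-formed input alone, try the one
    cheap repair (escaping stray ampersands), and otherwise hand the
    original back unchanged with needs_human set to True.
    """
    try:
        ET.fromstring(text)
        return text, False          # already well-formed: be conservative
    except ET.ParseError:
        pass

    candidate = STRAY_AMP.sub("&amp;", text)
    try:
        ET.fromstring(candidate)
        return candidate, False     # the cheap repair was enough
    except ET.ParseError:
        return text, True           # go out of band; do not guess
```

For example, repair('<t>AT&T</t>') yields the escaped form with needs_human False, while a mismatched-tag input such as '<t><u></t>' comes back untouched with needs_human True, which is the cue to raise the question on another channel.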

Just so, in your role as mediator between the feed-reading masses and the 
net's flora and fauna -- a virtual supermarket for online tasties -- you do 
not simply have to take the bad stuff and pass it along silently, as if you 
were guiltily putting mislabelled snacks on the shelves. You can speak up. 
Others have already suggested how it could be done -- a red X on a GUI 
makes a statement -- and if there's any means of feedback, all the players 
will be motivated to come into line with each other, over time, first 
making your job easy, then eventually making it perhaps unnecessary, at 
which point you can go on to make your further millions doing something 
else. In the meantime, whether you want that red X on not-quite-XML, 
coddling your users of the moment at the cost of further interoperation 
down the line (or worse when the anti-XML comes along) -- or whether you 
want it only on stuff that's well-formed, if invalid (perhaps passing the 
non-XML back to its Postel's Law-breaking producer), is a business and 
social decision, not a technical one.

Nor do I see any reason to think Walter is doing much different.

Interested readers might want to take a look at Michael Sperberg-McQueen's 
keynote at Extreme 2002: 
http://www.mulberrytech.com/Extreme/Proceedings/html/2002/CMSMcQ02/EML2002CMSMcQ02.html
"if network effects are the best predictor, then we must infer that the 
people who actually are responsible for making a good decision are the 
early adopters. In IT, that means you. You have a responsibility to judge 
what matters not by network effects but by technical merit."

Regards,
Wendell


At 06:03 PM 1/15/2004, you wrote:
>         It may work for Walter to claim that the data is his and not
>someone else's but it isn't ok for [us] at pubsub.com . The function we're
>providing is content-based routing -- we're not doing re-publishing,
>data mining, or other stuff (yet). Thus, when we pass data to you,
>we're claiming that it *is* someone else's data -- it just got to you
>by flowing through our content-based PubSub routers. We don't claim
>the data and we take no responsibility for it (within the limits of
>applicable law...)



======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
   Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================





 
