On Jan 15, 2004, at 9:04 AM, Elliotte Rusty Harold wrote:
>
> This is also the point of view taken by Walter Perry. However, what
> you're missing here is the assumption (certainly in Walter's case, and
> I think in Uche's and Sean's as well) that the documents are
> well-formed. They are willing to process invalid documents. However,
> well-formedness is their minimum requirement. Although the Atom folks
> frequently confuse their language, what they seem to be asking for is
> the option to pass around malformed documents.
I agree that *Atom*, being designed as it is for Mr. Safe and
presumably produced by XML-conforming tools, should be quite strict --
if something claims to be an Atom feed, it should definitely, at a bare
minimum, be well-formed XML. (RSS is another kettle of fish: it seems
broken as designed, and that hasn't stopped its viral spread. Don't
bother trying to cure it; you might kill it.) Atom's value proposition
is that it will (someday) be a real spec, with real rules on how to
produce it and validate it. It's less clear where the truly optimal
place to reject (or optionally fix) a problem is. That's why Gresham's
Law applies -- no individual benefits from rejecting a bad document,
but the system as a whole benefits if bad documents can be kept out.
Atom can start over and build a community with a strong ethic that
everyone should be checking for "counterfeit" Atom feeds. I guess I
should just shut up and let you folks play Enforcer :-) but I'm
skeptical that this will work (for human reasons) -- the net effect
will be to create a "buzz" that Atom is something you don't want to
mess with because you'll get flamed (or spammed <grin>) by geeks who
babble about stuff that you don't care about. In a world where Atom
is stillborn, this whole discussion is moot.
My major point here is that simply rejecting bad data is not a great
option for any single actor in the system, and there's no global
enforcement mechanism. So services (code, SOAP-y, RESTful, whatever)
that fix bad data and make it real XML are a Good Thing: everyone
downstream gets the advantages of XML, and the original creators (whom
we value for what they have to say, not their choice of software tools)
aren't stifled by having to understand the details of UTF-8 vs.
ISO-8859 encoding of the characters their authoring tools produce.
Ideally these services get invoked before the data is even serialized,
but invoking them anywhere far upstream can work too. The downside is
that it is *possible* for a fix to distort the meaning of truly
horribly broken stuff that the fixer tries too hard to clean up. For
the domain of weblog syndication, it's hard to get too
excited about this problem. For the domain of data feeds tunnelled
through Atom, this is a real issue, and the *option* to track and
reject data that has been "fixed" is necessary. That seems like a more
productive and politically viable approach than saying Thou Shalt Not
Process Malformed XML, Ever.
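
To make the "fix, but track that it was fixed" idea concrete, here is a
rough Python sketch. Everything in it is an assumption for illustration
only: the names (Result, make_well_formed, acceptable) are hypothetical,
and the single repair it attempts (treating the bytes as ISO-8859-1 that
was never labeled as such) is just one guess a real fixer might make.
Nothing here comes from the Atom drafts.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Result:
    xml_bytes: bytes  # well-formed XML
    was_fixed: bool   # True if the input had to be repaired first


def make_well_formed(raw: bytes) -> Result:
    """Return well-formed XML for `raw`, recording whether a repair was needed."""
    try:
        ET.fromstring(raw)  # already well-formed: pass it through untouched
        return Result(raw, was_fixed=False)
    except ET.ParseError:
        pass

    # Assumed repair: the author's tool wrote ISO-8859-1 without declaring it.
    # Decoding as ISO-8859-1 never fails, so this is a guess, not a proof.
    text = raw.decode("iso-8859-1")
    root = ET.fromstring(text)  # still raises ParseError if truly broken
    return Result(ET.tostring(root, encoding="utf-8"), was_fixed=True)


def acceptable(result: Result, allow_fixed: bool) -> bool:
    """Strict consumers (e.g. data feeds) can refuse anything that was repaired."""
    return allow_fixed or not result.was_fixed
```

A weblog aggregator would presumably call acceptable(result,
allow_fixed=True), while a consumer of data feeds tunnelled through Atom
could pass allow_fixed=False and reject anything the fixer touched.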