OASIS Mailing List Archives

   Re: [xml-dev] Postel's "Law": A question for liberal parsers


This is exactly the situation Walter Perry has been talking about for 
several years, and pretty much what he's had to deal with in the 
financial industry with financial data. As I understand it, his 
approach is to take all the incoming feeds, use them to populate his 
data structures, and then create an XML representation of his data 
which he sends out. In this approach the XML you present to the world 
claims to be nothing more than your representation of your data. It 
does not purport to be a representation of somebody else's data. I 
tend to agree with this.

I say that when you receive invalid (not malformed, but invalid) data, 
you should clean it up as best you can. Most of the data can be cleaned 
automatically because, as you've noticed, people keep making the same 
mistakes. Flag data that can't be cleaned and pass it to a human, who 
writes the code to clean it. For the first few weeks you'll be cleaning 
a lot of data by hand, but gradually the process becomes more and more 
automated, and the exceptional cases decrease to a manageable level. 
Once you've cleaned the data sufficiently to generate your own data 
structures from it, output those data structures as XML and pass 
them on to the third parties.
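
To make that concrete, here is a rough sketch of the flow in Python. 
It is only an illustration and not Walter's actual code: the Trade 
record, the feed element names (trade, symbol, quantity, price), and 
the two cleaning rules are all made up for the example. Malformed 
input is deliberately not repaired; the parser just rejects it.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Trade:
    """Our own data structure (hypothetical fields, for illustration)."""
    symbol: str
    quantity: int
    price: float


def clean_quantity(raw: str) -> int:
    # Assumed recurring mistake: thousands separators, e.g. "1,000".
    return int(raw.replace(",", "").strip())


def clean_price(raw: str) -> float:
    # Assumed recurring mistake: currency sign in the value, e.g. "$85.50".
    return float(raw.replace("$", "").replace(",", "").strip())


def ingest(feed_xml: str):
    """Return (trades, flagged).

    Malformed XML raises ET.ParseError here; only well-formed but
    invalid records are cleaned. Records the rules cannot clean are
    flagged for a human instead of being guessed at.
    """
    root = ET.fromstring(feed_xml)
    trades, flagged = [], []
    for rec in root.iter("trade"):
        try:
            symbol = (rec.findtext("symbol") or "").strip().upper()
            if not symbol:
                raise ValueError("missing symbol")
            quantity = clean_quantity(rec.findtext("quantity") or "")
            price = clean_price(rec.findtext("price") or "")
            trades.append(Trade(symbol, quantity, price))
        except ValueError as err:
            # Keep the raw record and the reason so someone can write
            # a new cleaning rule for it later.
            flagged.append((ET.tostring(rec, encoding="unicode"), str(err)))
    return trades, flagged


def to_xml(trades) -> str:
    """Serialize our data structures as our own XML representation."""
    root = ET.Element("trades")
    for t in trades:
        el = ET.SubElement(root, "trade", symbol=t.symbol)
        ET.SubElement(el, "quantity").text = str(t.quantity)
        ET.SubElement(el, "price").text = f"{t.price:.2f}"
    return ET.tostring(root, encoding="unicode")
```

The point of the design is that to_xml() only ever sees Trade objects, 
so what goes out is a representation of your data, whatever shape the 
incoming feed was in. Each time a flagged record shows a new recurring 
mistake, you add a rule to the cleaning functions and the flagged list 
gets shorter.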

But you really should have a sit down with Walter. This really is 
exactly what he's been doing for some time now.
-- 

   Elliotte Rusty Harold
   elharo@metalab.unc.edu
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA




 
