XQuery needs some serious extensions if you want to do what Helena did in her PhD.
(BTW, I was working with her when I wrote Quilt with Don Chamberlin, so I can see some similarities.)

Two major extensions would be:

1. FLWOR must not stop when there is an exception; it should just log the exception and move on.
2. Group by has to be extended from a simple hash to a more general clustering algorithm.

Dana

On Jul 1, 2015, at 3:35 PM, Ihe Onwuka <ihe.onwuka@gmail.com> wrote:

> On Wed, Jul 1, 2015 at 2:59 PM, daniela florescu <dflorescu@me.com> wrote:
>
>> Ihe,
>>
>> transforming XQuery to be able to do data cleaning has been a LONG desire of mine.
>
> The problem articulated in the paper with Citeseer publications is similar to the issues I face. For movies there are additional weapons that can be brought to bear, because actors, directors and movie titles all have several aliases documented on various sites. That said, the problem with movies may be harder, because the incidence of two different papers sharing the same title is probably relatively low.
>
> Reading on...
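
For readers who want something concrete, the two extensions Dana lists (an iteration that logs exceptions and moves on, and a grouping step based on clustering rather than exact hashing) can be sketched roughly as follows. This is Python, not XQuery, and it only illustrates the intended semantics. The names `tolerant_map`, `similar` and `cluster_by`, the 0.85 threshold, and the choice of `difflib` string similarity with greedy clustering are all assumptions made for this sketch, not anything proposed in the thread or in Helena's work.

```python
import logging
from difflib import SequenceMatcher

logger = logging.getLogger("cleaning")


def tolerant_map(items, fn):
    """Apply fn to every item; on failure, log the error and keep going."""
    results = []
    for item in items:
        try:
            results.append(fn(item))
        except Exception as exc:  # deliberately broad: dirty data fails in many ways
            logger.warning("skipping %r: %s", item, exc)
    return results


def similar(a, b, threshold=0.85):
    """Hypothetical similarity test: case-insensitive difflib ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_by(items, key, threshold=0.85):
    """Greedy grouping: an item joins the first cluster whose representative
    key is similar to its own key; otherwise it starts a new cluster."""
    clusters = []  # list of (representative_key, members)
    for item in items:
        k = key(item)
        for rep, members in clusters:
            if similar(rep, k, threshold):
                members.append(item)
                break
        else:
            clusters.append((k, [item]))
    return [members for _, members in clusters]
```

With these definitions, `tolerant_map(["1999", "n/a", "1986"], int)` skips the unparsable value instead of aborting, and `cluster_by(titles, key=lambda t: t)` puts near-identical titles such as "The Matrix" and "the matrx" in one group where an exact-match hash would keep them apart. A threshold-based grouping like this can also merge titles that are genuinely different, which is one reason the movie case may be harder.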