Re: [xml-dev] XML data sets with (known) data quality problems
- From: Andrew Welch <andrew.j.welch@gmail.com>
- To: Helena Galhardas <helena.galhardas@ist.utl.pt>
- Date: Mon, 6 Feb 2012 14:02:33 +0000
> In order to test this library exhaustively, we need to have XML data sets
> that have data quality problems known a priori.
> By data quality problems, we mean: missing values, misspellings, synonyms,
> values out of domain, approximate duplicates, etc.
Government data: http://data.gov.uk/data
I did a short contract for 'LinkedGov' (http://linkedgov.org/) a while
back. Their goal is to make the data clean and usable, so you might
want to get in touch with them.
--
Andrew Welch
http://andrewjwelch.com