Re: [xml-dev] XML Universe ?
- From: "HILLMAN, Tomos" <tomos.hillman@oup.com>
- To: Dimitre Novatchev <dnovatchev@gmail.com>, "Liam R. E. Quin" <liam@w3.org>
- Date: Tue, 31 Mar 2015 20:04:11 +0000
On 29/03/2015 20:27, "Dimitre Novatchev" <dnovatchev@gmail.com> wrote:
>>> Has anyone thought about unifying all available XML documents into a
>>>single repository?
>>
>>I fear even just for static documents you'd need more disk space than
>>is easily available :-) I know of people with petabytes of XML.
>>
>>If you include XML documents generated by car engine computers and Web
>>services there's more XML in the world than HTML.
>
>
>Such arguments didn't stop the development of the Internet Archive
>Wayback Machine (https://archive.org/index.php)
>
>So, if they can do it, we can do it, too -- we won't even be the first to do
>it.
>
>
>I also thought about the argument that people / companies wouldn't
>want to expose their proprietary data. There could be an anonymizer,
>that takes your XML, and while preserving its exact structure
>(document tree), renames any readable strings to random ones. You run
>the anonymizer locally, so you never transmit your precious data on
>the wire.
True: there are a couple of differences here, though:
1/ The internet is, by its nature, publicly available for consumption
2/ The effort to collect this information can be made by the consumer
rather than the publisher; especially relevant if you are proposing adding
an obfuscation (value-minus?) process by that publisher.
To me that seems like a big (non-technical) challenge in terms of buy-in
and, if nothing else, in terms of time and expense.
I also think that it would be a shame to lose information about what data
types are held in which tags because of the anonymizer. Of course, that
information often isn't held in an objective computer-readable way anyway.
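To make that trade-off concrete, here is a minimal sketch of one way such an anonymizer might work, using Python's standard-library xml.etree.ElementTree. It is only an illustration under assumptions: it treats "readable strings" as text content and attribute values, keeps element and attribute names untouched, and swaps each character for a random one of the same class (digit for digit, letter for letter) so that a rough hint of the data type survives. The function names scramble and anonymize are hypothetical.

```python
import random
import string
import xml.etree.ElementTree as ET


def scramble(value, rng):
    """Replace each character with a random one of the same class."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # whitespace and punctuation are kept as-is
    return "".join(out)


def anonymize(xml_text, seed=None):
    """Return xml_text with the same tree shape but scrambled values.

    Assumption: element and attribute names are not sensitive and are kept.
    """
    rng = random.Random(seed)
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text:
            elem.text = scramble(elem.text, rng)
        if elem.tail:
            elem.tail = scramble(elem.tail, rng)
        # Copy the items first so the mapping is not mutated while iterating.
        for name, value in list(elem.attrib.items()):
            elem.attrib[name] = scramble(value, rng)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = '<book id="A12"><title>Dune 1965</title><price>9.99</price></book>'
    print(anonymize(sample))
```

Keeping the character classes is a deliberate compromise: a price still looks like a number and a date still looks like a date, but string lengths and formats leak, so this is weaker than replacing every value with fully random text. Note also that ElementTree's default parser drops comments and processing instructions, so a real tool would need to handle those separately.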
I like the idea despite my cynicism - make it happen and I'll try to
persuade OUP to participate - if you manage to get all of our content into
a single query-able database you'll have gotten one up on me! ;)
Tom