Re: [xml-dev] XML Universe ?
- From: "Liam R. E. Quin" <liam@w3.org>
- To: Dimitre Novatchev <dnovatchev@gmail.com>
- Date: Sun, 29 Mar 2015 15:13:19 -0400
On Sun, 2015-03-29 at 11:17 -0700, Dimitre Novatchev wrote:
> Has anyone thought about unifying all available XML documents into a
> single repository?
I fear even just for static documents you'd need more disk space than
is easily available :-) I know of people with petabytes of XML.
If you include XML documents generated by car engine computers and Web
services there's more XML in the world than HTML.
> This would provide many benefits to XML practitioners, and in
> general could be used as an "XML data-warehouse" and allow BI for
> querying and acquiring interesting and unknown facts about XML.
Someone, I think in Amsterdam, made a collection of a few Web documents,
I think just 10 gigabytes or so. I looked at it in some detail and even
did a Balisage paper about it, because there was talk going round that
it showed a high proportion of XML on the Web wasn't well-formed. It
turned out that if you handled the document encoding properly, the
documents were almost all just fine. At any rate, that collection was
still on the Web last time I checked.
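
To illustrate the encoding point, here is a minimal Python sketch of one
way such a miscount could happen. It is not the method used in the paper
or by the original survey. The sample document and both function names
are made up for illustration. A checker that assumes UTF-8 rejects a
perfectly good ISO-8859-1 document, while handing the parser the raw
bytes lets it honour the encoding declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a well-formed document declared and stored as ISO-8859-1.
SAMPLE = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>caf\u00e9</name>'
).encode("iso-8859-1")


def well_formed_naive(data: bytes) -> bool:
    """Assume UTF-8 no matter what the document declares."""
    try:
        ET.fromstring(data.decode("utf-8"))
        return True
    except (UnicodeDecodeError, ET.ParseError):
        return False


def well_formed(data: bytes) -> bool:
    """Pass raw bytes so the parser reads the encoding declaration itself."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("naive check:", well_formed_naive(SAMPLE))
    print("bytes check:", well_formed(SAMPLE))
```

The only difference between the two checks is who decides the encoding:
the caller, or the parser reading the declaration.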
Liam
>
> Examples of such queries:
>
> 1. What is the maximum depth of any known XML document?
>
> 2. What is the maximum number of different element/attribute names
> of any known XML document?
>
> 3. What is the maximum length of element/attribute names in any
> known XML document?
>
> 4. What are all namespaces used and what are they in sorted order
> by frequency of being referenced?
>
> 5. What is the longest chain (length of chain) of XInclude
> references?
>
> 6. What are all XPath expressions used in all available XSLT
> modules (XSLT is a kind of XML) -- and a variety of questions about
> the complexity and syntax structure of these expressions.
>
> 7. Similar to the above, but for XSD
>
> etc, ..., etc.
>
> Among other benefits, such a repository would provide for real-world
> XML test data, when writing tests for a new XML processing
> application.
>
>
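
For a single document, queries 1 and 4 in the list above can be sketched
in a few lines of Python with the standard library. This is only a rough
sketch: the function names and the sample document are invented here, and
"frequency of being referenced" is assumed to mean the number of element
and attribute names that fall in each namespace. Running it over a whole
repository would mean merging the per-document results.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document, used only to exercise the functions.
SAMPLE = b'<a xmlns="urn:x" xmlns:y="urn:y"><b><y:c y:id="1"/></b><b/></a>'


def max_depth(source) -> int:
    """Query 1: deepest element nesting, with the root counted as depth 1."""
    depth = deepest = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            deepest = max(deepest, depth)
        else:
            depth -= 1
            elem.clear()  # drop finished content to keep memory use low
    return deepest


def namespace_counts(source) -> Counter:
    """Query 4: count namespaced element and attribute names per namespace URI."""
    counts = Counter()
    for _, elem in ET.iterparse(source, events=("start",)):
        # ElementTree writes namespaced names as "{uri}local".
        for name in (elem.tag, *elem.attrib):
            if name.startswith("{"):
                counts[name[1:].split("}", 1)[0]] += 1
    return counts


if __name__ == "__main__":
    print(max_depth(io.BytesIO(SAMPLE)))
    print(namespace_counts(io.BytesIO(SAMPLE)).most_common())
```

Both functions stream the input with `iterparse`, so they take a file
name or a binary file object.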