Re: [xml-dev] Schemaless XML?
- From: dal <dalapeyre@mulberrytech.com>
- To: Michael Kay <mike@saxonica.com>
- Date: Wed, 12 Oct 2016 11:10:56 -0400
> On Oct 12, 2016, at 4:23 AM, Michael Kay <mike@saxonica.com> wrote:
>
> In trying to form an understanding of a large mass of undocumented XML content, I have sometimes found it useful to derive a schema. It's not that the schema contains any information that wasn't in the data; it's just that it provides a distillation that may be more tractable: it tells you what elements are present and how they relate.
I agree completely. Mulberry does the same. A client sends
us a batch of XML, maybe with a schema, maybe not. Even
if it says it uses a schema, we derive one to see what has
actually been done.
Is this a useful schema for authoring or any of the purposes
Eliot mentioned? Of course not. But it is very useful to
determine and communicate what is in this mess-of-XML for real.
For example, if you are doing an XML-to-XML conversion, it
could take a very long time to translate ALL of one schema
to ALL of another, if it is even possible. But if you can say
that this entire branch has never been used (or has been used
3 times in 250,000 documents), things get simpler.
Or during a data analysis, if the client says “oh, that never
happens”, it is useful to be able to say, “well, there are 150
of them in your current 4 million document database”. Are
these real, or errors?
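
As a rough illustration of the kind of usage count described above, here is a minimal Python sketch. It is not the tooling anyone in this thread actually uses: the function name `path_usage` and the three sample documents are invented for the example, and it assumes each document fits in memory as a string. It tallies how often each element path occurs and in how many documents it appears.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_usage(documents):
    """Return (occurrences, doc_counts), both keyed by element path such as 'article/body/sec'."""
    occurrences = Counter()  # total number of times each path occurs
    doc_counts = Counter()   # number of documents in which each path occurs

    def walk(elem, parent_path, seen):
        path = f"{parent_path}/{elem.tag}" if parent_path else elem.tag
        occurrences[path] += 1
        seen.add(path)
        for child in elem:
            walk(child, path, seen)

    for xml_text in documents:
        seen = set()  # paths found in this one document
        walk(ET.fromstring(xml_text), "", seen)
        doc_counts.update(seen)
    return occurrences, doc_counts


# Hypothetical sample data, invented for this sketch.
sample_docs = [
    "<article><front><title>A</title></front><body><sec><p>x</p><p>y</p></sec></body></article>",
    "<article><front><title>B</title></front><body><p>z</p></body></article>",
    "<article><front><title>C</title><note>rare</note></front><body><p>w</p></body></article>",
]

occurrences, doc_counts = path_usage(sample_docs)
for path in sorted(occurrences):
    print(f"{path}: {occurrences[path]} occurrence(s) "
          f"in {doc_counts[path]} of {len(sample_docs)} document(s)")
```

A path that shows up in only a handful of documents is a candidate for the "real or error?" conversation, and a path the schema allows but the report never lists is a branch that may not need converting.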
Or just to point out that there are 2 very reasonable ways to
tag structure X, and they seem to have used both. Historical
accident? Mistake? Two varying situations that might be clarified?
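
The "two ways to tag structure X" case can be surfaced in a similar way. The sketch below is only an illustration under stated assumptions: `child_patterns` and the `author` sample data are hypothetical, and comparing exact child-tag sequences is a deliberately crude stand-in for a real derived content model. It records every distinct sequence of child elements seen under each element name and reports the names that have more than one.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def child_patterns(documents):
    """Map each element name to a Counter of the child-tag sequences seen under it."""
    patterns = defaultdict(Counter)
    for xml_text in documents:
        # iter() visits the root element and all of its descendants.
        for elem in ET.fromstring(xml_text).iter():
            patterns[elem.tag][tuple(child.tag for child in elem)] += 1
    return patterns


# Hypothetical sample data: the same structure tagged two different ways.
variant_docs = [
    "<doc><author><name>Ann Lee</name></author></doc>",
    "<doc><author><given>Bo</given><surname>Chen</surname></author></doc>",
    "<doc><author><name>Cy Diaz</name></author></doc>",
]

patterns = child_patterns(variant_docs)
for tag, variants in patterns.items():
    if len(variants) > 1:
        print(f"<{tag}> is tagged {len(variants)} different ways:")
        for sequence, count in variants.most_common():
            print(f"  {' '.join(sequence) or '(no child elements)'}: {count}")
```

On real data, elements with repeating children would produce many near-identical sequences, so the sequences would need normalizing (for example, collapsing runs of the same tag) before the report is readable.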
Sorry to state the obvious, but schemas are so very useful.
Not essential, just very very useful.
—Debbie
================================================================
Deborah A Lapeyre mailto:dalapeyre@mulberrytech.com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Phone: 301-315-9631 (USA)
Suite 207 Fax: 301-315-8385
Rockville, MD 20850
----------------------------------------------------------------
Mulberry Technologies: Consultancy for XML, XSLT, and Schematron
================================================================