Re: [xml-dev] Schemaless XML?
- From: Eliot Kimber <ekimber@contrext.com>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Tue, 11 Oct 2016 09:33:48 -0400
The question as provided presumes some intrinsic value to the use of
schemas. But there is none.
Schemas are a convenience that enable useful things: validation (when
validation is a requirement), directed authoring (when content is
authored), defaulting of attributes (when omitting attributes is
convenient for some reason), imposing more precise datatyping on
attributes and content (when doing so is useful).
But none of these values are either mandatory or all that interesting in
the general context of processing heterogeneous content that will require
some degree of manual configuration to be able to understand and process
usefully. That configuration cannot, in the general case, be provided by
any type of XML grammar, so there will always be the need for non-grammar
configuration, e.g., custom code, queries, tool-specific declarative
configuration, mapping declarations, whatever it is.
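As an illustration of what such non-grammar configuration might look like, here is a minimal sketch of a declarative mapping table driving extraction from otherwise unvalidated documents. Everything in it is hypothetical: the MAPPINGS table, the extract() function, and the sample element names (borrowed from the "temperature" and type='graduation' examples quoted below) are assumptions for illustration, not anything proposed in this thread. It uses only Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping declarations, one entry per known document kind.
# Key: root element name. Value: output field -> (path, attribute or None).
# A human wrote these; no grammar could have supplied them.
MAPPINGS = {
    "weather": {
        "temperature": ("reading/temperature", None),
        "unit": ("reading/temperature", "unit"),
    },
    "event": {
        "kind": (".", "type"),
        "title": ("name", None),
    },
}


def extract(xml_text, mappings=MAPPINGS):
    """Return a dict of mapped fields, or None for an unknown document kind."""
    root = ET.fromstring(xml_text)  # well-formedness check only, no schema
    rules = mappings.get(root.tag)
    if rules is None:
        return None  # unanticipated kind: someone has to add a mapping
    record = {}
    for field, (path, attr) in rules.items():
        node = root.find(path)
        if node is None:
            record[field] = None  # tolerate missing structure
        elif attr is None:
            record[field] = (node.text or "").strip()
        else:
            record[field] = node.get(attr)
    return record


if __name__ == "__main__":
    doc = '<event type="graduation"><name>Spring ceremony</name></event>'
    print(extract(doc))
```

In this sketch, handling a new kind of document means adding an entry to the table; a schema could be layered on where validation is actually required, but the mapping does not depend on one.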
So it's hard to see how, in the scenario presented, grammars of any sort
are a compelling part of the solution. And if by "schema" you specifically
mean XSDs, then absolutely no, not needed, not wanted, not going to add
significant value in this context.
Cheers,
Eliot
--
Eliot Kimber
http://contrext.com
On 10/11/16, 9:23 AM, "Thomas Passin" <list1@tompassin.net> wrote:
>What gives you the idea that knowing a schema allows you to understand
>and "process" the data contained in a document?
>
>In every toy example you have given for these kinds of questions, you
>have used names (element or attribute) that suggest something to humans.
>I think you are fooling yourself that automated machine processing
>would know what to do with them just because *you* think you know e.g.,
>what "temperature" or "type='graduation'" means.
>
>TomP
>
>On 10/11/2016 8:53 AM, Costello, Roger L. wrote:
>> Hi Folks,
>>
>> Scenario: You are building an application that receives XML documents
>> from various sources. The kinds of data in the XML documents are varied.
>> The XML documents themselves are structured in various ways. Over time,
>> new XML documents are received, containing new, unanticipated kinds of
>> data.
>>
>> How will your application handle such diversity?
>>
>> One approach is to create an XML Schema that models all the various
>> kinds of XML documents that will be received. When the application needs
>> to process new XML documents, the XSD is updated. The disadvantage of
>> this approach is that the processing of the new XML documents will be
>> delayed as the XSD is updated and as the application is updated to
>> handle the new data. The advantage of this approach is that the
>> application knows exactly what the data is and can process it
>> efficiently.
>>
>> An alternate approach is for the application to go “schemaless.” The
>> application performs machine learning on the data it receives. I’m not
>> sure what “machine learning on the data” means. I suspect that it means
>> that an internal schema (in some form or another) is dynamically
>> generated. Do you agree? If so, then the approach is not actually
>> schemaless; rather, there is a dynamically generated schema. Do you
>> agree? Is machine learning technology sufficiently advanced that it can
>> classify and understand the data to the same degree as a carefully
>> crafted schema and carefully crafted application code? Have you gone
>> schemaless?
>>
>> /Roger
>>
>
>