Re: [xml-dev] XML Schema generation by machine learning from a corpus of

Michael, you wrote:

"Listing every (grandparent, self, child) and (preceding-sibling, self, following-sibling) triple might work better."

You mean triples of QNames, right? Like:

(parent, self, child): x:company, x:address, x:street

(preceding-sibling, self, following-sibling): x:street, x:zip, x:country
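To make sure we mean the same thing, here is a minimal sketch of how I would collect such name triples with Python's standard library. The sample document, the namespace URI http://example.org/x and the function name name_triples are my own assumptions for illustration. ElementTree reports names in expanded form ({uri}local) rather than as prefixed QNames.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the namespace URI is made up for illustration.
SAMPLE = """<x:company xmlns:x="http://example.org/x">
  <x:address>
    <x:street/>
    <x:zip/>
    <x:country/>
  </x:address>
</x:company>"""


def name_triples(root):
    """Collect (parent, self, child) and (preceding, self, following) name triples."""
    vertical, horizontal = set(), set()

    def visit(parent, node):
        children = list(node)
        for i, child in enumerate(children):
            if parent is not None:
                # node is "self": record the name of its parent and of each child.
                vertical.add((parent.tag, node.tag, child.tag))
            if 0 < i < len(children) - 1:
                # child is "self": record the names of its immediate neighbours.
                horizontal.add((children[i - 1].tag, child.tag, children[i + 1].tag))
            visit(node, child)

    visit(None, root)
    return vertical, horizontal


vertical, horizontal = name_triples(ET.fromstring(SAMPLE))
for triple in sorted(vertical | horizontal):
    print(triple)
```

Run over a whole corpus, the union of these sets would be the description; a triple found in a new document but absent from the union would be the thing to complain about. Using the grandparent instead of the parent, as in your wording, would only require passing one more ancestor down the recursion.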

What do you think about slightly changing/generalizing the description language and using RDF triples:

x:company a yogi:nodeName

x:address a yogi:nodeName

x:street a yogi:nodeName

x:company mk:parent x:address

x:address mk:parent x:street

etc., creating a graph. (Here, the prefixes "yogi" and "mk" represent a corpus of documents and an ontology for describing XML data, respectively.)
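As a rough sketch of how such triples could be generated, assuming a hypothetical sample document and a hand-maintained map from namespace URI to prefix (ElementTree does not preserve prefixes), something like the following might do. The output is Turtle-like, with a terminating dot per statement; the @prefix declarations a real Turtle file needs are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI -> prefix map, needed because ElementTree drops prefixes.
PREFIXES = {"http://example.org/x": "x"}

# Hypothetical sample document standing in for one member of the corpus.
SAMPLE = """<x:company xmlns:x="http://example.org/x">
  <x:address><x:street/><x:zip/></x:address>
</x:company>"""


def qname(tag):
    """Turn ElementTree's '{uri}local' form back into 'prefix:local'."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return f"{PREFIXES[uri]}:{local}"
    return tag  # no namespace: returned unchanged


def to_triples(root):
    """Return a set of (subject, predicate, object) string triples."""
    triples = set()
    for node in root.iter():
        # Every element name found in the corpus becomes a yogi:nodeName.
        triples.add((qname(node.tag), "a", "yogi:nodeName"))
        for child in node:
            # mk:parent points from the parent name to the child name.
            triples.add((qname(node.tag), "mk:parent", qname(child.tag)))
    return triples


graph = to_triples(ET.fromstring(SAMPLE))
for s, p, o in sorted(graph):
    print(f"{s} {p} {o} .")
```

Because the triples form a set, repeated occurrences of the same name or the same parent/child pair collapse into one statement, which is what turns a corpus of trees into a single graph.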

Perhaps one could arrive at insights (not necessarily conventional schemas) via SPARQL, possibly supported by OWL and RDF inference, and possibly by external information merged into the generated triples. What do you think?
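To give a flavour of the kind of question I have in mind, here is a small sketch that imitates two simple graph queries in plain Python rather than running a real SPARQL engine. The triples are written by hand, x:person is an invented name, and the function names are mine; the SPARQL in the comment is only the approximate equivalent of the first function.

```python
# Hand-written triples in the shape described above; x:person is invented.
TRIPLES = {
    ("x:company", "mk:parent", "x:address"),
    ("x:address", "mk:parent", "x:street"),
    ("x:address", "mk:parent", "x:zip"),
    ("x:person", "mk:parent", "x:address"),
}

# Roughly the SPARQL being imitated (prefix declarations omitted):
#   SELECT ?gp ?c WHERE { ?gp mk:parent ?p . ?p mk:parent ?c }


def grandparent_pairs(triples):
    """Join mk:parent with itself to get (grandparent, grandchild) name pairs."""
    parent_of = [(s, o) for s, p, o in triples if p == "mk:parent"]
    return {(gp, c) for gp, p1 in parent_of for p2, c in parent_of if p1 == p2}


def leaf_names(triples):
    """Names that occur as a child but never as a parent."""
    parents = {s for s, p, _ in triples if p == "mk:parent"}
    children = {o for _, p, o in triples if p == "mk:parent"}
    return children - parents


print(sorted(grandparent_pairs(TRIPLES)))
print(sorted(leaf_names(TRIPLES)))
```

One caveat this makes visible: a graph keyed on names alone merges all contexts, so the join also pairs x:person with x:zip even if x:zip was only ever seen below x:company. Your context triples would keep that distinction, so the two descriptions are probably complementary.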

With kind regards,

Hans-Jürgen

PS: The resulting insights might also be useful for purposes other than validating documents.

On Wednesday, 14 March 2018, 00:35:03 CET, Michael Kay <mike@saxonica.com> wrote:

> On 13 Mar 2018, at 23:03, Rick Jelliffe <rjelliffe@allette.com.au> wrote:
>
> We had a program to generate every absolute XPath found in a corpus, then complain (Schematron) if any XPath was found that was not in that corpus. It did not test for required elements.
>

Listing every (grandparent, self, child) and (preceding-sibling, self, following-sibling) triple might work better.

Michael Kay
Saxonica

_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

List archive: http://lists.xml.org/archives/xml-dev/