OASIS Mailing List Archives



[ANNOUNCE] Feature Grammars: a tool for feature extraction and representation in XML

Feature Grammars is a tool or technology for feature extraction and representation in XML.

It has a few novel (are they?) features:

* It allows grammar-like modeling of the hierarchical combinations of features in a document, and reports the result as a feature tree (rather than a feature vector) that says "we looked for this feature because we found that feature".

* It uses XPath for feature detection.

* It supports feature detection in multiple document types, with alternative detectors for each feature attaching to the same grammar.
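
To make the three points above concrete, here is a minimal sketch in Python. It is not the Feature Grammars syntax or implementation: the `Feature` class, the `detect` function, the feature names and the document types are all hypothetical, made up for illustration. It only shows the general idea of a hierarchical grammar whose features are detected by XPath, with a separate detector per document type, reported as a tree. It uses the limited XPath subset that the standard library's `xml.etree.ElementTree` supports.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One node of a (hypothetical) feature grammar."""
    name: str
    # Maps a document type to the XPath that detects this feature in it.
    detectors: dict
    # Sub-features that are only looked for when this feature is found.
    children: list = field(default_factory=list)


def detect(feature, root, doc_type):
    """Return a feature tree for `root`, or None if the feature is absent.

    Children are evaluated only when the parent matched, so the result
    records "we looked for this because we found that".
    """
    xpath = feature.detectors.get(doc_type)
    if xpath is None or root.find(xpath) is None:
        return None
    found = (detect(child, root, doc_type) for child in feature.children)
    return {
        "feature": feature.name,
        "children": [tree for tree in found if tree is not None],
    }


# Illustrative grammar: one feature with two sub-features, and
# alternative detectors for two assumed document types.
GRAMMAR = Feature(
    "has-table",
    {"html": ".//table", "docbook": ".//informaltable"},
    [
        Feature(
            "table-has-header",
            {"html": ".//table/tr/th",
             "docbook": ".//informaltable/tgroup/thead"},
        ),
        Feature("table-is-nested", {"html": ".//td/table"}),
    ],
)

if __name__ == "__main__":
    doc = ET.fromstring(
        "<html><body><table><tr><th>Name</th></tr>"
        "<tr><td>x</td></tr></table></body></html>"
    )
    print(detect(GRAMMAR, doc, "html"))
```

The same grammar object is reused for both document types; only the detector chosen at each node changes. A document with no table yields `None`, and the sub-features are never evaluated for it.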

It follows the same architecture as Schematron, and indeed could be used to preprocess documents for Schematron or to post-process the SVRL output. However, it is not intended as a schema language as such. The open-source proof-of-concept prototype implementation is at:
Feedback and improvements welcome!

It grew out of some years of thinking about the remaining gaps in Schematron and XML query languages, and out of some experience with very large corpuses where dozens of different data sources (indeed, scores of sources, over time) fed very different documents to be converted to a common kitchen-sink transitional DTD. Having a common schema made it look as if the task had been reduced to an N:1 problem, but this disguised the fact that the documents were clustered into, in effect, different discrete languages depending on their source.



