   Re: [xml-dev] English sentences, was: Re: [xml-dev] Announce: XMLSchema,


Eric van der Vlist wrote:
> On Thu, 2002-06-27 at 13:30, Jonathan Borden wrote:
> > Recognizing and processing natural language is something that's been done
> > for a couple of decades -- albeit imperfectly -- and as I am sure you are
> > aware, the grammar(s) are complicated -- what is generally needed is some
> > notion of the intended semantics of the sentences. In any case, this
> > isn't a good use case for XML schema languages and 'validity'.
> No, but it is a good use case for extensibility in XML schema languages.
> If you are happy with the result of the unix "file" command to determine
> the type of a text and see if it's more likely a Java source code, a
> snippet of Python or an English text, you may want to validate the
> document using its result instead of the code.

I presume that both Java and Python can be unambiguously determined via EBNF
or perhaps plain ol' regular expressions, and that sort of endeavour is a
good use case for schema extensibility -- err, though I was brought to
believe that the _whole point_ of XML is that such structural information
would be explicitly labelled. It's just that _reliable_ detection and
classification of human languages is a bit more difficult. It has been done
for a long long time (certain government agencies tend to spend unlimited
amounts of funds on such projects) and its problems are relatively well
characterized. As a _start_ in that direction take a look at _ontologies_


