XML.orgXML.org
FOCUS AREAS |XML-DEV |XML.org DAILY NEWSLINK |REGISTRY |RESOURCES |ABOUT
OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]
Re: [xml-dev] Making the Connection between Syntax and Semantics isStill a Black Art

Any useful grammar is isomorphic to "code".

Peter Hunsberger

On Mon, Apr 20, 2015 at 5:05 PM, Costello, Roger L. <costello@mitre.org> wrote:

Hi Folks,

Problem: define a grammar for this simple data format:

                N item1 item2 ... itemN

The first value in the input is a number N which specifies the number of following items.

 

Ouch! That's not context-free. So basically you lose the entire context-free parsing toolkit.

 

Stated another way, that problem requires a connection between syntax and semantics.

 

Alas, we enter the world of black art.

 

Today the syntax-semantics connection is made by writing code. Let's illustrate how the connection is made in three different grammar languages: XML Schema 1.1, ANTLR, and BISON.

XML Schema 1.1

Use the ordinary context-free capabilities of XSD to define N and each item. Then embed XPath code into the XSD to constrain the number of items to N. Here it is:

ANTLR

Create parser and lexer rules. Embed Java code within the parser rules to constrain the number of items to N. Here it is:

BISON

Proceed as with ANTLR, except instead of embedding Java code into the parser rules, embed C code.

We want this: Reusable and Retargetable Grammars

Reusable: multiple different applications can use the same parse tree.

 

Retargetable: multiple different languages can be generated from the same grammar.

 

The requirement for reusable and retargetable grammars is code-free grammars.

Want to Make an Impact on the World?

Today it is nearly impossible to solve any non-trivial language recognition problem without embedding code into the grammar. In the example above we saw that we needed to embed XPath code in XML Schema, Java code in ANTRL, and C code in BISON. And that was for a trivial problem.

 

Want to make a huge impact on the world? Fix this problem. No more embedded code in grammars.

 

Comments?

 

/Roger

 




[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]


News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 1993-2007 XML.org. This site is hosted by OASIS