OASIS Mailing List Archives





   Re: Ambiguity in XML spec

  • From: Steve Schafer <pandeng@telepath.com>
  • To: xml-dev@xml.org
  • Date: Thu, 18 May 2000 11:48:13 -0500

On Thu, 18 May 2000 16:45:50 +0200, "David Brownell"
<david-b@pacbell.net> wrote:

>It's pretty common for language specs to ensure that their grammars
>can easily be handled by parser generators -- commonly they'll be
>done as LALR(1) [yacc/bison/...] or somesuch.

If I recall correctly from my experimentation in that area, the XML
grammar is already LALR(1) *if* you consider only the
non-regular-expression productions (the ones that start with a
lower-case letter). That assumes that you have a lexical analyzer that
can generate all of the other productions as tokens. And that turns
out to be the tricky part, because SGML/XML is highly
context-sensitive when it comes to deciding whether a given character
is a delimiter or not.

You end up with the equivalent of a half-dozen or so separate lexical
analyzers, and at any given point during the parse you invoke one of
them, according to context. It's all rather messy and you pretty much
lose all of the benefits of a lex/yacc-style table-driven approach.
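
To make the "separate lexical analyzers" point concrete, here is a
rough sketch in Python. It is not a real XML tokenizer: the mode names
(CONTENT, TAG), the token names, and the tokenize() function are all
made up for illustration, and it handles only start/end tags,
attributes, text, and CDATA sections. The point is just that the same
character ('>', '=', '"') is a delimiter in one mode and plain data in
another, so the scanner has to know which mode the parse is in.

```python
import re

# Hypothetical mode names, chosen only for this illustration.
CONTENT, TAG = "content", "tag"

# One small "lexical analyzer" (an ordered list of token patterns) per mode.
LEXERS = {
    CONTENT: [
        # Inside a CDATA section, '<' and '>' are not delimiters at all.
        ("CDATA", re.compile(r"<!\[CDATA\[.*?\]\]>", re.S)),
        ("TAG_OPEN", re.compile(r"</?")),
        # In content, '>', '=' and quotes are ordinary character data.
        ("TEXT", re.compile(r"[^<]+")),
    ],
    TAG: [
        ("TAG_CLOSE", re.compile(r"/?>")),
        ("NAME", re.compile(r"[A-Za-z_:][\w:.-]*")),
        ("EQ", re.compile(r"=")),
        # Inside a quoted attribute value, '>' is data again.
        ("STRING", re.compile(r'"[^"<]*"|\'[^\'<]*\'')),
        ("WS", re.compile(r"\s+")),
    ],
}


def tokenize(text):
    """Return (kind, lexeme) pairs, switching lexers as the context changes."""
    mode, pos, tokens = CONTENT, 0, []
    while pos < len(text):
        # Try only the patterns that belong to the current mode.
        for kind, pattern in LEXERS[mode]:
            match = pattern.match(text, pos)
            if match:
                break
        else:
            raise ValueError(f"no token matches at offset {pos} in mode {mode}")
        if kind != "WS":  # whitespace inside a tag is skipped
            tokens.append((kind, match.group(0)))
        pos = match.end()
        # The token just seen decides which lexer runs next.
        if kind == "TAG_OPEN":
            mode = TAG
        elif kind == "TAG_CLOSE":
            mode = CONTENT
    return tokens


if __name__ == "__main__":
    for token in tokenize('<a b="x>y">1 > 0 = "q"</a>'):
        print(token)
```

In the sample input, the '>' inside the attribute value and the '>',
'=' and '"' in the element content all come out as part of STRING or
TEXT tokens, while the same characters elsewhere come out as TAG_CLOSE
and EQ. A full XML scanner needs more modes than this (comments,
processing instructions, the DOCTYPE internal subset, entity
references, and so on), which is where the mess comes from.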

-Steve Schafer

This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/



Copyright 2001 XML.org. This site is hosted by OASIS