- From: Steve Schafer <pandeng@telepath.com>
- To: xml-dev@xml.org
- Date: Thu, 18 May 2000 11:48:13 -0500
On Thu, 18 May 2000 16:45:50 +0200, "David Brownell"
<david-b@pacbell.net> wrote:
>It's pretty common for language specs to ensure that their grammars
>can easily be handled by parser generators -- commonly they'll be
>done as LALR(1) [yacc/bison/...] or somesuch.
If I recall correctly from my experimentation in that area, the XML
grammar is already LALR(1) *if* you consider only the
non-regular-expression productions (the ones that start with a
lower-case letter). That assumes that you have a lexical analyzer that
can generate all of the other productions as tokens. And that turns
out to be the tricky part, because SGML/XML is highly
context-sensitive when it comes to deciding whether a given character
is a delimiter or not.
You end up with the equivalent of a half-dozen or so separate lexical
analyzers, and at any given point during the parse you invoke one of
them, according to context. It's all rather messy and you pretty much
lose all of the benefits of a lex/yacc-style table-driven approach.
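To make the context problem concrete, here is a minimal sketch in Python of a lexer that switches between rule tables depending on where it is in the document. It is an illustration only, not a real XML tokenizer: the mode names, token names and `tokenize` function are all hypothetical, and it covers only a tiny subset (start/end tags, quoted attribute values, character data) with no comments, CDATA sections, processing instructions, entity references or DTD handling.

```python
import re

# Hypothetical, heavily simplified rule tables: one per lexical context.
# Each rule is (token name, regex, mode to switch to, or None to stay put).
RULES = {
    "content": [
        ("TAG_OPEN", re.compile(r"</?"), "tag"),
        ("TEXT", re.compile(r"[^<]+"), None),
    ],
    "tag": [
        ("WS", re.compile(r"\s+"), None),
        ("NAME", re.compile(r"[A-Za-z_:][\w:.-]*"), None),
        ("EQ", re.compile(r"="), None),
        ("DQUOTE", re.compile(r'"'), "dq_value"),
        ("SQUOTE", re.compile(r"'"), "sq_value"),
        ("TAG_CLOSE", re.compile(r"/?>"), "content"),
    ],
    "dq_value": [
        ("ATTR_TEXT", re.compile(r'[^"<]+'), None),
        ("DQUOTE", re.compile(r'"'), "tag"),
    ],
    "sq_value": [
        ("ATTR_TEXT", re.compile(r"[^'<]+"), None),
        ("SQUOTE", re.compile(r"'"), "tag"),
    ],
}


def tokenize(text):
    """Return a list of (token name, lexeme) pairs, switching modes as we go."""
    mode = "content"
    pos = 0
    tokens = []
    while pos < len(text):
        # Only the rules of the current mode are tried, so the same character
        # can be a delimiter in one context and plain data in another.
        for name, pattern, next_mode in RULES[mode]:
            match = pattern.match(text, pos)
            if match:
                if name != "WS":  # whitespace inside a tag is skipped
                    tokens.append((name, match.group()))
                pos = match.end()
                if next_mode is not None:
                    mode = next_mode
                break
        else:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at offset {pos} "
                f"in mode {mode!r}"
            )
    return tokens


if __name__ == "__main__":
    sample = '<a b="x>y=\'z\'">1 = "2" > 0</a>'
    for token in tokenize(sample):
        print(token)
```

In the sample input, `=`, `>`, `"` and `'` each appear both as delimiters (inside the tag) and as ordinary data (inside the attribute value or the character content), and only the current mode tells the lexer which is which. Even this toy needs four rule tables; adding comments, CDATA sections, processing instructions and the DTD would each bring further modes, which is where the "half-dozen or so" lexers come from.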
-Steve Schafer