- From: David Brownell <david-b@pacbell.net>
- To: Paul Tchistopolskii <paul@qub.com>, Eric van der Vlist <vdv@dyomedea.com>, xml-dev@lists.xml.org
- Date: Wed, 06 Dec 2000 08:53:30 +0000
> > Which productions -- the lexical ones, or the grammatical ones? I count
> > two layers there. (Evidently from its SGML heritage, XML doesn't have
> > the cleanest of distinctions between those layers, but it exists.) The
> > SAX API is basically a grammatical layer.
>
> Sorry for side-effect, but why do you, people, call SAX API a 'parser' or
> 'grammatical layer' ?
"Parser" is a word that's fuzzy around the edges, but it always
involves processing input according to some grammar. Compiler
textbooks cover the distinction between lexing and parsing.
> In the existence of yacc and lex - I think SAX API is a lexer.
> It returns lexemes. Tokens.
A lexer returns tokens in order -- all of them. You'd see "%foo;" be
reported, and never interpreted. In no way is SAX a lexical API;
it provides syntactic interpretation ("start element" etc).
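To make the difference concrete, here is a minimal sketch, assuming
Python's standard xml.sax module as a stand-in for any SAX parser. The
lex() function and its regular expression are hypothetical and
deliberately crude; they are not part of SAX or of any real XML
tokenizer, and only show what a flat token stream looks like next to
the interpreted events a SAX handler receives.

```python
import re
import xml.sax

DOC = '<doc id="1">hi</doc>'

# Hypothetical toy tokenizer: splits the text into lexical pieces in
# document order and attaches no meaning to any of them.
TOKEN = re.compile(r'</|<|>|=|"[^"]*"|[^<>="\s]+')


def lex(text):
    """Return every token, in order, uninterpreted."""
    return TOKEN.findall(text)


class Recorder(xml.sax.ContentHandler):
    """Record the syntactic events that the SAX parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("startElement", name, dict(attrs.items())))

    def characters(self, content):
        self.events.append(("characters", content))

    def endElement(self, name):
        self.events.append(("endElement", name))


def sax_events(text):
    """Parse the text and return the list of recorded SAX events."""
    handler = Recorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    print(lex(DOC))         # flat pieces such as '<', 'doc', 'id', '=', ...
    print(sax_events(DOC))  # start element, characters, end element
```

The token list still contains the angle brackets, the "=" and the
quoted attribute value as separate items. The SAX events have already
grouped them into an element with an attribute map, which is the
grammatical interpretation.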
Parsing builds some higher level model out of token streams.
Something like YACC is irrelevant for XML, since the model
inherited from SGML isn't well-enough factored; you can't
use such tools, there are too many funky special cases.
One XML example: "%foo;" is turned into a stream of tokens
(that fudges some nastiness, note!) inside DTDs, a context defined
by the parser, but is passed through unaltered outside the DTD.
That decision is made inside the parser.
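The "outside DTD" half of that example can be sketched as follows,
again assuming Python's xml.sax module as the SAX implementation. The
TextCollector class and element_text() helper are names made up for
this illustration. The sketch covers only element content; it does not
exercise parameter entity expansion inside a DTD.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collect character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # Parsers may deliver text in several pieces, so keep them all.
        self.chunks.append(content)


def element_text(xml_text):
    """Return all character data in the document as one string."""
    handler = TextCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.chunks)


if __name__ == "__main__":
    # "%foo;" has no special meaning in element content, while the
    # general entity reference "&amp;" is interpreted by the parser.
    print(element_text("<doc>%foo; &amp; more</doc>"))
```

In element content the parser treats "%foo;" as ordinary text, yet
in the same spot it replaces "&amp;" with "&". Which characters count
as markup depends on the context the parser is in.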
It's long been known that a SAX2 extension exposing lexical
events could be defined ... but nobody's been motivated to
work on one, so far as I know, since so few applications need
to see that kind of data.
- Dave