OASIS Mailing List Archives

   Re: [xml-dev] parser models


Amelia A Lewis <amyzing@talsever.com> wrote:

| Building a lexical API on top of a syntactic one is ... backwards. 

Yep.  SAX, e.g., is based on ESIS, which is a syntactic API spec.

| It is perfectly easy to imagine, for instance, LAX: the lexical API for
| XML.  This would have different sorts of events, though.  Perhaps it
| would have "leftPointyBracket()" and "nameCharacters(char [])" and
| "tagWhitespace(char [])" and "attributeValue(char, char [])".

Well, both SGML and XML have lexical specifications (e.g. the ISO 8879
productions http://www.oreilly.com/people/staff/crism/sgmldefs.html and
the productions in the XML spec document).  SGML actually defines things
in terms of an _abstract syntax_.  For instance, a start-tag begins with a
STAGO and ends (usually) with a TAGC, in between picking up stuff
like names, VI (value indicator), LIT, LITA and the like.  (The delimiters
are bound to a _concrete syntax_ in the SGML declaration; that's how "<"
is STAGO, "=" is VI, ">" is TAGC, etc.  XML disallows variant concrete syntaxes,
instead fixing the syntax to the bindings of the _Reference Concrete
Syntax_.)  So, it's possible to associate categories with token "events"
and define an API at that level: tokenization only.
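
To make the "tokenization only" idea concrete, here is a rough sketch of a
push-style tokenizer that labels each token with its SGML delimiter role.
Everything in it is an assumption made for illustration: the names
`TOKEN_RE` and `tokenize`, the `handler(role, text)` callback shape, and the
simplified NAME pattern are not taken from any spec or existing API. It only
covers a single start-tag, and it lumps each literal together with its LIT
or LITA delimiters rather than reporting them separately.

```python
import re

# Hypothetical sketch: role names come from SGML's abstract syntax, bound
# to the fixed XML strings. Covers only the pieces of one start-tag.
TOKEN_RE = re.compile(r"""
    (?P<STAGO> <                   )   # start-tag open
  | (?P<TAGC>  >                   )   # tag close
  | (?P<VI>    =                   )   # value indicator
  | (?P<LIT>   "[^"]*"             )   # literal in LIT delimiters
  | (?P<LITA>  '[^']*'             )   # literal in LITA delimiters
  | (?P<NAME>  [A-Za-z_:][\w:.\-]* )   # simplified name, not the XML production
  | (?P<S>     \s+                 )   # whitespace inside the tag
""", re.VERBOSE)


def tokenize(source, handler):
    """Push each token to handler(role, text); no structure is checked."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unrecognized input at offset {pos}")
        handler(match.lastgroup, match.group())
        pos = match.end()


# Usage: collect the events for one start-tag.
source = "<a href=\"x\" id='y'>"
events = []
tokenize(source, lambda role, text: events.append((role, text)))
```

Note that the handler sees whitespace and quoting style, which a syntactic
API such as SAX would normally discard.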

| I don't know if an in-memory API corresponding to such a ... lax parse
| (oh, re ... lax.  You knew that was coming, right?) is possible,
| though. 

A push API shouldn't be too difficult.  By in-memory do you mean some
analogue of DOM, where all the tokens are held in a structure of some sort
(like a parse tree)? 
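
For what it's worth, the simplest in-memory version I can picture is just a
flat list of tokens that can be written back out byte for byte. The sketch
below is hypothetical throughout (`Token`, `lex`, `serialize` and the role
labels CHARS and STRAY are made up here), and it is context-free, so it will
happily call an "=" in character data a VI. A real lexer would need SGML-style
recognition modes to avoid that.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    role: str     # delimiter role or a made-up label (CHARS, STRAY)
    text: str     # exact source characters
    offset: int   # position in the source string


# Hypothetical, context-free pattern: every character matches some branch,
# so the token list always covers the whole input.
_PATTERN = re.compile(r"""
    (?P<ETAGO>   </                )   # end-tag open
  | (?P<STAGO>   <                 )   # start-tag open
  | (?P<TAGC>    >                 )   # tag close
  | (?P<VI>      =                 )   # value indicator
  | (?P<LITERAL> "[^"]*" | '[^']*' )   # quoted literal, delimiters included
  | (?P<S>       \s+               )   # whitespace
  | (?P<CHARS>   [^<>="'\s]+       )   # any other run of characters
  | (?P<STRAY>   .                 )   # e.g. an unterminated quote
""", re.VERBOSE | re.DOTALL)


def lex(source):
    """Return all tokens of source as a flat list."""
    return [Token(m.lastgroup, m.group(), m.start())
            for m in _PATTERN.finditer(source)]


def serialize(tokens):
    """Rebuild the original text from the token list."""
    return "".join(token.text for token in tokens)


# Usage: hold a small document in memory as tokens.
document = '<p class="x">hi there</p>'
tokens = lex(document)
```

Since nothing is thrown away, `serialize(lex(text))` gives back `text`
unchanged, which is the property a DOM-like structure built from syntactic
events generally cannot promise.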




 


Copyright 2001 XML.org. This site is hosted by OASIS