   Re: XML parser using lex & yacc

  • From: Richard Tobin <richard@cogsci.ed.ac.uk>
  • To: "Alastair Sumner" <als2000@postmaster.co.uk>, xml-dev@ic.ac.uk
  • Date: Wed, 1 Sep 1999 16:47:51 +0100

> I want to develop an XML parser in C or maybe C++ for an
> undergraduate university project. My approach will be to prototype
> the parser using flex and bison. As I understand it, flex won't be
> able to handle all of the character encodings required in the
> 1.0 spec.

Using your own lexer may be the best approach, but all the "syntax
characters" of XML are plain ASCII, so it might well be possible to
use [f]lex to tokenise it.  For UTF-8 it is straightforward: the lexer
doesn't even have to know that multibyte characters are not just
multiple characters; the next level up can translate them.
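
To illustrate the "next level up", here is a minimal sketch in C of a
decoder that turns the bytes a byte-oriented lexer passed through into
code points.  The function name utf8_decode and its return convention
are assumptions made for this example; they are not part of flex or of
any parser mentioned in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with at
   most n bytes available.  On success, store the code point in *cp and
   return the number of bytes consumed.  Return 0 if the sequence is
   malformed, truncated, overlong, or encodes a surrogate. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len;
    uint32_t c, min;

    if (n == 0)
        return 0;

    if (s[0] < 0x80) {                     /* plain ASCII, e.g. '<' or '&' */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {    /* 110xxxxx: 2-byte sequence */
        len = 2; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {    /* 1110xxxx: 3-byte sequence */
        len = 3; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {    /* 11110xxx: 4-byte sequence */
        len = 4; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                          /* stray continuation or invalid lead byte */
    }

    if (n < len)
        return 0;                          /* input ends mid-character */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)         /* continuation bytes are 10xxxxxx */
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, values beyond U+10FFFF, and surrogates. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;

    *cp = c;
    return len;
}
```

This works alongside a byte-oriented lexer because every byte of a
multibyte UTF-8 sequence is 0x80 or above, so none of them can be
mistaken for an ASCII syntax character.  The lexer can treat such bytes
as ordinary name or text characters, and the decoder can later check
them against the character classes in the spec.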

Or you might be able to replace the lexer's input functions and change
its character type to integer (if it isn't already); this would work
for UTF-16 (the other required encoding) too.
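
A rough sketch of such a replacement input function in C, assuming
big-endian UTF-16 held in memory.  The struct utf16_input, the function
next_char and the IN_EOF/IN_BAD codes are hypothetical names invented
for this example; how (or whether) a given lexer generator lets you
plug in a function like this depends on the tool and is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define IN_EOF (-1)   /* no more input */
#define IN_BAD (-2)   /* truncated unit or unpaired surrogate */

/* Hypothetical input source: a big-endian UTF-16 buffer and a cursor. */
struct utf16_input {
    const unsigned char *buf;
    size_t len;       /* total length in bytes */
    size_t pos;       /* offset of the next unread byte */
};

/* Read one 16-bit code unit, or return IN_EOF / IN_BAD. */
static int32_t read_unit(struct utf16_input *in)
{
    if (in->pos == in->len)
        return IN_EOF;
    if (in->len - in->pos < 2)
        return IN_BAD;                      /* odd trailing byte */
    int32_t u = ((int32_t)in->buf[in->pos] << 8) | in->buf[in->pos + 1];
    in->pos += 2;
    return u;
}

/* Return the next character as an integer code point, combining a
   surrogate pair into a single value above U+FFFF. */
int32_t next_char(struct utf16_input *in)
{
    int32_t hi = read_unit(in);
    if (hi < 0)
        return hi;                          /* IN_EOF or IN_BAD */
    if (hi >= 0xDC00 && hi <= 0xDFFF)
        return IN_BAD;                      /* low surrogate with no high one */
    if (hi < 0xD800 || hi > 0xDBFF)
        return hi;                          /* ordinary 16-bit character */

    int32_t lo = read_unit(in);
    if (lo < 0xDC00 || lo > 0xDFFF)
        return IN_BAD;                      /* high surrogate not followed by low */
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}
```

Because the return type is an integer wider than char, the ASCII syntax
characters keep their usual values (0x3C for "<" and so on), while
everything else arrives as a single value instead of two or four bytes.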

The most obvious problem with using yacc/lex type tools for XML is
that keywords aren't always keywords.  For example, in some places
in the DTD "SYSTEM" is a keyword and in others it would just be
a name.  You can have the parser switch the lexer between states
but it's not pretty.
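
The state switching can be sketched in plain C as follows.  This is a
hand-written stand-in, not flex or bison code: the enum names and the
function classify_name are made up for the example, and it assumes the
parser tells the lexer when it has reached a position where an external
identifier may begin.  With flex itself, start conditions are the usual
way to get the same effect.

```c
#include <string.h>

/* Token kinds the lexer can hand to the parser. */
enum token { TOK_NAME, TOK_SYSTEM };

/* State set by the parser before it asks for the next token. */
enum lex_state {
    LEX_NORMAL,        /* any name-shaped text is just a name */
    LEX_EXTERNAL_ID    /* "SYSTEM" is a keyword at this position */
};

/* Classify name-shaped text according to the current lexer state. */
enum token classify_name(const char *text, enum lex_state state)
{
    if (state == LEX_EXTERNAL_ID && strcmp(text, "SYSTEM") == 0)
        return TOK_SYSTEM;
    return TOK_NAME;
}
```

For a declaration such as

    <!ENTITY SYSTEM SYSTEM "foo.xml">

the parser would request the first SYSTEM in LEX_NORMAL state and get a
name (the entity is called SYSTEM), then switch to LEX_EXTERNAL_ID and
get the keyword.  The awkward part is that the parser has to change the
state before the lexer reads ahead, which is what makes the approach
less than pretty with generated lexers that buffer their input.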

-- Richard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)






 
