XML.org: OASIS Mailing List Archives

Re: [xml-dev] Which algorithm do XML Schema validators use to decide if a string matches a regular expression?

On 26/01/2011 15:56, Costello, Roger L. wrote:
> Hi Folks,
>
> It is my understanding that there are 3 flavors of regular expression parsers [1]:
>
> 1. Nondeterministic Finite Automaton (NFA)
>
> 2. Deterministic Finite Automaton (DFA)
>
> 3. Backtracking
>
> Which flavor of regular expression parser does SAXON use?
>

Saxon uses the DFA algorithm described in

http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html

modified by a system of counters to handle minOccurs/maxOccurs 
constraints, which is inspired by subsequent work by Thompson and Tobin:

http://www.cogsci.ed.ac.uk/~ht/XTech_2006_paper.pdf

but does not follow it slavishly.
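
As a rough illustration of what "a system of counters" can mean, here is a minimal sketch. It is not Saxon's code and not the algorithm from either paper. It assumes a content model that is a flat sequence of element particles, each with minOccurs/maxOccurs, and it assumes the model is unambiguous (as the Unique Particle Attribution rule requires), so consuming greedily is always correct. The names `Particle` and `matches` are hypothetical. The point is that one state per particle plus a counter replaces the many states you would get by unrolling maxOccurs.

```python
from typing import List, Optional, Tuple

# Hypothetical particle: (element name, minOccurs, maxOccurs).
# maxOccurs of None stands for "unbounded".
Particle = Tuple[str, int, Optional[int]]


def matches(particles: List[Particle], children: List[str]) -> bool:
    """Match child element names against a sequence of particles.

    Assumes the model is unambiguous, so each input symbol determines
    the next move and no backtracking is needed.
    """
    index = 0  # current particle, i.e. the automaton state
    count = 0  # occurrences of the current particle seen so far
    for name in children:
        while True:
            if index == len(particles):
                return False  # input left over after the last particle
            pname, lo, hi = particles[index]
            if name == pname and (hi is None or count < hi):
                count += 1  # stay in this state, bump the counter
                break
            if count < lo:
                return False  # minOccurs not yet satisfied
            index, count = index + 1, 0  # advance and reset the counter
    # End of input: every remaining particle must be allowed to stop here.
    while index < len(particles):
        if count < particles[index][1]:
            return False
        index, count = index + 1, 0
    return True
```

For example, with the hypothetical model `[("a", 2, 3), ("b", 0, None)]`, the input `["a", "a", "b", "b"]` is accepted, while `["a", "b"]` is rejected because `a` has not reached its minOccurs of 2.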

I'm not sure about your three categories, by the way. I think that if 
you use an NFA then you need some kind of backtracking (either that or 
you investigate multiple forwards paths in parallel, which amounts to 
the same thing.)
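
The two ways of running an NFA mentioned above can be sketched side by side. This is only an illustration: the automaton is a hand-written, hypothetical NFA for the regular expression `(a|b)*ab`, with no epsilon transitions, and the function names are made up for this example.

```python
from typing import Dict, Set, Tuple

# Hypothetical NFA for (a|b)*ab: state 0 is the start, state 2 accepts.
# In state 0 the symbol "a" is a genuine choice: stay in 0 or move to 1.
NFA: Dict[Tuple[int, str], Set[int]] = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
START = 0
ACCEPTING = {2}


def accepts_backtracking(text: str, state: int = START) -> bool:
    """Follow one path at a time, backing up when a choice fails."""
    if not text:
        return state in ACCEPTING
    return any(
        accepts_backtracking(text[1:], nxt)
        for nxt in NFA.get((state, text[0]), ())
    )


def accepts_parallel(text: str) -> bool:
    """Track every reachable state at once, one input symbol per step."""
    current = {START}
    for ch in text:
        current = {nxt for s in current for nxt in NFA.get((s, ch), ())}
    return bool(current & ACCEPTING)
```

Both functions accept and reject the same strings. The difference is in the bookkeeping: the backtracking version may revisit the same input position once per failed choice, while the parallel version reads each symbol once and carries a set of states forward. Each such set corresponds to a single state of the DFA that the subset construction would build from this NFA.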

Michael Kay
Saxonica

