
   Re: [xml-dev] Exploiting multi-core CPUs during XML parsing


> Sounds a nice idea independent of any speed benefits. I toyed with the idea
> of parsing XSLT match patterns backwards at one stage, but didn't pursue
> it. My brain is wired for left-to-right reading...
> 
> As a matter of interest, is there any problem with decoding UTF-8 when
> reading backwards?
> 
> Shame that the spec doesn't require ">" to be escaped, I imagine this causes
> a fair problem with backtracking, for example how do you cope with
> 
> ...... <a><b/><c/></a>  -->
> 
> which might or might not be part of a comment?
> 
> IIRC we were able to read files backwards on ICL VME. That's history, but
> many of its features have reappeared in Windows 25 years later ...
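
A note on the UTF-8 question quoted above: backward decoding should not
be a problem in itself, because UTF-8 is self-synchronizing. Every
continuation byte has the bit pattern 10xxxxxx, so a reader can step
back over continuation bytes until it reaches a lead byte. A minimal
sketch, assuming valid UTF-8 input (the function name decode_backwards
is only illustrative):

```python
def decode_backwards(data: bytes):
    """Yield the characters of UTF-8 `data` from last to first.

    Assumes `data` is valid UTF-8; malformed input raises UnicodeDecodeError.
    """
    end = len(data)
    while end > 0:
        start = end - 1
        # Continuation bytes look like 0b10xxxxxx; step back to the lead byte.
        while start > 0 and (data[start] & 0xC0) == 0x80:
            start -= 1
        yield data[start:end].decode("utf-8")
        end = start


if __name__ == "__main__":
    sample = "<a>é€</a>".encode("utf-8")
    print("".join(decode_backwards(sample)))
```

Other encodings (some legacy multi-byte ones in particular) do not have
this property, so this only covers the UTF-8 case.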

I wrote a reverse parser in Pascal once as part of an editor project. It
was designed to determine the current context the user was working in.
In general, starting from a fixed position somewhere in the middle of a
document, the amount of branching and caching you had to do was
excessive, and in virtually all cases it led to scans back to the start
of the document. In Sean's case I imagine the parser would still have to
make a lot of assumptions until it reached the crossover position
between the two threads, at which point it might issue a fatal error (if
the assumptions turned out to be wrong).
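
To illustrate why the context tends to need a scan from the start, here
is a minimal sketch built on the quoted "-->" example. The function
context_at is hypothetical and deliberately simplified: it only knows
about comments and ignores CDATA sections, processing instructions and
the DTD.

```python
def context_at(doc: str, pos: int) -> str:
    """Classify offset `pos` as 'comment' or 'content'.

    Simplified: scans forward from the start of the document and only
    recognises <!-- ... --> comments.
    """
    i = 0
    while i < pos:
        start = doc.find("<!--", i)
        if start == -1 or start >= pos:
            return "content"
        end = doc.find("-->", start + 4)
        if end == -1 or pos < end + 3:
            return "comment"
        i = end + 3  # comment closed before pos; keep scanning
    return "content"


if __name__ == "__main__":
    tail = " <a><b/><c/></a> -->"
    plain = "<r>" + tail + "</r>"          # "-->" is legal character data
    commented = "<r><!--" + tail + "</r>"  # same tail, now inside a comment
    for doc in (plain, commented):
        print(context_at(doc, doc.index("<b/>")))
```

Both documents have identical text from "<a>" onwards, yet the answer
differs, so nothing local to the scan position can decide it.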

I wonder if you could instead work on a sequential (pipelined)
multi-threaded approach: one thread to handle decoding and chunking of
characters, another to handle parsing of lexical structures, and a third
(possibly at a driver level) to handle external well-formedness (WF)
checks (like checking character classes, name checks, and duplicate
attribute checks). Decoupling these pieces would allow you to very
easily turn off WF checks if you knew that the document was WF via an
out-of-band mechanism.
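
A rough sketch of that three-stage pipeline, under several assumptions:
all names (decode_stage, lex_stage, wf_stage, parse) are made up for
illustration, the lexer only cuts out "<...>" spans and drops character
data, and the only WF check shown is the duplicate-attribute one. In
CPython the threads mainly illustrate the decoupling rather than any
real multi-core speedup.

```python
import codecs
import queue
import re
import threading

DONE = object()  # sentinel marking end of stream

# Simplified attribute pattern: name = "value" or name = 'value'.
ATTR = re.compile(r"""([^\s=<>/"']+)\s*=\s*(?:"[^"]*"|'[^']*')""")


def decode_stage(raw_chunks, out):
    """Stage 1: decode byte chunks, coping with split multi-byte characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in raw_chunks:
        text = decoder.decode(chunk)
        if text:
            out.put(text)
    tail = decoder.decode(b"", final=True)
    if tail:
        out.put(tail)
    out.put(DONE)


def lex_stage(inp, out):
    """Stage 2: cut the text into raw tag tokens (character data is dropped)."""
    buf = ""
    for text in iter(inp.get, DONE):
        buf += text
        while True:
            lt = buf.find("<")
            if lt == -1:
                buf = ""  # only character data left
                break
            gt = buf.find(">", lt)
            if gt == -1:
                buf = buf[lt:]  # incomplete tag: wait for the next chunk
                break
            out.put(buf[lt:gt + 1])
            buf = buf[gt + 1:]
    out.put(DONE)


def wf_stage(inp, out, errors):
    """Stage 3 (optional): record duplicate attributes, pass tokens through."""
    for tag in iter(inp.get, DONE):
        if not tag.startswith(("</", "<!", "<?")):
            names = ATTR.findall(tag)
            for name in sorted({n for n in names if names.count(n) > 1}):
                errors.append(f"duplicate attribute {name!r} in {tag}")
        out.put(tag)
    out.put(DONE)


def parse(raw_chunks, check_wf=True):
    """Run the pipeline; with check_wf=False the third thread is never started."""
    text_q, tag_q = queue.Queue(), queue.Queue()
    errors = []
    stages = [
        threading.Thread(target=decode_stage, args=(raw_chunks, text_q)),
        threading.Thread(target=lex_stage, args=(text_q, tag_q)),
    ]
    final_q = tag_q
    if check_wf:
        final_q = queue.Queue()
        stages.append(
            threading.Thread(target=wf_stage, args=(tag_q, final_q, errors)))
    for t in stages:
        t.start()
    tags = list(iter(final_q.get, DONE))
    for t in stages:
        t.join()
    return tags, errors


if __name__ == "__main__":
    doc = '<r><a x="1" x="2"/><b y="é"/></r>'.encode("utf-8")
    chunks = [doc[i:i + 5] for i in range(0, len(doc), 5)]
    print(parse(chunks))
    print(parse(chunks, check_wf=False))
```

Because the WF stage only reads from one queue and writes to another,
skipping it is just a matter of not wiring it in, which is the point of
decoupling the pieces.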


Cheers,
Jeff Rafter




 


Copyright 2001 XML.org. This site is hosted by OASIS