
Re: SAX question



Judi Thomson wrote:
> [snip]
> My task is to take a large XML file with multiple occurrences of a
> specific kind of element.
> I need to build an index that lets me retrieve instances of that element
> from the file (at a later date) very quickly.
>
> My thought for this was to use the locator in a SAX parser to identify
> the location of the start of the element tag in the file, record that
> location persistently and then use the persistent location later on to
> get my element (and its contents) back quickly.
>
> My questions are:
>
> Is there a SAX implementation for which the locator is known to be
> accurate?
> I noted somewhere (in the SAX specification?) that the locator is kind of
> intended for debugging more than document manipulation.

I think the main trouble you will have with the SAX approach is that the
locator will not give you the value you need to build an index.  According
to the SAX specification:

"If possible, the SAX driver should provide the line position of the first
character after the text associated with the document event"

So, for example, for a startElement() event the locator is likely to point
at the first character of the element content rather than the first
character of the element start tag.  If you plan to use the indexed
position to parse the fragment later, you will need the position of the
start of the tag.
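
If your parser does report the position just past the start tag, one
possible workaround is to convert the locator's line and column into a
byte offset and then scan backwards to the '<' that opens the tag.  Below
is a minimal C++ sketch of that idea.  The function names
offsetFromLineColumn and startTagOffset are hypothetical and not part of
any SAX implementation.  The sketch assumes the document is held in memory
as a std::string, that line and column are 1-based, that lines end in '\n',
and that each character occupies one byte.  It also assumes the document is
well-formed, so that a literal '<' cannot appear inside an attribute value.

```cpp
#include <cstddef>
#include <string>

// Convert a 1-based (line, column) pair into a 0-based byte offset in text.
// Assumes '\n' line endings and one byte per character (e.g. ASCII).
// Returns std::string::npos if the position lies outside the text.
std::size_t offsetFromLineColumn(const std::string& text, int line, int column)
{
    if (line < 1 || column < 1) {
        return std::string::npos;
    }
    std::size_t pos = 0;
    // Skip over (line - 1) newline characters to reach the requested line.
    for (int current = 1; current < line; ++current) {
        pos = text.find('\n', pos);
        if (pos == std::string::npos) {
            return std::string::npos;
        }
        ++pos;  // step past the newline
    }
    pos += static_cast<std::size_t>(column - 1);
    return pos <= text.size() ? pos : std::string::npos;
}

// Given the offset of the first character after a start tag, scan backwards
// to the '<' that opens that tag.  Returns std::string::npos if none is found.
std::size_t startTagOffset(const std::string& text, std::size_t afterTag)
{
    if (afterTag == 0 || afterTag > text.size()) {
        return std::string::npos;
    }
    return text.rfind('<', afterTag - 1);
}
```

You would call offsetFromLineColumn() with the values the locator reports
during startElement(), pass the result to startTagOffset(), and store that
offset in the index.  Whether this works depends on what your particular
parser actually reports, so check it against a few known documents first.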

> (and if it is written in c++ that makes my life MUCH easier)
Oh good, somebody else using my favorite language!

Regards
Rob Lugt
ElCel Technology