>
>> On the standard textual XML front: As has been noted, Xerces and
>> Woodstox can be made to run quite fast, but in practice, few
>> people know how to configure them accordingly, and to do so
>> reliably, and without conformance compromises.
>
> A red herring. Xerces' defaults are an issue unrelated to the
> merits of stimulating software developers to use modern C++
> features instead of sticking to slow 90's features.
>
> (In any case, these optimisations are potentially also applicable
> to binary XML parsing as well as to real XML processing.)
>
>> Most users can't afford to study the complex reliability vs.
>> performance interactions of myriads of more or less static tuning
>> knobs.
>
> Same fish.

Fish or not, it reflects the priorities of reality. It's not good
enough to just provide low-level infrastructure regardless of
usability concerns.

The bottom line is that, more often than not, parser performance
problems result from people running the parser with an inappropriate
configuration. Why? Because the APIs are typically a huge, complex
mess, designed with little regard for clarity or performance. As a
user, how do I cache DTDs or schemas? How can I safely reuse parser
data structures in an efficient, thread-safe, memory-bounded manner?
How do I deal with parsers that implement poorly specified, ambiguous
"standard" interfaces in more or less subtly different ways? If it
isn't obvious how to take full advantage of a parser's theoretical
performance capabilities, it mostly won't happen in reality, and no
amount of internal SSE optimization will change that.
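
To make the caching and reuse questions concrete, here is a minimal
sketch of what a user currently has to work out for themselves with
the stock JAXP API in the JDK. It is an illustration only, not a
recommended or tuned configuration. The class name XmlReuse, its
methods and the tiny inline schema are hypothetical; the sketch
assumes the JDK's built-in parser, where a compiled Schema may be
shared between threads while Validator and SAXParser instances may
not.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: compile the schema once, keep one parser per thread. */
final class XmlReuse {
    // Illustrative schema only; a real application would load its own.
    private static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='item' type='xs:string' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Compiling a schema is expensive, so it is done once and shared.
    private static final Schema SCHEMA = compileSchema(XSD);

    // The factory is configured once and only used under a lock.
    private static final SAXParserFactory FACTORY = newFactory();

    // SAXParser instances are not thread-safe: one per thread.
    private static final ThreadLocal<SAXParser> PARSER =
        ThreadLocal.withInitial(XmlReuse::newParser);

    private static Schema compileSchema(String xsd) {
        try {
            SchemaFactory sf =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return sf.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    private static SAXParserFactory newFactory() {
        SAXParserFactory f = SAXParserFactory.newInstance();
        f.setNamespaceAware(true);
        return f;
    }

    private static SAXParser newParser() {
        try {
            synchronized (FACTORY) {
                return FACTORY.newSAXParser();
            }
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("cannot create parser", e);
        }
    }

    /** Validates against the cached schema; a Validator is created per call. */
    static boolean isValid(String xml) {
        Validator validator = SCHEMA.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed or not schema-valid
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts start tags using the calling thread's reusable parser. */
    static int countElements(String xml) {
        SAXParser parser = PARSER.get();
        parser.reset(); // back to the factory's original configuration
        AtomicInteger count = new AtomicInteger();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count.incrementAndGet();
            }
        };
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("unparseable input", e);
        }
        return count.get();
    }
}
```

Even this small case needs a page of boilerplate and knowledge of
which objects are safe to share, and it does not yet touch DTD
caching or memory bounds.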

A new performance-oriented parser implementation must come with a
straightforward API, or else it will matter little in practice.

Wolfgang.