Someone wrote doubting that parsers that accept non-well-formed XML really exist,
but we use a couple of them in our editor: we have to, in order to allow import,
indenting, auto-correction and so on.
For a paper on one of the approaches, "Editor's Concrete Syntax", see
Also, several years ago I wrote a C parser for a language STAX that did the
same kind of minimization: the rationale being that compression might be too
much processing for some uses. It is floating about somewhere too.
This kind of thing is fine, as long as we don't call them XML parsers.
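
To give a rough sense of how little machinery it takes for a minimal parser to tolerate comments, PIs and a DOCTYPE declaration instead of choking on them, here is a small sketch in Python. The function names (skip_markup, strip_skippable) are made up for illustration and are not taken from any of the parsers mentioned here; it assumes the whole document is held in one string, it does no well-formedness checking at all, and it treats the XML declaration like any other PI.

```python
def skip_markup(src: str, i: int) -> int:
    """Return the index just past a comment, PI or DOCTYPE starting at i.

    Returns i unchanged if none of those constructs starts at i.
    Unterminated constructs are skipped to the end of the input.
    """
    n = len(src)
    if src.startswith("<!--", i):
        end = src.find("-->", i + 4)
        return n if end < 0 else end + 3
    if src.startswith("<?", i):
        end = src.find("?>", i + 2)
        return n if end < 0 else end + 2
    if src.startswith("<!DOCTYPE", i):
        depth = 0      # nesting of [ ... ] for the internal subset
        quote = None   # current quote character inside a literal, if any
        j = i + len("<!DOCTYPE")
        while j < n:
            # Comments and PIs in the internal subset may contain quotes
            # or brackets, so skip them as a unit.
            if quote is None and (src.startswith("<!--", j) or src.startswith("<?", j)):
                j = skip_markup(src, j)
                continue
            c = src[j]
            if quote is not None:
                if c == quote:
                    quote = None
            elif c in "\"'":
                quote = c
            elif c == "[":
                depth += 1
            elif c == "]":
                depth -= 1
            elif c == ">" and depth == 0:
                return j + 1
            j += 1
        return n
    return i


def strip_skippable(src: str) -> str:
    """Copy src, dropping comments, PIs and the DOCTYPE declaration."""
    out = []
    i = 0
    n = len(src)
    while i < n:
        # CDATA content is character data, so copy it through untouched.
        if src.startswith("<![CDATA[", i):
            end = src.find("]]>", i + 9)
            stop = n if end < 0 else end + 3
            out.append(src[i:stop])
            i = stop
            continue
        j = skip_markup(src, i)
        if j > i:
            i = j
        else:
            out.append(src[i])
            i += 1
    return "".join(out)
```

A minimal parser could call something like strip_skippable as a pre-pass, or call skip_markup from its main loop whenever it sees "<!" or "<?", and still handle only elements, attributes and text in the rest of its code.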
Really, how much does it cost a parser to skip PIs or to skip over a DOCTYPE
declaration? Just a handful of extra states or rows in a state table. I think
it is fine for a profile to say "don't use comments" but another thing to
write a parser in which comments break your parse: there are lots of
legitimate reasons why you might want to continue to parse a bad document
(error recovery, reporting, repair) but almost none for failing on a good