Daniel Veillard wrote:
> Right, I think it's time to bite. If you want high speed parsing
> don't use entities, okay. This means don't use entities *in the instances*.
In a Web Service environment, the server has no control
over whether or not incoming documents use entity references.
And there's a much bigger issue than performance:
entity references in the instance require the parser
to do I/O. That's a security hole waiting to happen,
no matter how careful you are.
> This is pure nonsense. Now, about being afraid of recursive entity
> references in the internal subset leading to possible DoS: first,
> your service is still vulnerable to DoS with infinite input or high
> rates of input requests/data in completely similar ways; second, if the
> recursion frightens you, simply put a guard on the depth of the recursion
> like I did in libxml2. Nobody ever complained about it, and such recursion
> is immediately detected and the parser halts with an error.
There's a bigger issue than the billion laughs attack.
Sure, you can limit the recursion depth to prevent this attack,
and you can run the parser/entity manager in a sandbox to prevent
access to local files and outgoing network connections, and
you can follow xml-dev and the CERT mailing lists to stay
informed of any new entity-related attacks that someone discovers.
Or you could use a parser for an XML subset that doesn't
support entity references to begin with. That's what I'd do.
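
To make that concrete, here is a minimal sketch in Python, assuming the
standard library's expat bindings. It is not an implementation of any
particular XML subset. It only refuses every document that carries a
DOCTYPE declaration, so no entity can be declared and no external
subset is ever fetched. The names parse_without_entities and
DoctypeForbidden are made up for this illustration.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Hypothetical error type: the document carries a DOCTYPE declaration."""


def parse_without_entities(data: bytes) -> list:
    """Parse `data` and return the element names in document order.

    Any DOCTYPE declaration aborts the parse, so the document has no
    way to declare internal or external entities.
    """
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising here stops the parse before any entity declaration
        # in the internal subset is processed.
        raise DoctypeForbidden("DOCTYPE %r is not allowed" % name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)  # True marks this as the final chunk
    return names
```

A plain document such as b"<a><b/></a>" parses normally, while
anything starting with <!DOCTYPE ...> is rejected up front. The
predefined entities (&amp;amp;, &amp;lt; and so on) and character references
still work, since they need no declaration.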
> I seriously think that at least those 2 arguments don't stand in the
> face of real code engineering, especially if you follow basic good
> coding practices. It doesn't mean that defining one subset of XML
> might not be a worthy exercise, but those justifications are IMHO
> totally improper to guide this work.
In a Web Service implementation, it's not a matter of
"if you don't need the feature then don't use it".
For entity references, it's "don't need them, don't
want them, must not use them". Good coding practice
tells me that there are no bugs or security holes in
code that isn't there, so that's what I'd do.
That said, I also believe that defining Yet Another XML
subset is a waste of time. Not because it isn't needed,
but because it's already been done:
<URL: http://www.textuality.com/xml/xmlSW.html >
and Tim Bray has done a better job of it than anyone else
can hope to.