At 10:47 AM -0500 1/16/04, Bob Wyman wrote:
> I believe that no matter how strict or liberal a system may be
> in what input from another system it is willing to process or pass on,
> it still must be very "liberal" in ensuring that it can accept a wide
> range of invalid inputs without being damaged by those inputs. (Buffer
> overflows, etc.) Thus, even the strictest, most conservative system
> must first be "liberal" in accepting input before it can take the
> opportunity to determine what it will reject, clean-up, or process as
> received.
Absolutely. In an XML context, a parser does not assume that the
document is well-formed. It checks everything it can possibly check,
and accepts as input any stream of characters, including characters
that are illegal in XML. Most parsers also operate on streams of
bytes and accept absolutely any bytes. The strict nature of XML, and
the attention paid to well-formedness, means that it's relatively
hard to slip in damaging data by violating the assumptions about the
input.
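To make that concrete, here's a minimal sketch using the standard JAXP
SAX API (the class name, byte values, and messages are mine, purely for
illustration): you can hand the parser absolutely any bytes, and it
either proves they're well-formed or rejects them without ever trusting
them.

    import java.io.ByteArrayInputStream;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.SAXParseException;
    import org.xml.sax.helpers.DefaultHandler;

    public class WellFormednessCheck {
        public static void main(String[] args) throws Exception {
            SAXParser parser =
                SAXParserFactory.newInstance().newSAXParser();
            // Arbitrary bytes, including one (0x00) that is never
            // legal in an XML document
            byte[] junk = { '<', 'a', '>', 0x00, '<', '/', 'a', '>' };
            try {
                parser.parse(new ByteArrayInputStream(junk),
                             new DefaultHandler());
                System.out.println("well-formed");
            } catch (SAXParseException ex) {
                // The parser consumed the bytes safely and rejected them
                System.out.println("rejected: " + ex.getMessage());
            }
        }
    }

The bytes go in, the parser reads them without being damaged, and the
only possible outcomes are "proved well-formed" or "rejected with a
diagnostic."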
It's certainly possible to send data that the parser vendor did not
anticipate. However, if that data does not match the XML grammar, the
parser will reject it. The very nature of an XML parser is to prove
(almost if not quite mathematically) that a certain sequence of
characters satisfies the grammar. Parsers for other formats (as well as
many fast pseudo-XML parsers that have not been widely adopted in
practice) are often implemented the other way around: they assume the
data looks like what they expect and try to read it without actually
checking it first. This is one way security holes arise.
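For contrast, here's the kind of pseudo-parser I mean (a hypothetical
sketch, not any particular product): it slices out whatever sits between
the first '>' and the last '<' and never verifies anything, so mismatched
tags and illegal characters sail right through instead of being rejected.

    public class NaiveReader {
        // Grabs whatever sits between the first '>' and the last '<';
        // no well-formedness checking of any kind.
        static String naiveExtract(String data) {
            int start = data.indexOf('>');
            int end = data.lastIndexOf('<');
            if (start < 0 || end <= start) return "";
            return data.substring(start + 1, end);
        }

        public static void main(String[] args) {
            System.out.println(naiveExtract("<price>10</price>"));
            // Mismatched tags and a NUL character pass through
            // silently; a conforming XML parser would reject this
            // input outright.
            System.out.println(naiveExtract("<price>10\u0000</prise>"));
        }
    }

Code like that "works" on good data and does something unpredictable on
bad data, which is exactly where the holes come from.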
Of course XML parsers can and do have bugs. However, when they do,
it's very easy to point at the spec and tell the vendor, "Your parser
is buggy. Fix it." When it comes to basic well-formedness checking,
the major parsers today have very few, if any, bugs. The only ones I
can think of off the top of my head all involve parsers being too
strict and rejecting data they should accept, rather than the other
way around.
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA