[Date Prev]
| [Thread Prev]
| [Thread Next]
| [Date Next]
--
[Date Index]
| [Thread Index]
Re: [xml-dev] RE: XML parsers use what computational power?
- From: Liam R E Quin <liam@w3.org>
- To: "Costello, Roger L." <costello@mitre.org>
- Date: Wed, 10 Apr 2013 17:29:52 -0400
On Wed, 2013-04-10 at 09:43 +0000, Costello, Roger L. wrote:
> Hi Folks,
>
> Resolving these entities:
>
> <!ENTITY ha1 "ha">
> <!ENTITY ha2 "&ha1;&ha1;">
> ...
> <!ENTITY ha128 "&ha127;&ha127;">
>
> requires an amount of memory that is exponential to the number of entities.
>
> Therefore an XML parser requires more computational power than a linear bounded automata [1].
That does not follow at all.
Note also that any language with string concatenation has the same issue
- e.g. JavaaScript:
var boy = "Simon";
var twoboys = boy + boy;
var four = towboys + twoboys;
var eight .... and so on
But the algorithmic complexity of a parser is not dependent in any way
on the quantity of input.
If a parser is linear with respect to input tokens, that means that if
you double the amount of input then processing takes roughly twice as
long, but this is still linear.
> Furthermore:
>
> Linear bounded automata are acceptors for
> the class of context-sensitive languages. [1]
>
> Therefore XML is more powerful than a context-sensitive language.
I think your "therefore" is a bit iffy here.
if p implies q, and not(p), we cannot say, therefore not q.
My understanding is that XML is at worst LL(2), not LL(n), in parser
complexity. So it can be parsed by a deterministic finite state
automaton, as described e.g. in the Dragon Book, the standard text for
such things (Aho, Hopcraft & Ullman). The 2 is because you may need two
tokens of look-ahead before you can reduce, in a shift-reduce parser,
although that's a supposition on my part.
Please, let's not spread fear-uncertainty-doubt around without good
cause. The "billion laughs" attack you mentioned was a nonsense paraded
triumphantly by people who hated XML, when the languages they preferred
(SGML and JavaScript) were susceptible to the same attack and had the
same simple fix:
if (this would use too much memory) {
don't do it.
}
Liam
--
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
[Date Prev]
| [Thread Prev]
| [Thread Next]
| [Date Next]
--
[Date Index]
| [Thread Index]