Re: should all XML parsers reject non-deterministic content models?

From: Joe English <jenglish@flightlab.com>
To: xml-dev@lists.xml.org
Date: Sun, 14 Jan 2001 09:58:16 -0800

TAKAHASHI Hideo wrote:

> I understand that the XML 1.0 spec prohibits non-deterministic (or,
> ambiguous) content models (for compatibility, to be precise).
> Is all XML 1.0 compliant XML processing software required to reject
> DTDs with such content models?

No: a processor can ignore the DTD entirely and still be compliant.
And since the prohibition against non-deterministic content models
appears in a non-normative appendix, I would presume that conforming
DTD-aware processors are not required to detect this condition either.
Even in full SGML, ambiguous content models are a "non-reportable
markup error", i.e., parsers don't need to detect this condition.

> Ambiguous content models don't cause any problems when you construct a
> DFA via an NFA.  I have heard that there is a way to construct DFAs
> directly from regexps without making an NFA, but that method can't
> handle non-deterministic regular expressions.

There are many, many other ways to validate documents against content
models though.  Take a look at James Clark's TREX implementation,
which has no problem with ambiguity, and also efficiently handles
intersection, negation, and interleaving of content models
(the first two of which are *very* expensive in a DFA-based
approach).
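
As a rough illustration of one such alternative, here is a minimal
sketch of matching by pattern derivatives. It is not taken from TREX;
the class and function names are made up for the example, and it covers
only sequence, choice, and repetition. The point is that an ambiguous
model such as ((a, b) | (a, c)) needs no special treatment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:            # matches the empty sequence
    pass


@dataclass(frozen=True)
class NotAllowed:       # matches nothing at all
    pass


@dataclass(frozen=True)
class Elem:             # matches one element with this name
    name: str


@dataclass(frozen=True)
class Seq:
    a: object
    b: object


@dataclass(frozen=True)
class Choice:
    a: object
    b: object


@dataclass(frozen=True)
class Star:
    p: object


def nullable(p):
    """True if p matches the empty sequence."""
    if isinstance(p, (Empty, Star)):
        return True
    if isinstance(p, (NotAllowed, Elem)):
        return False
    if isinstance(p, Seq):
        return nullable(p.a) and nullable(p.b)
    if isinstance(p, Choice):
        return nullable(p.a) or nullable(p.b)
    raise TypeError(p)


def seq(a, b):
    """Build a sequence, simplifying trivial cases."""
    if isinstance(a, NotAllowed) or isinstance(b, NotAllowed):
        return NotAllowed()
    if isinstance(a, Empty):
        return b
    if isinstance(b, Empty):
        return a
    return Seq(a, b)


def choice(a, b):
    """Build a choice, simplifying trivial cases."""
    if isinstance(a, NotAllowed):
        return b
    if isinstance(b, NotAllowed) or a == b:
        return a
    return Choice(a, b)


def deriv(p, name):
    """Pattern that matches whatever may follow `name` under p."""
    if isinstance(p, (Empty, NotAllowed)):
        return NotAllowed()
    if isinstance(p, Elem):
        return Empty() if p.name == name else NotAllowed()
    if isinstance(p, Choice):
        return choice(deriv(p.a, name), deriv(p.b, name))
    if isinstance(p, Seq):
        d = seq(deriv(p.a, name), p.b)
        return choice(d, deriv(p.b, name)) if nullable(p.a) else d
    if isinstance(p, Star):
        return seq(deriv(p.p, name), p)
    raise TypeError(p)


def matches(p, names):
    """Check a sequence of child element names against content model p."""
    for n in names:
        p = deriv(p, n)
    return nullable(p)


# ((a, b) | (a, c)) is ambiguous on the first 'a', yet matching just works.
ambiguous = Choice(Seq(Elem("a"), Elem("b")), Seq(Elem("a"), Elem("c")))
```

With this sketch, matches(ambiguous, ["a", "c"]) succeeds and
matches(ambiguous, ["a", "a"]) fails; both branches are simply carried
along as a choice until the input decides between them.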

> If you choose that method
> to construct your DFA, you will surely benefit from the rule in XML 1.0.
> But if you choose not to, detecting non-deterministic content models
> becomes an extra job.

But note that detecting ambiguity in XML content models is considerably
simpler than in SGML -- the really difficult part involves '&' groups
which aren't present in XML.
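
For element content without '&' groups, the check can be sketched along
the lines of the first/follow-set construction that the XML 1.0
appendix points to: number each element occurrence in the model, then
require that no two distinct occurrences of the same name compete in
the first set or in any follow set. The tuple encoding and the function
name below are assumptions made for the example, and mixed content,
EMPTY, and ANY are not handled.

```python
def is_deterministic(model):
    """Check a content model for determinism via first/follow sets.

    model is a nested tuple (a hypothetical encoding):
      ("elem", name), ("seq", a, b), ("choice", a, b),
      ("opt", a), ("star", a), ("plus", a)
    """
    symbols = []   # symbols[i] = element name at position i
    follow = []    # follow[i] = positions that may come right after i

    def walk(node):
        """Return (nullable, first positions, last positions)."""
        kind = node[0]
        if kind == "elem":
            symbols.append(node[1])
            follow.append(set())
            pos = len(symbols) - 1
            return False, {pos}, {pos}
        if kind in ("seq", "choice"):
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            if kind == "choice":
                return n1 or n2, f1 | f2, l1 | l2
            # Sequence: anything ending the left part can be followed
            # by anything starting the right part.
            for p in l1:
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind in ("opt", "star", "plus"):
            n, f, l = walk(node[1])
            if kind != "opt":
                # Repetition: the end of one iteration can be followed
                # by the start of the next.
                for p in l:
                    follow[p] |= f
            return (n or kind != "plus"), f, l
        raise ValueError(kind)

    _, first, _ = walk(model)
    # Deterministic iff no candidate set holds two positions with the
    # same element name.
    for candidates in [first] + follow:
        names = [symbols[p] for p in candidates]
        if len(names) != len(set(names)):
            return False
    return True


a, b, c = ("elem", "a"), ("elem", "b"), ("elem", "c")
bad = ("choice", ("seq", a, b), ("seq", a, c))    # ((a, b) | (a, c))
good = ("seq", a, ("choice", b, c))               # (a, (b | c))
```

Here is_deterministic(bad) is false because both occurrences of 'a' sit
in the first set, while the equivalent rewritten model `good` passes.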

--Joe English

  jenglish@flightlab.com

Follow-Ups:
- RE: should all XML parsers reject non-deterministic content models?
  - From: Danny Ayers <danny@panlanka.net>

References:
- should all XML parsers reject non-deterministic content models?
  - From: "TAKAHASHI Hideo(BSD-13G)" <hideo-t@bisd.hitachi.co.jp>
