OASIS Mailing List Archives

Re: [xml-dev] Use DTDs!

On Fri, 2016-04-29 at 17:40 +0000, Costello, Roger L. wrote:
> Hi Folks,
> Heavy-duty validation is not always needed. Sometimes all that is
> needed is to verify that XML instances are using the right set of
> tags.
> Let me state it stronger: I have observed that "verifying that XML
> instances are using the right set of tags" is how most developers
> view XML validation. Their XML Schemas are merely thinly veiled
> versions of DTDs. Developers opt to perform the heavy duty data
> checking in Java code and/or in a database.
> For many (most?) situations use DTDs, not XML Schemas. Here's why:
> 1. Less tools needed: Only one tool is needed - a validating XML
> processor. Conversely, if you use XML Schemas for validation, you
> need two tools - an XML processor plus an XML Schema validator. The
> less tools needed, the better.

That is like saying, use a bicycle not a moped, because a moped has
wheels and an engine and a bicycle has only the wheels. It's true that
more components means more things to go wrong, but choose the best tool
for the job.
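
For what it's worth, here is a minimal sketch of the "one tool" case: DTD validation using only the parser that ships with the JDK. The class name DtdCheck, the helper isValid and the little "note" vocabulary are made up for illustration, and the DTD sits in the internal subset so the example is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdCheck {
    // Hypothetical helper: true if the document is valid against the DTD
    // named (or embedded) in its own DOCTYPE declaration.
    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // ask the parser to do DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final boolean[] ok = {true};
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity errors are recoverable and are reported here.
            public void error(SAXParseException e) { ok[0] = false; }
            // Well-formedness errors are fatal; let them propagate.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return ok[0];
    }

    public static void main(String[] args) throws Exception {
        // Illustrative DTD in the internal subset.
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to, body)>"
                   + "<!ELEMENT to (#PCDATA)><!ELEMENT body (#PCDATA)>]>";
        System.out.println(isValid(dtd + "<note><to>Roger</to><body>Hi</body></note>"));
        System.out.println(isValid(dtd + "<note><subject>Hi</subject></note>"));
    }
}
```

This checks "the right set of tags" and their order, and nothing more.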

> 2. Less to read: If you stick to DTDs, you only have to read the XML
> specification, which is about 36 pages long. If you use XML Schemas,
> you have to read XML Schema Part 1, which is around 350 pages and XML
> Schema Part 2, which is around 100 pages.

Most people working with XML read neither of these. They read a book on
XML, some Stack Overflow questions, and the tool documentation.

> 3. Less complexity: DTDs are several orders of magnitude simpler than
> XML Schemas.

I hope you use a pocket calculator rather than a computer to do all
your word processing.

> 4. Less verbosity: The DTD syntax is streamlined and efficient (kind
> of analogous to XPath in terms of being streamlined and efficient).
> The XML Schema syntax, on the other hand, is bloated and inefficient.

"Bloated" is usually a pejorative but meaningless term. I'm not sure
what you mean by saying a syntax is inefficient. Unwieldy perhaps. But
it's readable without learning the arcane and bizarre DTD syntax. So
there are arguments on both sides here.

> 5. Robust validating tools: The capability of validating against a
> DTD has been around a long time, the tools are rock-solid.
> Comparatively, the capability of validating against an XML Schema has
> been around a short time, the tools are less rock-solid.

This one is nonsense, Roger. XML Schema first became a W3C
Recommendation in May 2001, 15 years ago. XML itself first became a W3C
Recommendation in February of 1998, roughly 18 years ago.

> 6. Inexpensive: Validating XML processors are either free or
> inexpensive. True, there are some free XML Schema validators, but
> some of the most popular XML Schema validators are quite pricy.

XML Schema for Java (part of the Java JDK) and xmllint in C are both
zero-dollar, robust tools, for example. Don't drive a Ford Fiesta because
a Porsche is expensive?
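
As a rough sketch of the zero-dollar route in Java, the following uses only the javax.xml.validation API from the JDK. The class name XsdCheck, the helper isValid and the one-element "quantity" schema are invented for the example, and treating any SAXException as "invalid" is a simplification.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XsdCheck {
    // Hypothetical helper: true if xml validates against the schema text in xsd.
    static boolean isValid(String xsd, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // simplification: malformed input is also reported as invalid
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative schema: a single element holding a positive integer.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                   + "<xs:element name='quantity' type='xs:positiveInteger'/>"
                   + "</xs:schema>";
        System.out.println(isValid(xsd, "<quantity>3</quantity>"));
        System.out.println(isValid(xsd, "<quantity>lots</quantity>"));
    }
}
```

The second document uses the right tag but the wrong kind of content, which is the sort of check a DTD cannot express.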

> 7. Suited to Architectural Forms

Even if architectural forms were in use with XML using the old DTD
processing instruction syntax, they can be placed in the internal
subset; XML requires that they be passed to the application regardless
of whether you are using "DTD" markup declarations to constrain your

In addition, it's possible (and I think not all that uncommon) to use
both a DTD and an XML Schema.

> 8. Infoset happiness: [John Cowan wrote:] There are a number of XML
> DTD features which affect the infoset returned by a compliant
> parser.  If they are in the internal subset, the parser MUST respect
> them; if they are in the external subset, then any parser that reads
> the external subset likewise MUST respect them.

I don't think the emotional stability of the XML information set is at
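
To make John's point concrete, here is a small sketch showing an attribute default declared in the internal subset turning up in the parsed document even though validation is off. The class name InfosetDefault, the helper statusOf, and the "item"/"status" names are made up for illustration; the parser is the JDK's default DocumentBuilder.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class InfosetDefault {
    // Hypothetical helper: value of the root element's "status" attribute,
    // or the empty string if the parser reports no such attribute.
    static String statusOf(String xml) throws Exception {
        // The factory is non-validating unless setValidating(true) is called.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getAttribute("status");
    }

    public static void main(String[] args) throws Exception {
        // Same element, with and without an internal subset declaring a default.
        String withSubset = "<!DOCTYPE item [<!ATTLIST item status CDATA \"draft\">]><item/>";
        String withoutSubset = "<item/>";
        System.out.println("[" + statusOf(withSubset) + "]");
        System.out.println("[" + statusOf(withoutSubset) + "]");
    }
}
```

The two inputs contain the same element markup, yet the application sees a status attribute only in the first, because the declaration in the internal subset changed what the parser reports.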

I use DTDs myself for simple applications, and, partly because I _like_
bizarre and arcane languages :-), can mostly remember the syntax,
despite some quirks e.g. around parameter entity expansion in general
entity values declared in the internal subset when the moon is full.

An advantage of an XML Schema over a DTD or even RNG schema can be type
assignment, which can be used in XQuery. Another can be, if you are
careful, an increased ability to process the schema with XSLT and other
XML tools.

For really simple cases a DTD will be shorter and DTDs are by no means
dead, but this set of arguments isn't particularly well made, I think.
You can do better! ;-)



Liam R. E. Quin <liam@w3.org>
The World Wide Web Consortium (W3C)
