OASIS Mailing List Archives

Re: [xml-dev] Wikipedia on XML

Elliotte Rusty Harold wrote:
> On Sun, Aug 23, 2009 at 2:22 PM, Michael Ludwig<milu71@gmx.de> wrote:
>> So given the rest is pretty useful and the DTD syntax and
>> functionality is really easy to learn and understand, why should it
>> have been a mistake to include this great bag of features in XML?
> The internal DTD subset has been a world of hurt for parser
> implementers. It's really what pushes XML over the edge out of
> the realm of the Desperate Perl Hacker.

Sorry to hear it hurt so much. On the other hand, did anybody
seriously expect the DPH to write his own parser?

When I came in touch with XML for the first time in 2001, I was a
novice DPH getting *horribly* bogged down in writing CGI scripts
with complicated 100-line subroutines; programming wasn't easy to
learn, and it took me a lot of effort. XML and DTD, on the other
hand, were easy and intuitive; and I quickly reached some (albeit
modest) level of productivity.

Of course, instead of writing my own parser, I used Expat or
other existing parsers.

The whole XML business got much more difficult and confusing (and
discouraging) when I read about this plethora of new-fangled X++
technologies growing up around XML. Why was all that necessary? Why
would I have to know or care? I got the impression that the simple
system XML+DTD wasn't good enough any more, was somehow deprecated.

> It makes parsers much more complex, and arguably slower. It also
> introduces some security issues that wouldn't otherwise be present.

Filesystem and network access? That would hold true for anything
accessing the filesystem and the network.

If speed is very important, I think a parser could be written so
that it switches to a fast, DTD-unaware bare-bones code path when
no DOCTYPE is present.
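
The dispatch idea above can be sketched roughly as follows. This is only
an illustration in Python, not how Expat or any real parser does it: the
names has_doctype and choose_parser are hypothetical, and the prolog scan
assumes the whole document is already in memory as a string.

```python
import re

# Items that may precede the DOCTYPE or the root element in the prolog:
# whitespace, comments, and processing instructions (the XML declaration
# has the same <? ... ?> shape, so it is covered too).
_PROLOG_MISC = re.compile(r"\s+|<!--.*?-->|<\?.*?\?>", re.DOTALL)


def has_doctype(xml_text: str) -> bool:
    """Return True if a DOCTYPE declaration appears before the root element."""
    # Skip an optional byte order mark.
    pos = 1 if xml_text.startswith("\ufeff") else 0
    while True:
        m = _PROLOG_MISC.match(xml_text, pos)
        if not m:
            break
        pos = m.end()
    # Whatever comes next is either the DOCTYPE or the root element.
    return xml_text.startswith("<!DOCTYPE", pos)


def choose_parser(xml_text: str) -> str:
    """Hypothetical dispatcher: name the code path a parser might take."""
    return "dtd-aware" if has_doctype(xml_text) else "bare-bones"
```

A DOCTYPE-like string inside a comment is skipped along with the
comment, so only a real declaration in the prolog selects the DTD-aware
path.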

> Were we starting over today, I would argue strongly in favor of
> eliminating the internal DTD subset entirely and leaving the
> definition of the schema language outside the spec so that the
> DOCTYPE could point to schemas in different languages which
> parser vendors would be free to implement or not as they chose.

Precisely why the internal DTD subset should be such a problem,
I don't understand. Because it cannot be ignored? Complexity,
slowness and security should result from the external subset in
the same way, shouldn't they?
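
On the "cannot be ignored" question, a small Python sketch may help. It
assumes the standard-library xml.etree.ElementTree module (which is
backed by Expat); the entity name "greeting" and the helper parse_text
are made up for illustration. An entity declared in the internal subset
has to be read for the document content to be usable at all:

```python
import xml.etree.ElementTree as ET

# The same content, once with and once without the internal DTD subset
# that declares the entity it references.
WITH_SUBSET = '<!DOCTYPE note [<!ENTITY greeting "hello">]><note>&greeting;</note>'
WITHOUT_SUBSET = '<note>&greeting;</note>'


def parse_text(xml_text):
    """Return the root element's text, or None if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        # Raised, for example, when an entity reference has no declaration.
        return None
```

With the subset, the reference is expanded from the declaration; without
it, the parser rejects the document because the entity is undefined. So
even a non-validating parser has to understand at least this part of the
DTD syntax.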

Making the DOCTYPE work with multiple schemas sounds reasonable
to me. Also, the DTD could surely be enhanced to accommodate new

For historical reasons, the DTD is here; it's a legacy. That
doesn't have to be bad. It could also be considered a useful
extension point for XML.

Michael Ludwig
