OASIS Mailing List Archives


Xml Revisited


I found the "tagged data format" aspects of XML very straightforward, but
found CDATA and PCDATA, most of DTDs and the explanation of Entities very
strange.  Character references were also straightforward whether defined
internally, externally or built in.

I came from a background in C, databases and programming languages, and had
barely heard of SGML except as something related to HTML.

When the X++ stuff came out I thought - finally someone who speaks my
language - you have a choice between RelaxNG and Xml Schema, there are types
if you want them and rich primitive data types.

Namespaces were also fairly understandable.

I get the feeling that for people who came out of SGML the way of explaining
much of Xml in the W3C recommendation was totally straightforward - but for
me there was little synergy with terms I already knew.

I kept wondering how something so simple could use such convoluted terms.
An entity to me was something in Entity Relationship modeling.  A file was
something you included.  A compiland (Pascal) was something you imported -
or a package in Java.

So, on a topic here called "Xml Revisited" I am sure people would have lots
to say.

Also, the name Extensible Markup Language is a misnomer.  XML is not a
language but a general meta grammar for creating any number of "languages".

On the Wiki article, however, we have to stick to the current Xml


-----Original Message-----
From: Michael Ludwig [mailto:mlu@as-guides.com] 
Sent: Monday, August 24, 2009 2:08 AM
To: XML Developers List
Subject: Re: [xml-dev] Wikipedia on XML

Elliotte Rusty Harold wrote:
> On Sun, Aug 23, 2009 at 2:22 PM, Michael Ludwig<milu71@gmx.de> wrote:
>> So given the rest is pretty useful and the DTD syntax and
>> functionality is really easy to learn and understand, why should it
>> have been a mistake to include this great bag of features in XML?
> The internal DTD subset has been a world of hurt for parser
> implementers. It's really what pushes XML over the edge out of
> the realm of the Desperate Perl Hacker.

Sorry to hear it hurt so much. On the other hand, did anybody
seriously expect the DPH to write his own parser?

When I first came in touch with XML in 2001, I was a
novice DPH getting *horribly* bogged down in writing CGI scripts
with complicated 100-line subroutines; programming wasn't easy to
learn, and it took me a lot of effort. XML and DTD, on the other
hand, were easy and intuitive; and I quickly reached some (albeit
modest) level of productivity.

Instead of writing my own parser, of course, I used Expat or other
parsers. I never wrote my own.

The whole XML business got much more difficult and confusing (and
discouraging) when I read about this plethora of new-fangled X++
technologies growing up around XML. Why was all that necessary? Why
would I have to know or care? I got the impression that the simple
system XML+DTD wasn't good enough any more, was somehow deprecated.

> It makes parsers much more complex, and arguably slower. It also
> introduces some security issues that wouldn't otherwise be present.

Filesystem and network access? That would hold true for anything
accessing the filesystem and the network.

If speed is very important, I think that a parser could be written
so as to proceed to a speedy DTD-unaware bare-bones implementation
when there is no DOCTYPE present.
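
To make the dispatch idea above concrete, here is a minimal sketch in
Python. The function names and the two labels are hypothetical, invented
only for illustration; no real parser is claimed to work this way. The
check is deliberately conservative: the string "<!DOCTYPE" could also
turn up inside a comment or a CDATA section, but a false positive merely
selects the slower path, which is still correct.

```python
def has_doctype(document: str) -> bool:
    """Conservative test for a DOCTYPE declaration.

    A false positive (the string appearing inside a comment or CDATA
    section) only costs speed, because the DTD-aware path handles
    every document correctly.
    """
    return "<!DOCTYPE" in document


def choose_parser(document: str) -> str:
    """Return a label for the code path a parser might take.

    Both labels are hypothetical placeholders for real implementations.
    """
    return "dtd-aware" if has_doctype(document) else "bare-bones"


plain = "<greeting>hi</greeting>"
with_dtd = '<!DOCTYPE greeting [<!ENTITY e "hi">]><greeting>&e;</greeting>'

print(choose_parser(plain))     # bare-bones
print(choose_parser(with_dtd))  # dtd-aware
```

The cost of such a dispatch is one scan for the DOCTYPE before the
actual parse starts.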

> Were we starting over today, I would argue strongly in favor of
> eliminating the internal DTD subset entirely and leaving the
> definition of the schema language outside the spec so that the
> DOCTYPE could point to schemas in different languages which
> parser vendors would be free to implement or not as they chose.

Precisely why the internal DTD subset should be such a problem,
I don't understand. Because it cannot be ignored? Complexity,
slowness and security should result from the external subset in
the same way, shouldn't they?
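
One way to see why the internal subset "cannot be ignored" is a small
sketch using Python's standard xml.etree.ElementTree, which sits on top
of Expat. The two sample documents and the helper name are assumptions
made up for this illustration. The same element content parses only when
the internal subset that declares the entity has been read:

```python
import xml.etree.ElementTree as ET

# Identical element content; only the first document declares &who;
# in an internal DTD subset.
WITH_SUBSET = '<!DOCTYPE note [<!ENTITY who "world">]><note>hello &who;</note>'
WITHOUT_SUBSET = '<note>hello &who;</note>'


def parse_text(document):
    """Return the root element's text, or None if the parse fails."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        # Raised here because &who; is undefined without the subset.
        return None


print(parse_text(WITH_SUBSET))
print(parse_text(WITHOUT_SUBSET))
```

A parser that skipped the internal subset would have to reject the first
document or leave the reference unexpanded, so the text a reader sees
depends on declarations carried inside the document itself.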

Making the DOCTYPE work with multiple schemas sounds reasonable
to me. Also, the DTD could surely be enhanced to accommodate new

For historical reasons, the DTD is here; it's a legacy. That
doesn't have to be bad. It could also be considered a useful
extension point for XML.

Michael Ludwig


XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
