
   RE: FW: Namespaces and DTDs

  • From: <Marc.McDonald@Design-Intelligence.com>
  • To: <cbullard@hiwaay.net>, <martind@netfolder.com>
  • Date: Fri, 12 Mar 1999 18:10:44 -0800

It's quite true that you can have XML that does not require validation,
and that this is commonly done. One exception is attribute defaulting:
default values for attributes are declared in the DTD, as has been
mentioned in another reply.
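
As a rough sketch of that exception: a non-validating parser that reads the internal DTD subset can still report default attribute values. The example below assumes Python's xml.parsers.expat binding. The "lang" attribute and its default are hypothetical and are not part of the book example in this thread.

```python
import xml.parsers.expat

# Hypothetical document: the ATTLIST declaration gives <book> a default
# "lang" attribute that the document instance itself never writes out.
DOC = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ATTLIST book lang CDATA "en">
]>
<book title="tale of 2 cities"/>"""


def collect_attributes(text):
    """Parse without validation and return {element name: attributes}."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def start_element(name, attrs):
        # attrs holds the attributes reported by the parser for this element.
        seen[name] = dict(attrs)

    parser.StartElementHandler = start_element
    parser.Parse(text, True)
    return seen


attributes = collect_attributes(DOC)
print(attributes["book"])
```

Nothing here validates the document; the parser only uses the declaration to fill in the missing attribute.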

You can construct a DOM without validation, but the next step ends up 
being a procedural implementation of picking apart the DOM document 
tree to construct whatever structure the application using DOM 
requires to interpret the document.

I can parse:
  <book title="tale of 2 cities">
    <chapter>
      <para>..</para>
    </chapter>
    <chapter>
        ...
    </chapter>
      ...
  </book>
without a DTD.

But if my application needs to get the information out of the DOM I 
need to write code to:
  Create a representation for Book consisting of a title and chapters,
    and get the book from the DOM
  Create a representation for each Chapter and get the chapters from the DOM
  Create a representation for each paragraph in a chapter and get the
    paragraphs from the DOM
Part of this is what is expressed in the DTD. Wouldn't it be better if 
a system were created that used the DTD on the receiving end to create 
the application representation instead of serializing it back into 
elements and constructing a new tree?
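
To make the hand-written step concrete, here is a minimal sketch in Python using xml.dom.minidom. The Book and Chapter classes and the helper names are assumptions made for illustration; the point is only that this code repeats, procedurally, the structure a DTD would declare.

```python
from dataclasses import dataclass, field
from typing import List
from xml.dom import minidom


@dataclass
class Chapter:
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Book:
    title: str
    chapters: List[Chapter] = field(default_factory=list)


def child_elements(node, tag):
    """Return the direct child elements of node with the given tag name."""
    return [
        child for child in node.childNodes
        if child.nodeType == child.ELEMENT_NODE and child.tagName == tag
    ]


def text_of(node):
    """Concatenate the text nodes directly under node."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def book_from_dom(document):
    """Pick the DOM tree apart into the application's own structure."""
    root = document.documentElement
    book = Book(title=root.getAttribute("title"))
    for chapter_node in child_elements(root, "chapter"):
        paragraphs = [text_of(p) for p in child_elements(chapter_node, "para")]
        book.chapters.append(Chapter(paragraphs))
    return book


SAMPLE = """<book title="tale of 2 cities">
  <chapter><para>first</para><para>second</para></chapter>
  <chapter><para>third</para></chapter>
</book>"""

book = book_from_dom(minidom.parseString(SAMPLE))
print(book)
```

If the element names or nesting change, both the DTD and book_from_dom have to change with them, which is the synchronization problem described in the quoted comment below.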

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  Didier PH Martin [SMTP:martind@netfolder.com]
Sent:  Friday, March 12, 1999 5:20 PM
To:  Marc McDonald; cbullard@hiwaay.net
Cc:  xml-dev@ic.ac.uk
Subject:  RE: FW: Namespaces and DTDs

Hi Marc,

<YourComment>
Actually there is another representation of the information in the DTD
that is present: the application that uses the document. Unfortunately
the representation is in C++, Java or some other language. This
introduces a synchronization problem between the two.

The DOM API, for instance, gives you access to the parsed document tree,
but a sizable amount of independent code must be written to
essentially parse the DOM tree into the form the application needs.
The result is that the structure is in 2 different forms, declarative and
procedural, which must be kept in sync.
</YourComment>

<Reply>
You are right, but I can construct a DOM without any validation. The whole
point here is: if I need validation at the receiving end, why not use SGML,
which is more elaborate and necessarily needs validation (because of the
possibility of omitted tags)? If, however, we do not need validation at the
receiving end, then we are better off using XML, which, because of its
structure, can be parsed without validation, and then a DOM can be created
for procedural language consumption.

But you are right to say that from the serialized format I have to construct
a model (i.e. a structure) that interpreters can access. The DOM is the XML
way to do it and the grove is the SGML way (the DOM and grove concepts are
similar enough to reduce one to the other).

To become useful, the XML life cycle could be expressed like this:

a) XML format creation: we need a DTD, so that the editor can validate the
document or simply prevent me from creating an invalid document.
b) Transport.
c) Receiving end: interpretation. The interpreter needs a parser. A
validating parser is not necessary with XML. It seems that we have several
kinds of parsers:
	1- event driven
	2- function call within a loop
	3- DOM producer
d) The interpreter knows the semantics and does something.
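
The three kinds of parser listed under (c) can be sketched with Python's standard library; none of them validates. This is only an illustration: the choice of xml.sax for the event-driven style, ElementTree.iterparse as the nearest stand-in for "function call within a loop", and minidom as the DOM producer are assumptions, as is the sample document.

```python
import io
import xml.sax
from xml.dom import minidom
from xml.etree import ElementTree

DOC = (b"<book title='tale of 2 cities'>"
       b"<chapter><para>..</para></chapter><chapter/></book>")


class ChapterCounter(xml.sax.ContentHandler):
    """1- event driven: the parser calls back into the handler."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "chapter":
            self.count += 1


def count_event_driven(data):
    handler = ChapterCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_pull_loop(data):
    """2- function call within a loop: the application pulls each event."""
    count = 0
    events = ElementTree.iterparse(io.BytesIO(data), events=("start",))
    for _event, element in events:
        if element.tag == "chapter":
            count += 1
    return count


def count_dom(data):
    """3- DOM producer: build the whole tree first, then query it."""
    document = minidom.parseString(data)
    return len(document.getElementsByTagName("chapter"))


print(count_event_driven(DOC), count_pull_loop(DOC), count_dom(DOC))
```

All three read the same well-formed document without a DTD; they differ only in who drives the traversal and whether a tree is kept in memory.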

In fact, XML rules do not convey semantics, only syntax. XPointer and XLink
are domain-specific languages that add a semantic layer to XML. XHTML also.

In fact, all these concepts existed in the SGML world. What we gained
with XML compared to SGML is simpler parsing rules. So simple that
validation is no longer necessary to do a complete parsing operation. The
SGML syntax is trickier because you need to tell the parser that some
markup has no end tag; hence the need for a DTD, whose main function is to
tell the parser some parsing rules, like where a tag begins and ends. So,
because of the "well formed" constraint, we gained that parsers now do not
need a DTD to accomplish their task; the rule is clear on how markup
begins and ends.
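
A small sketch of that point, assuming Python's xml.etree.ElementTree as the non-validating parser (the function name is hypothetical): with no DTD at all, the parser can still decide whether every tag is properly closed.

```python
from xml.etree import ElementTree


def is_well_formed(text):
    """Return True if text parses as well-formed XML; no DTD is consulted."""
    try:
        ElementTree.fromstring(text)
    except ElementTree.ParseError:
        return False
    return True


GOOD = "<book><chapter><para>..</para></chapter></book>"
# The second <para> is never closed, so </chapter> arrives too early.
BAD = "<book><chapter><para>..<para></chapter></book>"

print(is_well_formed(GOOD), is_well_formed(BAD))
```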

My conclusion:
What we gained with XML is the fact that a parser does not need to do
validation. Otherwise it is only a matter of changing the extension of an
SGML document to XML, going from "mydocument.sgml" to "mydocument.xml"
without really changing anything except some minor modifications in the DTD
declaration. That may be good for marketing reasons but surely not for
technical reasons.

</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com


xml-dev: A list for W3C XML Developers. To post, 
mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on 
CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following 
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)








 


Copyright 2001 XML.org. This site is hosted by OASIS