   Re: Namespace handling in XML Processors

  • From: "W. Eliot Kimber" <eliot@isogen.com>
  • To: veda <m.vedachalam@tatainfotech.com>
  • Date: Tue, 31 Aug 1999 13:50:38 +0100

veda wrote:
> Hai all....
> Could someone tell me what is
> internal subset, external subset, Namespaces in XML.
> Is there any website which discusses them and the XML terms in general.

An XML document consists of three parts. Only the document instance is
strictly required; the XML declaration is recommended but optional in
XML 1.0, and the DOCTYPE declaration can be left out entirely:

1. The XML declaration: <?xml version="1.0"?>
2. The DOCTYPE declaration (which can be omitted)
3. The document instance (the tag stuff)
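
For illustration, here is a minimal sketch of a document with all three
parts present. The element name 'greeting' and its content are made up
for this example:

```xml
<?xml version="1.0"?>
<!-- Part 1 is the XML declaration on the first line -->
<!-- Part 2: the DOCTYPE declaration -->
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
<!-- Part 3: the document instance -->
<greeting>Hello, world</greeting>
```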

The DOCTYPE declaration declares the name of the root element (the
"document type") and declares the element types used in the instance. It
also contains declarations of any entities used by the document
(entities are either string macros or files (technically, abstract
storage objects)).

Logically the DOCTYPE declaration is a flat list of element type,
attribute list, and entity declarations (and notation declarations, but
I haven't talked about those). Physically, the declaration can be
organized into two parts, one in the document's main file (the "document
entity", that is, the file that contains the XML declaration, the
DOCTYPE declaration (if there is one), and the root element) and one in
an external file.

Because these two parts of the DOCTYPE declaration make up the larger
whole, they are both subsets. The one inside the document is the
"internal" subset and the one outside the document is the "external"
subset.

A typical DOCTYPE declaration looks like this:

<!DOCTYPE foo SYSTEM "myexternalsubset.dtd" [
  <!-- This is the internal subset -->
  <!ELEMENT foo (#PCDATA) ><!-- Declaration of element type 'foo' -->
]>

The external subset is the file named by the filename following the
SYSTEM keyword. 
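
As a sketch of what such an external file might contain (the
declarations below are invented for illustration, not taken from any
real DTD), myexternalsubset.dtd would hold nothing but declarations, with
no DOCTYPE wrapper of its own:

```xml
<!-- myexternalsubset.dtd: hypothetical contents -->
<!ATTLIST foo status (draft|final) "draft">
<!ENTITY company "Example Corp">
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
```

Together with the internal subset, these make up the one flat list of
attribute list, entity, and notation declarations for the document.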

While logically there is no difference between the internal and external
subsets, XML defines slightly different rules for how the internal and
external subsets must be processed. This reflects common (although
misguided, IMNSHO) practice and makes things easier for processors that
don't do validation and therefore don't need to process the external
subset.

One key rule about internal and external subsets is that the internal
subset is always parsed *before* the external subset, which means that
any entities declared in the internal subset will take precedence over
entities with the same name declared in the external subset. This allows
a weak form of modularization of declaration sets: external DTD subsets
can provide entities that are intended to be redeclared in internal
subsets.
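
To make the override concrete, here is a sketch with made-up names
(memo.dtd, the 'company' entity, and both replacement strings are
assumptions for illustration). Suppose the external subset supplies a
placeholder value and the document redeclares it:

```xml
<?xml version="1.0"?>
<!DOCTYPE memo SYSTEM "memo.dtd" [
  <!-- Assume the hypothetical memo.dtd contains these two declarations:
         <!ELEMENT memo (#PCDATA)>
         <!ENTITY company "Your Company Name Here">
       The declaration below is read first, so it is the one that binds. -->
  <!ENTITY company "Example Corp">
]>
<memo>Sent on behalf of &company;.</memo>
```

A processor that reads both subsets should expand &company; to
"Example Corp", since the later declaration of the same name in memo.dtd
is ignored.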



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)