OASIS Mailing List Archives


   Entity support in XML parsers (was: UTF-8+names)

Bob Foster scripsit:

> XML does not require parsers to support entities apart from validation

ObMemeStomping:  This is not true.  XML processors, validating or not,
must support internal entities that are defined in the internal subset.
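
A small illustration of the point, using Python's standard-library
ElementTree (which sits on the non-validating expat parser).  The document
and the entity name "greeting" are made up for this sketch:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one internal entity declared in the internal subset.
DOC = """<!DOCTYPE doc [
  <!ENTITY greeting "hello">
]>
<doc>&greeting;, world</doc>"""

# The parser does no validation, yet the entity reference is expanded.
root = ET.fromstring(DOC)
print(root.text)  # hello, world
```

No external DTD is fetched and nothing is validated; the replacement text
comes solely from the declaration in the internal subset.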

Something I've long wanted to see is a tool that takes a full DTD
and squashes it down to the bare minimum required for use as an internal
subset to preserve all DTD infoset effects but *not* validity.  In

1) Expand all parameter entities and eliminate all parameter entity

2) Eliminate all attribute declarations that are CDATA and either #IMPLIED

3) Eliminate all element declarations that are ANY, EMPTY, or mixed content,
   and simplify all element-content ones to <!ELEMENT foo (foo)> or
   something of the sort
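
To make "infoset effects but not validity" concrete, here is a rough sketch
of what a squashed internal subset might look like to a non-validating
parser.  It again assumes Python's ElementTree over expat; the attribute
name "kind" and its default value are invented for the example:

```python
import xml.etree.ElementTree as ET

# Hypothetical squashed subset: a simplified element declaration in the
# style of item 3, plus one attribute declaration that carries a default.
DOC = """<!DOCTYPE foo [
  <!ELEMENT foo (foo)>
  <!ATTLIST foo kind CDATA "memo">
]>
<foo>plain text, which the content model above does not allow</foo>"""

root = ET.fromstring(DOC)

# Infoset effect: the defaulted attribute appears although the start-tag
# does not specify it.
print(root.get("kind"))

# No validity checking: the content violates <!ELEMENT foo (foo)> but the
# document still parses.
print(root.text)
```

The attribute default reaches the application, while the content model is
never enforced, which is the behaviour the squashed DTD is meant to keep.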

A reasonable implementation strategy would be to start with James Clark's
DTDinst program (http://www.thaiopensource.com/relaxng/dtdinst) and
then use XSLT (with text output mode) to generate the new DTD.

Anyone interested in tackling it?

In politics, obedience and support      John Cowan <jcowan@reutershealth.com>
are the same thing.  --Hannah Arendt    http://www.ccil.org/~cowan


Copyright 2001 XML.org. This site is hosted by OASIS