XML.org
OASIS Mailing List Archives

Re: [xml-dev] Auto-generate a DTD from multiple XML documents?


> Several of us involved with Distributed Proofreaders and Project
> Gutenberg are analyzing a number of TEI documents representing PG
> etexts.

If you know that the documents are all valid against the full TEI DTD
then you are in a much stronger position than just trying to infer a DTD
from a set of instances. You can, essentially, just use a simple XPath
expression (or just Perl, probably) to get a list of the element and
attribute names used in your instances, then take the TEI DTD and
delete any references to elements not used in your instances. You may
have to make a few small manual changes to keep the resulting grammar
deterministic, but basically you are done.
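
As a rough illustration of the "list of element and attribute names"
step, here is a minimal sketch in Python rather than XPath or Perl. The
function name collect_names and the choice to pass documents in as
strings are assumptions made for the example, not anything from the TEI
tools. It also assumes the instances parse without needing the external
DTD (no undeclared entities) and use no namespaces.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_names(documents):
    """Return {element name: set of attribute names} seen across all documents.

    `documents` is an iterable of XML strings (a hypothetical input shape;
    use ET.parse(path).getroot() instead of ET.fromstring to read files).
    """
    used = defaultdict(set)
    for text in documents:
        root = ET.fromstring(text)
        # iter() walks the root element and every descendant element.
        for elem in root.iter():
            if not isinstance(elem.tag, str):
                continue  # defensive: skip comment or PI nodes if present
            # Updating a set from a dict adds the dict's keys,
            # i.e. the attribute names that appear on this element.
            used[elem.tag].update(elem.attrib)
    return dict(used)
```

Any element declared in the TEI DTD that does not appear as a key in
the result is a candidate for deletion. Note that only attributes
actually written in the instances are recorded; attributes defaulted by
the DTD will not show up, since this parser does not read the DTD.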

Alternatively perhaps you could just get the TEI Pizza chef to bake you a
small DTD
http://www.tei-c.org/pizza.html

David





Copyright 1993-2007 XML.org. This site is hosted by OASIS