- From: Jon Noring <noring@netcom.com>
- To: xml-dev@xml.org
- Date: Tue, 7 Mar 2000 23:06:40 -0800 (PST)
"Frank Boumphrey" <bckman@ix.netcom.com> wrote:
>David Megginson <david@megginson.com> wrote:
>> I would highly recommend either (a) starting with TEI or TEI-lite and
>> removing stuff, or (b) starting with XHTML and adding stuff. The Text
>> Encoding Initiative, in particular, has spent a lot of time working
>> out the kinds of problems you'll be facing (including encoding drama
>> and verse of all descriptions), and several e-text archives, including
>> the Oxford Text Archive, already use TEI. Alternatively, XHTML is
>> well known, and you could add markup from a Gutenberg namespace (or
>> even from a TEI Namespace).
>>
>> DocBook is not really suitable for working with literature, though it
>> is (obviously) fantastic for technical documentation.
>I think TEI is a great DTD, but it is research orientated and IMO it is not
>really suitable for mass markup of popular texts. Most people find the
>initial learning curve quite steep (I know I did, much steeper than DocBook
>which I found almost intuitive) even though when one has gone through that
>curve it is quite easy to use.
>
>With Gutenberg our primary objective is getting books marked up in XML using
>a simple but sound schema, rather than using a definitive academic schema.
>Having said that, no one would be more delighted than myself if we got a
>corpus of works marked up in TEI....
As I continue to explore and study the needs of the upcoming e-book
revolution, I have concluded that developing a basic, general, fairly
compact, structurally-oriented e-book publishing DTD makes a lot of sense.
If designed right, it should be able to structure for presentation, say,
80-90% of all future e-books (most of which will have simple layouts). The
remaining fraction would either need a "plug-in" DTD module to augment the
basic DTD or use a different DTD altogether.
Such a DTD would allow the e-book industry to simplify, share style sheets,
develop fairly simple authoring tools built around that DTD, etc., and so
enable a unified approach to authoring, publishing, distribution and
rendering of e-books.
This e-book DTD, although structurally-oriented, would essentially identify
the structure that is important for presentational purposes, and not for
"scholarly" and other non-presentational purposes (although a means of
allowing non-presentational structuring in a universally recognized way
must be included as part of the "spec").
I envision this DTD easily allowing, as I mentioned above, modular DTD
"plug-ins" (which would hopefully themselves be standardized) for more
complex structuring in specialized applications. (Whether this can be done
in an intelligent and overall simple way, I don't know, so I defer to the
experts on this mailing list, many of whom have years of experience building
various types of text and publishing DTDs.)
And of course, this DTD would be a completely open specification, hopefully
sponsored by Open eBook (OEB) or a similar organization devoted to open
standards.
Now, assuming that what I have said so far makes any sense and is not a
quixotic quest or the ramblings of a madman, the first step is obviously to
establish a clear vision statement and a set of principles/goals that meet
that vision.
Once this is completed, I really think that the DTD should fall into place
fairly quickly. One could instead just jump into it, put together an ad-hoc
one (I could do that in a few hours), and then tweak that until it "looks"
right, but that is not the way to do it. In my opinion, one must first write
and agree on a vision statement and a set of principles/goals, which will
then guide every group decision about how to build the DTD.
To jump the gun, though, one will likely ask the first important question:
"Should we start with one of the established 'publishing' DTDs, extract a
subset (à la Pizza Chef for TEI), and then work from that, or should we only
refer to them for inspiration, take the 'best' from all of them, and build a
new DTD from scratch that meets our vision statement?"
This will not be an easy question to answer, and there will likely be
advocates of one side or another, and each will give good reasons. There
will also be philosophical differences, for example over high-level
structure. The publishing DTDs I'm aware of include TEI (and TEI-Lite and
even the "Bare Bones TEI"), DocBook, and ISO 12083. Have I missed any others?
I mention high-level structure since my perusal of TEI, DocBook and ISO
12083 shows they take quite different approaches to it. TEI actually uses a
fairly simple one, with a generic "div" for each level of the higher-level
structure. One can use the "type" attribute to define what each "div"
represents (e.g., chapter, section, subsection, etc.).
For e-book presentation, if one uses the TEI "div" approach in the e-book
publishing DTD, one will have to predefine a few allowable values of the
"type" attribute (for the front, body and back matter) to make it easier
to share style sheets industry-wide and to build simple authoring tools.
(Again, I don't know if this is ultimately possible, but industry-wide
interchangeable style sheets are a design goal by my reckoning, and again
I defer to the long-time experts here.)
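To make the "predefined type values" idea a bit more concrete, here is a
rough sketch in Python of how an authoring tool might flag "div" elements
whose "type" falls outside an agreed list. The list of allowed values, the
sample document and the function name are all my own placeholders for
illustration, not anything taken from TEI or from an existing e-book DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of agreed "type" values; no such list exists yet.
ALLOWED_DIV_TYPES = {"titlepage", "preface", "chapter", "section", "appendix"}

# Simplified, made-up sample using generic <div> elements for every level.
SAMPLE = """<text>
  <front><div type="titlepage"/></front>
  <body>
    <div type="chapter"><div type="section"/></div>
    <div type="canto"/>
  </body>
  <back><div type="appendix"/></back>
</text>"""


def unknown_div_types(xml_text, allowed=ALLOWED_DIV_TYPES):
    """Return the "type" values of <div> elements that are not in `allowed`.

    A <div> with no "type" attribute is reported as None.
    """
    root = ET.fromstring(xml_text)
    return [div.get("type") for div in root.iter("div")
            if div.get("type") not in allowed]


if __name__ == "__main__":
    # Lists every div type a shared style sheet would not know about.
    print(unknown_div_types(SAMPLE))
```

In a real DTD the same restriction could presumably be expressed as an
enumerated attribute type, so that a validating parser does this check
itself; the script only illustrates what the shared vocabulary buys us.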
I could go on and on as I've thought a lot about this for a while, but this
is more than enough to chew on already. And this hopefully will build upon
or augment the discussion now occurring about HWG's Project Gutenberg Texts
XML tagging project.
Comments? Criticisms? Anybody interested in working on this project if
it does make sense?
Jon Noring
Yomu
***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************