   Re: [xml-dev] XML-appropriate editing data structures

On Fri, 2004-04-09 at 01:46, Elliotte Rusty Harold wrote:

> Because it is very common to need to create, either temporarily or 
> permanently, invalid content. For instance, when I was writing 
> Processing XML with Java in DocBook I typed many xinclude:include 
> elements that were not provided for by the DocBook DTD.  In other 
> cases I've added custom elements and attributes that are not supplied 
> by the DTD.

So, what if a company has 150 authors, and each of those authors is
allowed to implement their own "useful" tag sets?

Allowing authors to add tags not supported by the DTD (and therefore not
by the processing systems that are designed to support what the DTD
supports, nothing less, nothing more) renders the use of XML pointless.

There are enough problems getting authors to use what the DTD does
provide in a correct manner.

The tools I work with keep tags and structure on a tight rein, which I am
grateful for. However, I can provide another example of the consequences
of tag abuse:

Not too long ago I encountered a case where an author was supposed to
use a procedure structure that looked a bit like this:

<!ELEMENT procedure ((p|list|table)*, (step|bridge)*)>
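To make that content model concrete, here is a rough Python sketch (not
part of any real toolchain) that checks the direct children of a
procedure element against it. The regular expression is a hand
translation of the declaration above, and the function name
matches_procedure_model is made up for illustration. A real system would
rely on a validating parser instead.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of ((p|list|table)*, (step|bridge)*).
# Each child name is followed by one space so names cannot run together.
PROCEDURE_MODEL = re.compile(r"(?:(?:p|list|table) )*(?:(?:step|bridge) )*")


def matches_procedure_model(xml_text):
    """Return True if the root's direct children fit the content model.

    Only child element names are checked; text and attributes are ignored.
    """
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in root)
    return PROCEDURE_MODEL.fullmatch(child_names) is not None


# Introductory material first, then steps: fits the model.
print(matches_procedure_model(
    "<procedure><p>Intro</p><step>One</step><step>Two</step></procedure>"))

# A paragraph after a step: does not fit the model.
print(matches_procedure_model(
    "<procedure><step>One</step><p>Late note</p></procedure>"))
```

Note that a procedure containing only p, list, or table children also
passes, because (step|bridge)* may match nothing. The DTD checks
structure, not whether the author picked the right element for the job.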

The author promptly proceeded to write a procedure like this:


One consequence was that instead of writing procedure steps with
illustrations like this:


the author wrote them like this:


Imagine the consequences when steps are supposed to be reusable in a
DMS, or the formatting rules require a two column layout for procedure
steps with images to the left and text to the right.

This document contained lots of other markup horrors, too many to list.

The document wasn't authored by this writer alone, but instead of
correcting the mistakes, others started participating in the insanity.
The document was then reviewed, and passed because the reviewers "don't
like to look at tags". Following the review, the document was sent to
translation, so the markup errors were propagated from the DMS to the
translation database.

If we had not caught it when we did (which was entirely by chance), this
document, and others like it, could have caused serious production
delays. The cost could have run into millions of Euros. (The document
cannot be formatted properly for printing. It is also likely that it
would mess up an online help system. Using it as is, is simply not an
option.)

If the authors had been allowed to invent their own tags as they went
along, the problems could have been even worse...

Of course I realize that when you insert an XInclude tag, you know what
you are doing, but XML editors should not be targeted primarily at XML
experts; they should be targeted at writers. Most would not know
XInclude from dexitroboping, and there is no reason they should. (Unless
they have to write about dexitroboping.)

In an industrial setting the thing to do would be to modify the DocBook
DTD so that it contains the xinclude element. That is also the solution
I would choose if I were writing something for personal use. That way I
can have the editor assist me with the xinclude element just as with the
DocBook elements. If I use the xinclude element more than once, it is
worth the effort. If it really were a one-time thing, I might have
included the element in the internal DTD subset. More likely, I would
have tried to find a workaround, not using xinclude at all.
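As a rough illustration of the internal-subset route, the Python sketch
below uses a toy DTD (not the real DocBook DTD) whose internal subset
declares an xi:include element, and then lists any elements the document
uses that were never declared. The document, the element name mytip, and
the function name undeclared_elements are all invented for this example.
This is only a declared-versus-used comparison, not DTD validation.

```python
import xml.parsers.expat

# Toy document: the internal DTD subset declares xi:include explicitly,
# while <mytip> is an author invention with no declaration at all.
DOCUMENT = """<!DOCTYPE chapter [
  <!ELEMENT chapter (title, (para | xi:include)*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
  <!ELEMENT xi:include EMPTY>
  <!ATTLIST xi:include
      xmlns:xi CDATA #FIXED "http://www.w3.org/2001/XInclude"
      href     CDATA #REQUIRED>
]>
<chapter>
  <title>Outline</title>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="part1.xml"/>
  <mytip>Not declared anywhere</mytip>
</chapter>"""


def undeclared_elements(xml_text):
    """Return sorted names of elements used but not declared in the
    internal DTD subset. External DTD subsets are not loaded."""
    declared = set()
    used = set()
    parser = xml.parsers.expat.ParserCreate()  # no namespace processing
    # Called once per <!ELEMENT ...> declaration in the internal subset.
    parser.ElementDeclHandler = lambda name, model: declared.add(name)
    # Called once per start tag in the document body.
    parser.StartElementHandler = lambda name, attrs: used.add(name)
    parser.Parse(xml_text, True)
    return sorted(used - declared)


print(undeclared_elements(DOCUMENT))
```

Here xi:include is not reported because it has a declaration, whereas the
invented mytip element is.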

> And in other cases, I do want valid markup. I just don't want it yet. 
> I'm not ready to fill in everything that's required. For instance, I 
> might begin a book by typing out an outline as section titles, 
> without actually giving the sections any content, though that is 
> ultimately required. But I can probably put together an outline in 
> a day, even though filling in the content may take a year, as long as 
> the editor doesn't keep bugging me about the missing parts.

But that is not really an editor problem. It happens only if you have a
bad authoring DTD.

I have seen authoring systems built around production or exchange DTDs
several times. It is never pretty. As for DocBook, it is meant to be
customized, and including an xinclude element in an authoring or
production version of DocBook would be quite OK, even though it
shouldn't be done in an exchange version. (Well, generally speaking. I
can imagine exceptions.)
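One way to read the authoring-versus-production distinction for the
outline case: an authoring DTD can let a section consist of a title
alone, while a production DTD insists on content. The Python sketch
below compares two hypothetical content models for a section element,
hand-translated to regular expressions. The models, the element names,
and the function name all_sections_match are assumptions made for
illustration, not taken from DocBook.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical authoring model:  (title, (para|section)*)
AUTHORING = re.compile(r"title (?:(?:para|section) )*")
# Hypothetical production model: (title, (para|section)+)
PRODUCTION = re.compile(r"title (?:(?:para|section) )+")

# An outline: section titles only, no body content yet.
OUTLINE = """<section><title>Book</title>
  <section><title>Part one</title></section>
  <section><title>Part two</title></section>
</section>"""


def all_sections_match(xml_text, model):
    """Return True if every section's direct children fit the model."""
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):  # includes the root section
        child_names = "".join(child.tag + " " for child in section)
        if model.fullmatch(child_names) is None:
            return False
    return True


print(all_sections_match(OUTLINE, AUTHORING))   # outline is acceptable
print(all_sections_match(OUTLINE, PRODUCTION))  # empty sections rejected
```

With the looser model the bare outline is acceptable while drafting, and
the stricter model can still be applied before the document goes to
production.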

> I've tried editors that attempt to maintain validity. And they're 
> just bloody annoying. Even if you want valid markup, they're either 
> pestering you with pop-ups; or  filling in what they think you'll add 
> and guessing wrong. (There's often more than one possible child 
> element that can be added to make a parent element valid.) Bottom 
> line: they get in your way. They are not smart enough to figure out 
> what needs to be done, but they do something anyway.

That is not the editor either. It is the customizations that are wrong.

I have been in the same situation myself, but never because of the
editor (not since I stopped using WordPerfect anyway). The problem is
with the DTD, the customizations, or (all too frequently) both.

An XML editor isn't a complete, ready-to-use-out-of-the-box authoring
solution. It is a platform for developing a customized authoring
environment. Not implementing a certain feature in the platform would
certainly prevent abuse, but it would also prevent correct use.

Personally, I would prefer giving authors more latitude than most
industrial authoring systems do, but to do that, the authors must be
educated about how to write structured documents, and about XML. Also,
good review policies must be in place. So far, I have only seen one
company that has even come close to doing that.
