xml-dev - RE: [xml-dev] The triples datamodel -- was Re: [xml-dev]SemanticWeb perm


To: "Kirkham, Pete (UK)" <pete.kirkham@baesystems.com>
Subject: RE: [xml-dev] The triples datamodel -- was Re: [xml-dev]SemanticWeb permathread, iteration n+1
From: Henrik Martensson <henrik.martensson@bostream.nu>
Date: Sun, 13 Jun 2004 19:42:00 +0200
Cc: Elliotte Rusty Harold <elharo@metalab.unc.edu>, XML Developer List <xml-dev@lists.xml.org>
In-reply-to: <820DBA1A8ECA1D45A557AFD03CF4DEE06E4FEE@glkms0015>
References: <820DBA1A8ECA1D45A557AFD03CF4DEE06E4FEE@glkms0015>

On Tue, 2004-06-08 at 12:01, Kirkham, Pete (UK) wrote:
> > Henrik Martensson
> > Funny you should bring XP up, because XP takes a very rigid
> > approach to testing, or validation.
> 
> Validation of software is not the same as validation against a schema:

I agree.

> 
> Testing allows the behaviour of the system under a certain subset of the inputs to be asserted to be consistent with a datum set of behaviours.

Yes, and consequently, it is necessary to choose the input data well, or
the tests may become meaningless.

> 
> Validation allows the behaviour of the system under a certain subset of the inputs to be asserted to be consistent with the user's conceptual model of the process the software embodies.

Ok.

> 
> Verification asserts the behaviour of the system under *all* cases to be conformant with a formally specified model.

Ok.

> 
> 
> I would suggest that schema validation, the use of a schema as a model against which to verify instances of XML, has more of the characteristics of software verification than of testing or validation.

Yes, seen as a standalone process, isolated from context. However, in
context, it is different. For example, it can be very useful to verify a
document against a schema as part of a test.

> 
> I've never heard of any use of verification in XP, only testing and validation.

I hadn't either, until I started doing it about a year and a half ago. I
was working on an XMetaL-based authoring client. I used automated unit
tests to validate the functionality I was developing. I found
verification of documents against the DTDs we were developing to be very
useful:

* There were a great number of document templates in the system.
  By verifying them against the DTD, it was easy to check that they were
  always up to date. This was particularly useful, because the people
  doing the DTD specifications went nuts and changed the specification
  more or less at random more than 80 times. (It turned out that the
  customer's "expert team" had never actually worked with XML, or
  document management systems, before.)
* I got a sort of reverse check for free: since all major structures
  in the DTD were represented in the templates, backwards incompatible
  DTD changes showed up as verification errors. This wasn't too
  dependable, of course; nevertheless, it had its uses.
* The editor tests that did not use the templates used minimal
  documents instead. Verifying against the DTD (which I got for free,
  no extra code, very little extra time) was a simple way to ensure that
  these tests were still relevant, i.e. that the structures in them were
  still compliant with the requirements of the rest of the system.
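
As a rough illustration of the template check described in the list above,
here is a minimal Python sketch. It does not perform real DTD validation:
Python's standard library has no validating parser, so a hand-written table
of allowed child elements stands in for the DTD. The element names, the
templates, and the functions structure_errors and check_templates are all
hypothetical, not taken from the project described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a DTD: for each element, the set of
# child elements it may contain. A real DTD also constrains order,
# cardinality and attributes, which this sketch ignores.
CONTENT_MODEL = {
    "doc": {"title", "section"},
    "section": {"title", "para", "list"},
    "list": {"item"},
    "item": {"para"},
    "title": set(),
    "para": {"emph"},
    "emph": set(),
}

# Hypothetical document templates, keyed by template name.
TEMPLATES = {
    "manual": "<doc><title/><section><title/><para/></section></doc>",
    "memo": "<doc><title/><section><para><emph/></para></section></doc>",
}


def structure_errors(xml_text, model=CONTENT_MODEL):
    """Return a list of messages for elements the model does not allow."""
    errors = []
    root = ET.fromstring(xml_text)
    if root.tag not in model:
        errors.append(f"unknown root element <{root.tag}>")
    # Walk every element and compare its children with the model.
    for parent in root.iter():
        allowed = model.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                errors.append(f"<{child.tag}> not allowed in <{parent.tag}>")
    return errors


def check_templates(templates=TEMPLATES, model=CONTENT_MODEL):
    """Map each out-of-date template name to its list of errors."""
    report = {}
    for name, text in templates.items():
        errors = structure_errors(text, model)
        if errors:
            report[name] = errors
    return report
```

Run as part of a unit test suite, an empty report from check_templates
means every template still matches the current model. When the model
changes, the affected templates show up by name. The same check could be
pointed at the minimal documents used by the editor tests.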

In all, the tests allowed me to keep up with the continuously changing
specifications. I finished my part of the project in six months, which
was within the original estimate. Other subprojects did not use this
kind of testing (they actually refused despite orders from the project
management). They are still at work on their parts, after more than
eighteen months...

> 
> Verification helps give people confidence in high integrity systems, at a cost of orders of magnitude in development and response times. The criticality of most systems doesn't currently justify that cost.

Well, yes, and no. For high volume, fully automated systems, that can
certainly be true, but when it comes to human authored XML documents,
the delays due to verification against DTDs are usually not perceptible.
Verification is usually triggered under the following circumstances:

* When a document is opened. Verification causes a delay that is
  insignificant compared to the time required to read, parse and
  format the document.
* When a user triggers it manually. Validation time is usually
  considerably less than a second. I have never encountered anyone
  bothered by this delay.
* When view mode is changed from text mode to a structured editing
  mode. Again, validation time is insignificant compared to all the
  other stuff going on. It is very useful though, because editing in
  text mode is risky.
* When a document is saved, or checked into a DMS. Again,
  verification time is negligible, particularly if a DMS is used.
  (Checking a document into a DMS is not a simple process.)

Thus, the cost is, in practice, very low. Criticality, on the other
hand, can be, and often is, very high.

Then there are other benefits of working with a DTD/schema. For example,
a DTD usually specifies 100-300 elements, and many more attributes.
Having tag lists 300 elements long would be very inefficient. It is far
better to show just the inline tags when a user edits inline, block
structures when the user is going to insert a new block, etc.
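
To make the context-sensitive tag list concrete, here is a minimal Python
sketch. It assumes the editor has already derived, from the DTD, a table of
which elements may appear inside which. The table contents and the function
insertable_tags are hypothetical, and a real editor would also take element
order and cardinality into account.

```python
# Hypothetical content model: which elements may be inserted inside which.
ALLOWED_CHILDREN = {
    "section": ["title", "para", "list", "table"],
    "para": ["emph", "code", "xref"],
    "list": ["item"],
    "item": ["para"],
}


def insertable_tags(current_element, model=ALLOWED_CHILDREN):
    """Return the sorted tag names an editor would offer at the cursor."""
    # Elements missing from the model are treated as having no children.
    return sorted(model.get(current_element, []))
```

With the cursor inside a para, the author is offered only the three inline
tags rather than every element in the DTD.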

When Ericsson went from unstructured authoring, using FrameMaker, to
structured authoring using the Arbortext SGML editor, the time required
to produce a single page of documentation went from 4.3 hours to 2.7
hours (a reduction of roughly 37%), and CD production time was reduced
by 50%. At the same time, the quality of the information was improved.
This was without a DMS. The
difference in time is likely to be at least partly due to verification
against a DTD, which reduced the amount of rework due to mistakes, and
the context awareness of the editing environment. (There is a case study
in "ABCD... SGML", by Liora Alschuler.)

It has been suggested in this mailing list, quite frequently, that
authoring documents with well-formed XML would be more efficient than
using validating XML editors. Well, using well-formed XML would be
similar to using FrameMaker, because authors can make up their own tags.
In reality though, it would be even slower, because while an author can
specify how to process a new tag (at least for some purposes) with
FrameMaker, an author using well-formed XML can't do the same thing.
(Even if he/she could create an XSL stylesheet, there is no way to apply
it, or package it with the document. There is no way to tell the DMS how
to process the new element, tell various post-processing systems what to
do, etc.) FrameMaker is actually a very good tool for unstructured
authoring, and I believe that it would beat most tools for authoring
well-formed XML hands down.

/Henrik
