Re: [xml-dev] What techniques do you employ to ensure that your data is precisely and unambiguously specified?
- From: "G. Ken Holman" <gkholman@CraneSoftwrights.com>
- To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
- Date: Mon, 24 Sep 2012 12:41:36 -0400
At 2012-09-24 16:12 +0000, Costello, Roger L. wrote:
>How carefully do you specify your data? Is it free from ambiguity
>and misinterpretation?
>
>Although all data should be specified precisely and without
>ambiguity, the consequences of data being imprecise or ambiguous
>range from minor to catastrophic. There is a name for the latter
>data: level 1 data.
>
> Level 1 (critical) data: the data must be specified
> precisely with absolutely no ambiguity. For if the data
> is misinterpreted, then there is a real
> possibility of a loss of human life or financial,
> political, or personal calamity.
>
>When you create your XML Schemas do you identify the level of the
>data? For example, do your XML documents contain a <level> element:
>
> <Document>
> <level>level 1 (critical) data</level>
> ...
> </Document>
>
>Has anyone created an XML Schema for the various levels?
>
>[Key Question:] If you have level 1 data what techniques do you
>employ to ensure that there is virtually zero chance for the data to
>be misunderstood or misinterpreted?
At a Balisage conference many years ago (if readers haven't heard of
Balisage, it is *the* conference to go to!) I presented my use of
what I called "reference numbers in an XPath file" in order to
unambiguously specify XML data that belongs at a particular place on
the printed page when writing a stylesheet.
The problem being solved is that fully-qualified absolute XPath
addresses are too lengthy to write into small boxes on the printed
page of an invoice. In an "XPath file" each reference number is a
compact cross reference to a fully-qualified absolute XPath
address. I mock up the end result on a piece of paper and ask the
committee member to fill in the boxes of the page with the reference
numbers guiding me unambiguously on how to write my stylesheet. When
working with clients, I ask them both to create the mockup and
populate the page with a set of reference numbers.
One can produce these reference numbers either from the schema (I
only supported UBL schemas because of the regular nature by which the
schemas are expressed; my implementation does not work for arbitrary
schemas), or from an arbitrary instance (doesn't have to be UBL, can
be any XML document).
It turns out the "instance XPath file" has become very helpful in
other projects and with clients. Take any XML instance, create its
XPath file, then cite the reference numbers to unambiguously talk
about any element or attribute in the document.
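To make the idea concrete, here is a minimal Python sketch of building such a numbered list from an arbitrary instance. It is not the Crane-xml2xpath stylesheet and does not reproduce its output format: the function name xpath_file, the sample invoice, and the numbering order (each element, then its attributes, then its children, in document order) are all assumptions made for illustration. Namespaced names would appear in ElementTree's {uri}local form rather than with prefixes.

```python
import xml.etree.ElementTree as ET


def xpath_file(xml_text):
    """Return (reference number, absolute XPath) pairs in document order.

    Hypothetical helper: the numbering scheme here is an assumption,
    not the scheme used by the Crane stylesheets.
    """
    root = ET.fromstring(xml_text)
    entries = []

    def visit(elem, path):
        entries.append(path)
        # Attributes of an element are listed right after the element itself.
        for name in elem.attrib:
            entries.append(f"{path}/@{name}")
        # Count same-named siblings so every step gets a positional predicate.
        seen = {}
        for child in elem:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            visit(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    visit(root, f"/{root.tag}[1]")
    return list(enumerate(entries, start=1))


# Made-up sample document, not a real UBL instance.
SAMPLE = (
    '<Invoice currency="CAD"><ID>42</ID>'
    "<Line><Amount>10</Amount></Line>"
    "<Line><Amount>20</Amount></Line></Invoice>"
)

for number, path in xpath_file(SAMPLE):
    print(f"{number:>3}  {path}")
```

With a list like this in hand, "put 5 in the top-right box" is enough to identify /Invoice[1]/Line[1]/Amount[1] without writing the address out.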
But XPath files are fragile ... if you change anything about the
document, the reference numbers break and you have to recreate the
XPath file, which then might invalidate any work you've done already
by citing the old reference numbers. So I just tell my client to
freeze the XML document before generating the XPath file for it, then
use the reference numbers without changing the file.
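The fragility is easy to see in a small Python sketch. It assumes reference numbers are handed out sequentially in document order, which is an illustrative assumption and not a description of the Crane stylesheets; numbered_paths and both sample documents are hypothetical, and the paths are simplified (no positional predicates, no attributes).

```python
import xml.etree.ElementTree as ET


def numbered_paths(xml_text):
    """Map reference number -> simplified element path, in document order."""
    root = ET.fromstring(xml_text)
    paths = {}

    def visit(elem, path):
        # The next free number goes to the element being visited.
        paths[len(paths) + 1] = path
        for child in elem:
            visit(child, f"{path}/{child.tag}")

    visit(root, f"/{root.tag}")
    return paths


# The same made-up invoice before and after someone adds a <Note> element.
BEFORE = "<Invoice><ID>42</ID><Total>30</Total></Invoice>"
AFTER = "<Invoice><ID>42</ID><Note>rush</Note><Total>30</Total></Invoice>"

old, new = numbered_paths(BEFORE), numbered_paths(AFTER)

# Reference numbers whose meaning changed between the two versions.
moved = {n: (old[n], new[n]) for n in old if new.get(n) != old[n]}
print(moved)
```

Here reference number 3 pointed at the total in the first version and points at the note in the second, so any mockup already annotated with 3 is now wrong. That is why the document gets frozen first.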
You can download the resources I have here:
http://www.CraneSoftwrights.com/resources/ubl/index.htm#xpath
Consider the Crane-xml2xpath package. Download the stylesheets and
you'll find that you can feed any arbitrary XML document into
Crane-xml2xpath.xsl to produce the XPath file in text. You use
Crane-xml2xpath-html.xsl to produce the XPath file in HTML.
Then, should you wish, you can convert the original file into a new
XML document I call the "XPath Instance" document. This document
mirrors the structure of the input XML document but populates every
element and every attribute with the corresponding reference
number. This is helpful when writing stylesheets because stylesheets
do not (necessarily) do validation. So when you've written your
stylesheet and you put the XPath instance document in as input, what
you end up getting in output is the set of reference numbers laid out
on the page to use for a visual validation that your stylesheet
algorithm is correct.
Now, since UBL stylesheets only do layout and no calculations, there
was never a problem with the reference numbers in the XPath instance
document messing up arithmetic. But don't even try to validate an
XPath instance document against its schema: the reference numbers
are written as "!###!" (the bangs are there to distinguish two adjacent
reference numbers in the formatted result), and that text will not
satisfy most datatype constraints.
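As a rough illustration of what such an XPath instance looks like, the Python sketch below rewrites a made-up document so that its values become reference numbers. Several things here are assumptions rather than the behaviour of the Crane stylesheets: the function name xpath_instance, the numbering order (element, then its attributes, then its children), reading "###" as a plain decimal number, and the choice to write text only into leaf elements so that containers stay element-only.

```python
import xml.etree.ElementTree as ET


def xpath_instance(xml_text):
    """Return a copy of the document with values replaced by "!n!" markers."""
    root = ET.fromstring(xml_text)
    number = 0

    def visit(elem):
        nonlocal number
        number += 1
        # Assumption: only leaf elements receive text, so element-only
        # containers do not turn into mixed content.
        if len(elem) == 0:
            elem.text = f"!{number}!"
        for name in list(elem.attrib):
            number += 1
            elem.set(name, f"!{number}!")
        for child in elem:
            visit(child)

    visit(root)
    return ET.tostring(root, encoding="unicode")


# Made-up sample document, not a real UBL instance.
SAMPLE = (
    '<Invoice currency="CAD"><ID>42</ID>'
    "<Line><Amount>10</Amount></Line>"
    "<Line><Amount>20</Amount></Line></Invoice>"
)

print(xpath_instance(SAMPLE))
```

If a stylesheet prints two of these values side by side with no separator, the result reads as "!3!!5!" and the two numbers are still distinguishable, whereas a bare "35" would not be.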
Also, since UBL documents are exclusively element content, my
stylesheets are not geared for mixed content.
So, given those limitations, I do not consider this environment as
general purpose. But it is incredibly useful for UBL documents and
will be useful for any element-content XML document. It isn't a
standard approach, but it is a useful approach.
I hope this is helpful.
. . . . . . . . . . . . Ken
--
Contact us for world-wide XML consulting and instructor-led training
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Google+ profile: https://plus.google.com/116832879756988317389/about
Legal business disclaimers: http://www.CraneSoftwrights.com/legal