
   Re: Architectural Forms and XAF

  • From: "W. Eliot Kimber" <eliot@isogen.com>
  • To: Sean McGrath <digitome@iol.ie>
  • Date: Sun, 05 Mar 2000 19:16:53 -0600

Sean McGrath wrote:
> 
> [Steve Newcomb on AFs]
> 
> >When looking at such an element, select the attribute that
> >corresponds to the meta DTD you're interested in interpreting the
> >element as conforming to; the value of that attribute is the tagname
> >for all purposes of that meta DTD.)
> >
> 
> In other words, (as I have said before, years ago
> on comp.text.sgml), there
> is nothing "meta" about meta DTDs.
> They are essentially, a device for
> simple one-to-one mappings of element
> type name to element type name with some
> extra bells on.
> 
> In my world, this facility fails the "so what"
> test.
> 
> Oh how I wish the real world were so simple,
> that a basic mapping system sufficed to
> get real work done.

Then you are missing the point of architectures. The point is not to
enable transformations (although they can be used that way in some
cases). The point is to enable a document to *declare* what higher-order
things it conforms to in order to enable validation against those
higher-order things (e.g., I am a valid HyTime document and you can
prove it) and to enable processors to easily *recognize* that certain
elements and attributes are mapped to semantic constructs they know how
to process. That is, it lets you capture knowledge about how a document
relates to some more general taxonomy of types in the document itself
rather than in the code that processes the document. That makes the
knowledge generally available to *any* processor, rather than only to
the code that embodies the knowledge.
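
To make "recognize" concrete, here is a minimal sketch (not code from any
real system). It assumes an XML instance in which the architectural
attribute, here hypothetically named journal-arch, appears literally on
each element; in practice it would more often be a fixed value in the
DTD. The element names and the forms_in helper are invented for
illustration. The point is that the processor knows only the attribute
name, never the specialized tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: specialized element types declare the
# architectural form they conform to in a "journal-arch" attribute.
SAMPLE = """
<journal>
  <roll.call journal-arch="oob">
    <clerk.note journal-arch="para">Quorum present.</clerk.note>
  </roll.call>
  <third.reading journal-arch="oob"/>
  <local.thing/>
</journal>
"""

def forms_in(xml_text, archname):
    """Return (tag, form) pairs for every element that declares a form."""
    root = ET.fromstring(xml_text)
    pairs = []
    for elem in root.iter():
        form = elem.get(archname)
        if form:  # elements without the attribute are not architectural
            pairs.append((elem.tag, form))
    return pairs

for tag, form in forms_in(SAMPLE, "journal-arch"):
    print("%s is a %s" % (tag, form))
```

Any other processor that knows the same attribute name gets the same
answers without sharing any code with this one.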

If your XML work mostly involves applying specialized processing to
documents over which you have complete control, then architectures may
add little additional value. But if you are applying processing to a
large set of documents over which you have little or no control (e.g.,
the generic HyTime or XLink or Topic Map scenario) then architectures
are the *only possible solution* that I've been able to think of.
However, my experience is that there are few non-trivial XML problems
that cannot be solved more effectively by using architectures *in
addition to* our traditional set of tools.

Architectures also provide an effective way to formally capture design
knowledge at different levels of specificity, which is vitally important
when building up systems of related document types within an enterprise
or industry.

Also, when I teach architectures I try to be clear that architectures do
not, as Sean points out, solve every processing problem. Far from it,
and for those problems something like XSLT is the ideal solution. But
there is a significant class of problems for which architectures are
either just the right solution or an important and cost-effective
component of a larger solution.

Here is an immediate and practical example from my current project with
the State of Texas: One of our challenges is managing the markup for the
daily House and Senate journals (the Texas Legislature is a traditional
American bicameral body). On their face, journals are very simple, but
we discovered that in fact they are quite complex *if* you try to
capture the details about each different legislative order of business
described in the journal. We decided we did want to do this in order to
enable the automatic generation of indexes and to help drive authoring.
The problematic side effect of this is that each of the House and Senate
DTDs has about 740 element types: 200+ types, one for each distinct order
of business (OOB), plus two unique subelements for each OOB (plus
assorted other stuff to fill out the other 100 or so types).

Now, imagine the practical problem of writing a complete style sheet for
these DTDs: no fewer than 740 separate element-in-context rules, likely
two or three times that because of contexts. Such a style sheet becomes
almost impossible to create and manage.

But...

If you generalize up one level, you discover that there are in fact only
about 25 distinct *types* of element types. These 25 types also map
pretty much one-to-one to the formatting distinctions needed to render
journals. Hmmm.

If we capture these 25 types as an architecture, we get the following:

1. A clear, formal definition of these essential differences,
uncluttered by the noise of the 740+ specialized element types.
2. A simple syntactic structure to hook generalized processing to (an
attribute whose value will be one of the 25 types)
3. A clear definition, *within the journal DTDs*, of the basic type each
specialized type maps to, making the DTD more self describing and making
the knowledge of the binding generally available.
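
The leverage in points 1 and 2 can be sketched in a few lines of Python.
This is only an illustration: the element type names, the rule
descriptions, and the rule_for helper are invented, and the mapping is
written out as a literal here, whereas the real one lives in the DTD. The
rule table is keyed by form, so it grows with the number of forms, not
with the number of element types.

```python
# Hypothetical mapping, as it might be read from the DTD:
# specialized element type name -> architectural form.
FORM_OF = {
    "roll.call": "oob",
    "third.reading": "oob",
    "roll.call.params": "params",
    "third.reading.params": "params",
    "roll.call.outcome": "disposition",
    "third.reading.outcome": "disposition",
}

# One formatting rule per form, not one per element type.
RULES = {
    "oob": "bold heading",
    "params": "green block",
    "disposition": "indented paragraph",
}

def rule_for(gi):
    """Look up the formatting rule for an element type via its form."""
    return RULES.get(FORM_OF.get(gi), "inherit")

distinct_forms = set(FORM_OF.values())
print("%d element types, %d rules" % (len(FORM_OF), len(distinct_forms)))
print(rule_for("third.reading"))
```

Adding another order of business means adding entries to the mapping,
not writing new rules.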

So we did this. After adding the "journal architecture" mapping to the
journal DTDs, it became possible to write simple programs like the one
shown below, which generates an ADEPT*Editor style sheet (FOSI). This
program is trivially easy to maintain, in sharp contrast to the
alternative described above. The indirection of the architecture, which,
as Sean says, really isn't very involved, gives us a huge amount of
leverage. I would estimate that adding this architecture to the DTDs has
already saved me personally 40-80 hours in time lost due to the sheer
number of element types in these two DTDs.

The Python program that generates the FOSI is shown below in its
entirety (it uses GroveMinder to access the DTD, but you could use any
similar mechanism for accessing the DTD properties). Note that most of
the script is the text of the output FOSI, not the code that implements
the mapping. Note too how easy it is to detect and act on the
architectural mapping.

#---------------------------------------------------------------------------
# Make Journal FOSI (SALSA Project)
#
# Copyright (c) 1998, ISOGEN International Corp.
#
# Author: W. Eliot Kimber
#
# $Id: mkjournalfosi.py,v 1.3 2000/02/14 23:24:46 eliot Exp $
#
# This software may be used for any purpose without restriction.
#
# Given an input journal document, generates a FOSI that reflects
# every element type and sets the appropriate default values. The
# generated FOSI may need further refinement to be completely usable.
#
# Arguments:
#  document_filename  -- Name of SGML document you want to generate a
#                        FOSI for
#
# $Log: mkjournalfosi.py,v $
#
#---------------------------------------------------------------------------

import sys, string
import gmsdql
from GroveMinder import SystemError, NotInClassError
from string import ljust

def arch_form_of_type(et, archname):
    "Returns the architectural form of the type"
    form = None

    if hasattr(et, "AttributeDefs"):
        attdefs = et.AttributeDefs
        try:
            archatt = attdefs[archname]
            if archatt.data() != "":
                form = archatt.data()
        except KeyError:
            pass
    else:
        print "No attribute definitions"

    return form

def process_elemtypes(doctype, outfile):
    "Iterates over the element types and generates EICs as appropriate"

    elemtypes = doctype.ElementTypes

    # Process element types in alphabetical order

    names = elemtypes.keys()
    names.sort()

    for name in names:
        et = elemtypes[name]
        gi = string.lower(et.Gi)
        form = arch_form_of_type(et, "JOURNAL-ARCH")

        # EICs for default context:

        outfile.write("<e-i-c gi='%s'>\n" % gi)
        outfile.write(" <charlist>\n")

        if form == "oob":
            outfile.write("<!-- oob -->\n")
            outfile.write('  <usetext source="\\%s:\\" placemnt="before">\n' % gi)
            outfile.write('   <subchars>\n')
            outfile.write('    <font weight="bold">')
            outfile.write('    <presp minimum="10" nominal="12" maximum="14">\n')
            outfile.write('    <postsp minimum="10" nominal="12" maximum="14">\n')
            outfile.write('    <textbrk startln="1" endln="1">\n')
            outfile.write('   </subchars>\n')
            outfile.write('  </usetext>\n')
        elif form == "params":
            outfile.write("<!-- params -->\n")
            outfile.write('  <highlt fontclr="green">')
            outfile.write('  <textbrk startln="1" endln="1">\n')
            outfile.write('  <usetext source="\\Parameters:\\" placemnt="before">\n')
            outfile.write('   <subchars>\n')
            outfile.write('    <textbrk startln="1" endln="1">\n')
            outfile.write('   </subchars>\n')
            outfile.write('  </usetext>\n')
            outfile.write('  <usetext source="\\=================\\" placemnt="after">\n')
            outfile.write('   <subchars>\n')
            outfile.write('    <textbrk startln="1" endln="1">\n')
            outfile.write('   </subchars>\n')
            outfile.write('  </usetext>\n')
        elif form == "disposition":
            outfile.write("<!-- disposition -->\n")
            outfile.write('  <indent firstln="0.5in">\n')
            outfile.write('  <quadding inherit="1">\n')
            outfile.write('  <presp minimum="10" nominal="12" maximum="14">\n')
            outfile.write('  <textbrk startln="1" endln="1">\n')
        elif form == "header":
            outfile.write("<!-- header -->\n")
            outfile.write('  <font weight="bold">')
            outfile.write('  <quadding quad="center">\n')
            outfile.write('  <presp minimum="10" nominal="12" maximum="14">\n')
            outfile.write('  <postsp minimum="10" nominal="12" maximum="14">\n')
            outfile.write('  <textbrk startln="1" endln="1">\n')
        elif form in ("para", "journal.para"):
            outfile.write("<!-- para -->\n")
            outfile.write('  <indent firstln="0.5in">\n')
            outfile.write('  <quadding inherit="1">\n')
            outfile.write('  <presp minimum="10" nominal="12" maximum="14">\n')
            outfile.write('  <textbrk startln="1" endln="1">\n')
        elif form == "line":
            outfile.write("<!-- line -->\n")
            outfile.write('  <font inherit="1">\n')
            outfile.write('  <quadding inherit="1">\n')
            outfile.write('  <textbrk startln="1" endln="1">\n')
        elif form == "parameter":
            # Parameters outside of paragraphs or other contexts
            # (e.g., within param.* elements or parameter groups)
            outfile.write("<!-- parameter -->\n")
            # Special case for specific parameters:
            outfile.write('  <font inherit="1">\n')
            outfile.write('  <indent firstln="0.5in">\n')
            outfile.write('  <quadding inherit="1">\n')
            outfile.write('  <highlt inherit="1">\n')
            outfile.write('  <textbrk startln="1" endln="1">\n')
            outfile.write('  <usetext source="\\%s:[\\" placemnt="before"></usetext>\n' % gi)
            outfile.write('  <usetext source="\\]\\" placemnt="after"></usetext>\n')
        elif form in ("ubr", "param.ref"):
            outfile.write("<!-- ubr -->\n")
            outfile.write('  <font inherit="1">\n')
            if gi == "reso.ubr":
                outfile.write('  <presp minimum="10" nominal="12" maximum="14">\n')
                outfile.write('  <postsp minimum="10" nominal="12" maximum="14">\n')
                outfile.write('  <textbrk startln="1" endln="1">\n')
        elif form == "legal.text":
            outfile.write("<!-- legal.text -->\n")
            if gi == "added.text":
                outfile.write('  <font inherit="1">\n')
                outfile.write('  <highlt scoring="1" scorespc="0">\n')
            elif gi == "deleted.text":
                outfile.write('  <font inherit="1">\n')
                outfile.write('  <highlt fontclr="red">\n')
            else :
                outfile.write('  <font inherit="1">\n')
        else:
            outfile.write('  <font inherit="1">\n')
            outfile.write('  <quadding inherit="1">\n')
            outfile.write('  <highlt inherit="1">\n')

        outfile.write(" </charlist>\n")

        # Attribute-based rules.
        if form in ("ubr", "param.ref"):
            outfile.write('<!-- ubr -->\n')
            outfile.write(' <att>\n')
            outfile.write('  <fillval attname="generated-text" fillcat="usetext" fillchar="source">\n')
            outfile.write('  <charsubset>\n')
            outfile.write('   <usetext></usetext>\n')
            outfile.write('  </charsubset>\n')
            outfile.write(' </att>\n')

        outfile.write("</e-i-c>\n")

        # EICs for specific contexts:

        # In the rendition variant, parameters can occur in "paragraph"
        # contexts, so we have to give them different formatting in those
        # contexts.

        if form in ("parameter", "parameter.group"):
            #
            # journal.para
            #
            outfile.write('<e-i-c gi="%s" context="journal.para">\n' % gi)
            outfile.write(' <charlist>\n')
            # Special case for specific parameters:
            if gi in ("reso.type", "reso.number", "bill.type",
                      "bill.number", "bill", "reso"):
                outfile.write('  <font weight="bold" inherit="1">\n')
            else:
                outfile.write('  <font inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            #
            # param.line
            #
            outfile.write('<e-i-c gi="%s" context="param.line">\n' % gi)
            outfile.write(' <charlist>\n')
            # Special case for specific parameters:
            if gi in ("reso.type", "reso.number", "bill.type",
                      "bill.number", "bill", "reso"):
                outfile.write('  <font weight="bold" inherit="1">\n')
            else:
                outfile.write('  <font inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            #
            # para (simple para)
            #
            outfile.write('<e-i-c gi="%s" context="para">\n' % gi)
            outfile.write(' <charlist>\n')
            # Special case for specific parameters:
            if gi in ("reso.type", "reso.number", "bill.type",
                      "bill.number", "bill", "reso"):
                outfile.write('  <font weight="bold" inherit="1">\n')
            else:
                outfile.write('  <font inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')


            #
            # listing.vr (listing for a vote record)
            #
            outfile.write('<e-i-c gi="%s" context="listing.vr">\n' % gi)
            outfile.write(' <charlist>\n')
            # Special case for specific parameters:
            if gi in ("reso.type", "reso.number", "bill.type",
                      "bill.number", "bill", "reso"):
                outfile.write('  <font weight="bold" inherit="1">\n')
            else:
                outfile.write('  <font inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')


        # Need to account for parameters within person.name,
        # adjourn.time, and resume.time

        # Dispositions are another unique context.
        elif form == "disposition":
            outfile.write('<e-i-c gi="outcome" context="%s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            outfile.write('<e-i-c gi="bill" context="%s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font weight="bold" inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            outfile.write('<e-i-c gi="bill.type" context="bill %s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font weight="bold" inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            outfile.write('<e-i-c gi="bill.number" context="bill %s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font weight="bold" inherit="1">\n')
            outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            outfile.write('<e-i-c gi="reso" context="%s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font weight="bold" inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            outfile.write('<e-i-c gi="reso.type" context="reso %s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font weight="bold" inherit="1">\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

            outfile.write('<e-i-c gi="reso.number" context="reso %s">\n' % gi)
            outfile.write(' <charlist>\n')
            outfile.write('  <font weight="bold" inherit="1">\n')
            outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
            outfile.write(' </charlist>\n')
            outfile.write('</e-i-c>\n')

    # Special case E-I-Cs that are not based on architectural mapping go
    # here:

    outfile.write('<e-i-c gi="bill.type" context="bill para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="bill.type" context="bill journal.para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="bill.type" context="bill param.line">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="bill.number" context="bill para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="bill.number" context="bill journal.para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="bill.number" context="bill param.line">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="reso.type" context="reso para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="reso.type" context="reso journal.para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="reso.type" context="reso param.line">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="reso.number" context="reso para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="reso.number" context="reso journal.para">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="reso.number" context="reso param.line">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    # person.name elements:

    outfile.write('<e-i-c gi="first.name" context="person.name">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="middle.name" context="person.name">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="last.name" context="person.name">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="honorific" context="person.name">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="after"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="nickname" context="person.name">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write("  <usetext source='\\" + '"' + " \\' placemnt='before'></usetext>\n")
    outfile.write("  <usetext source='\\" + '"' + " \\' placemnt='after'></usetext>\n")
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')

    outfile.write('<e-i-c gi="title" context="person.name">\n')
    outfile.write(' <charlist>\n')
    outfile.write('  <font inherit="1">\n')
    outfile.write('  <usetext source="\\ \\" placemnt="before"></usetext>\n')
    outfile.write(' </charlist>\n')
    outfile.write('</e-i-c>\n')


#---------------------------
def main(argv):
    if len(argv) < 2:
        print "Usage: mkjournalfosi.py filename.sgm"
        sys.exit(1)

    grove = gmsdql.construct_SGML_grove(argv[1])
    doctype = grove.GoverningDoctype
    if doctype:
        outfile = open("newfosi.fos", "w")
        outfile.write("<!-- Generated FOSI for document %s -->\n" % argv[1])

        process_elemtypes(doctype, outfile)

        outfile.write("<!-- End of FOSI  -->\n")
        # Close explicitly so the generated FOSI is flushed to disk.
        outfile.close()
    sys.exit(0)

if __name__ == "__main__":
    main(sys.argv)
#---- End of script ---
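
The four context-specific blocks for parameters (journal.para,
param.line, para, listing.vr) differ only in the context name, so they
could be folded into a single loop. The following is a sketch of that
refactoring, not part of the script as I run it; the names BOLD_PARAMS,
PARAM_CONTEXTS, and write_param_context_eics are invented here, and
io.StringIO stands in for the output file.

```python
import io

# Parameter element types that are rendered bold in paragraph contexts.
BOLD_PARAMS = ("reso.type", "reso.number", "bill.type",
               "bill.number", "bill", "reso")

# Paragraph-like contexts in which parameters need their own e-i-c.
PARAM_CONTEXTS = ("journal.para", "param.line", "para", "listing.vr")

def write_param_context_eics(outfile, gi):
    """Write one e-i-c per paragraph-like context for a parameter element."""
    if gi in BOLD_PARAMS:
        font = '  <font weight="bold" inherit="1">\n'
    else:
        font = '  <font inherit="1">\n'
    for context in PARAM_CONTEXTS:
        outfile.write('<e-i-c gi="%s" context="%s">\n' % (gi, context))
        outfile.write(' <charlist>\n')
        outfile.write(font)
        outfile.write(' </charlist>\n')
        outfile.write('</e-i-c>\n')

buf = io.StringIO()
write_param_context_eics(buf, "bill.number")
print(buf.getvalue())
```

In process_elemtypes this would be called once, under the
form in ("parameter", "parameter.group") test.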

I could have done the same thing by hard-coding the mapping to the
journal architecture in the script (which is the implication of Sean's
approach), but that would have bound the knowledge that mapping
represents into the script, rather than making it generally available to
any processor, which is what the architectural mapping does. 
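
As a rough illustration of "any processor", here is a second, unrelated
way to get at the same mapping with no GroveMinder at all. It is a
sketch under stated assumptions: the DTD fragment and element names are
invented, the architectural attribute is assumed to be declared #FIXED
with one attribute per ATTLIST declaration, and a regular expression
stands in for a real DTD parser (it would not survive parameter
entities or multi-attribute declarations).

```python
import re

# Hypothetical DTD fragment: each specialized element type fixes the
# value of the architecture attribute to the form it maps to.
DTD = '''
<!ELEMENT roll.call      (#PCDATA)>
<!ATTLIST roll.call      journal-arch NAME #FIXED "oob">
<!ELEMENT motion.adopted (#PCDATA)>
<!ATTLIST motion.adopted journal-arch NAME #FIXED "disposition">
<!ELEMENT first.name     (#PCDATA)>
'''

# Matches: <!ATTLIST element attribute declared-type #FIXED "value">
ATTLIST = re.compile(
    r'<!ATTLIST\s+(\S+)\s+(\S+)\s+\S+\s+#FIXED\s+"([^"]*)"\s*>')

def arch_mapping(dtd_text, archname):
    """Return {element type name: form} for one architecture attribute."""
    mapping = {}
    for gi, attname, value in ATTLIST.findall(dtd_text):
        # SGML names are case-insensitive by default, so compare folded.
        if attname.lower() == archname.lower():
            mapping[gi] = value
    return mapping

print(arch_mapping(DTD, "JOURNAL-ARCH"))
```

Nothing in this fragment knows what the FOSI generator does, yet it
arrives at the same element-type-to-form binding, because the binding is
stated in the DTD.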

It's really just a question of where you bind your knowledge about the
data: in the code that processes it or in the data itself. I always
prefer to bind my knowledge to the data.

Cheers,

E.

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@xml.org&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************




 
