xml-dev - An Architecture for Limericks (was Limericks, Stupidity, and Reality)

To: Mike Champion <mc@xegesis.org>, xml-dev@lists.xml.org
Subject: An Architecture for Limericks (was Limericks, Stupidity, and Reality)
From: Jonathan Robie <jonathan.robie@softwareag.com>
Date: Sat, 12 Jan 2002 08:26:37 -0500
In-reply-to: <93HFEAJH1X76P869OISRNL4F086UO.3c3f85e0@MChamp>
References: <5.1.0.14.0.20020111174224.035b4290@157.189.161.214>

Computer scientists don't write limericks without designing a general 
architecture for limericks.

Suppose I wanted to write the limerick constraint testing system that Mike 
wants. I would probably want to separate concerns among several different 
modules, and I would like to make it easy to look at the results of each 
step. I think that I am unlikely to persuade my limerick authors to write 
their limericks in finely grained markup, so I suspect they will give me 
texts like this:

There was a young lady named Bright
Whose speed was much faster than light.
She set out one day,
in a relative way,
And returned on the previous night.

With a Perl script, it is fairly easy to mark up the lines of this poem, 
and I would like to do this before running my syllabification, because it 
is quite likely that the syllabification engine will lose my whitespace, 
which is important for identifying the lines. If I already know this is a 
limerick, I could choose to divide this up into long and short lines from 
the beginning:

<limerick>
   <long>There was a young lady named Bright</long>
   <long>Whose speed was much faster than light.</long>
   <short>She set out one day,</short>
   <short>in a relative way,</short>
   <long>And returned on the previous night.</long>
</limerick>
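
A minimal sketch of that line-marking step, written in Python rather than Perl purely for illustration. It assumes the text is already known to be a five-line limerick with one line of verse per line of text; the name mark_up_limerick is hypothetical.

```python
from xml.sax.saxutils import escape

# Assumption: in a limerick, lines 1, 2 and 5 are long; lines 3 and 4 are short.
LINE_KINDS = ["long", "long", "short", "short", "long"]

def mark_up_limerick(text):
    """Wrap each non-blank line of a five-line limerick in <long> or <short>."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != len(LINE_KINDS):
        raise ValueError("expected 5 lines, got %d" % len(lines))
    body = "\n".join(
        # escape() protects any &, < or > that appears in the verse itself.
        "   <%s>%s</%s>" % (kind, escape(line), kind)
        for kind, line in zip(LINE_KINDS, lines)
    )
    return "<limerick>\n%s\n</limerick>" % body
```

Because the line boundaries are captured before any syllabification runs, a later stage is free to discard whitespace.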

For testing the rhyme scheme, further markup is probably not helpful. Also, 
I am probably not going to keep any markup I use for testing whether a line 
scans, so I will use a schema adjunct to declare the constraints on this 
poem. I use standard poetry terms for the metrical feet - here is a table 
of the terms, compared with their representation in Cowan Normal Form (CNF):

         iamb            da-DUM
         anapest         da-da-DUM
         tertius paeon   da-da-DUM-da

Here is a Schema Adjunct that declares the constraints on a limerick:

<schema-adjunct targetNamespace="http://www.example.com/limerick"
                 xmlns="http://www.schema-adjuncts.org/namespaces/2001/07/saf">

     <global>
         <rhymes>
                 <line select="limerick/long[1]" />
                 <line select="limerick/long[2]" />
                 <line select="limerick/long[3]" />
         </rhymes>
         <rhymes>
                 <line select="limerick/short[1]" />
                 <line select="limerick/short[2]" />
         </rhymes>
     </global>

     <element context="short">
         <scans>
             <sequence>
                 <choice>
                    <iamb />   <!-- da dum -->
                    <anapest /> <!-- da da dum -->
                 </choice>
                 <choice>
                    <iamb />
                    <anapest />
                 </choice>
             </sequence>
         </scans>
     </element>

     <element context="long">
         <scans>
             <sequence>
                 <choice>
                    <iamb />
                    <anapest />
                 </choice>
                 <choice>
                    <iamb />
                    <anapest />
                 </choice>
                 <choice>
                    <iamb />
                    <anapest />
                    <tertius.paeon /> <!-- da da dum da -->
                 </choice>
             </sequence>
         </scans>
     </element>

</schema-adjunct>
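
To suggest how an implementation might consume these declarations, here is a small Python sketch that reads the scans sections into plain lists. It assumes the adjunct is well-formed XML in the namespace shown above; the function name read_scansion and the shape of its return value are my own choices for illustration, not part of any Schema Adjunct tooling.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the adjunct above.
SAF = "{http://www.schema-adjuncts.org/namespaces/2001/07/saf}"

def read_scansion(adjunct_xml):
    """Map each element context to its declared sequence of foot choices.

    Returns e.g. {"short": [["iamb", "anapest"], ["iamb", "anapest"]]}.
    """
    root = ET.fromstring(adjunct_xml)
    declared = {}
    for element in root.findall(SAF + "element"):
        sequence = element.find(SAF + "scans/" + SAF + "sequence")
        if sequence is None:
            continue  # this element declares no scansion
        declared[element.get("context")] = [
            # Strip the namespace so only the foot name remains.
            [foot.tag[len(SAF):] for foot in choice]
            for choice in sequence.findall(SAF + "choice")
        ]
    return declared
```

The rhymes declarations under global could be read the same way, by collecting the select attribute of each line.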


So far, I have written no code, so I have no software that will tell me 
whether a line scans or whether a set of lines rhyme. However, I do have a 
way of declaring the structure of a poem in a Schema Adjunct, and I can use 
it to describe the structure of other kinds of poems as well. The specific 
algorithms for testing these constraints are up to the implementations, but 
I have already modularized the implementation.

I have also made the implementation easier to test - I can write test 
suites that take sets of words that are presumed to rhyme or not to rhyme, 
and see whether my system handles them correctly. I can do the same for 
scansion.
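
Such a test suite might look like the following Python sketch. The engine here, naive_rhymes, is a deliberately crude placeholder that compares spelling rather than sound, and the word pairs are illustrative; a real rhyming engine would be plugged in through the same function signature.

```python
def naive_rhymes(a, b, tail=2):
    """Placeholder engine: words 'rhyme' if their last `tail` letters match.

    A stand-in for illustration only. English spelling makes it wrong for
    many pairs, which is exactly what a test suite should expose.
    """
    a, b = a.lower().strip(".,;:!?"), b.lower().strip(".,;:!?")
    return a != b and a[-tail:] == b[-tail:]

# Each case is (word, word, expected verdict).
CASES = [
    ("Bright", "light", True),
    ("light", "night", True),
    ("day", "way", True),
    ("Bright", "day", False),
    ("though", "cough", False),  # spelled alike, pronounced differently
]

def run_suite(rhymes, cases):
    """Return the cases on which the engine disagrees with the expectation."""
    return [(a, b, want) for a, b, want in cases if rhymes(a, b) != want]
```

Calling run_suite(naive_rhymes, CASES) reports the though/cough pair as a failure, since the placeholder only looks at letters. The same harness works for any engine with the same call shape.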

Now suppose that more than one rhyming engine exists, and more than one 
scansion engine exists. Do these engines agree? If not, how do they 
disagree? Are there bugs in one or both of the engines, or are their 
answers both reasonable? If the answers to these questions are important to 
me, a concrete representation of the output of the engines may be very 
helpful. Without it, I would have to compare the source code of the 
systems, or try to create exhaustive sets of tests that would give me 
indications of how they work.

For instance, suppose I ask the software to test whether the following scans:

  <long>There was a young lady named Bright</long>

If it says that it does not, I may not be sure whether there is a bug in my 
software or an error in the line of the limerick. If there is a bug in my 
software, I may not know if the bug is in the syllabification per se, in 
the stress assigned to syllables, or in the comparison of the 
syllabification and stress to that required of a long line in a limerick. 
For testing purposes, output like the following can be very helpful indeed:

         <long>
                 <da>There</da>
                 <dum>was</dum>
                 <da>a</da>
                 <da>young</da>
                 <dum>la</dum>
                 <da>dy</da>
                 <da>named</da>
                 <dum>Bright</dum>
         </long>

Not only is this useful for testing, it is also useful for defining 
interfaces. For instance, I might well have a system that takes the above 
representation and compares it to the declared scansion for the long line 
of a limerick, as given in the above schema adjunct. This would be very 
simple to write.
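
As a rough indication of how simple, here is a Python sketch of that comparison. It assumes the da/dum output format shown above; the foot patterns come from the table of metrical feet, and the LONG_LINE list is transcribed by hand from the long-line declaration in the schema adjunct. The names FEET, LONG_LINE and scans are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stress patterns for the metrical feet (da = unstressed, dum = stressed).
FEET = {
    "iamb": ["da", "dum"],
    "anapest": ["da", "da", "dum"],
    "tertius.paeon": ["da", "da", "dum", "da"],
}

# The sequence of choices declared for a long line in the schema adjunct.
LONG_LINE = [["iamb", "anapest"], ["iamb", "anapest"],
             ["iamb", "anapest", "tertius.paeon"]]

def scans(line_xml, choices):
    """True if the line's da/dum children match one foot per choice, in order."""
    stresses = [syllable.tag for syllable in ET.fromstring(line_xml)]

    def match(pos, remaining):
        if not remaining:
            return pos == len(stresses)  # every syllable must be consumed
        return any(
            stresses[pos:pos + len(FEET[foot])] == FEET[foot]
            and match(pos + len(FEET[foot]), remaining[1:])
            for foot in remaining[0]
        )

    return match(0, choices)
```

For the line above, the syllables read da-DUM, da-da-DUM, da-da-DUM, that is, an iamb followed by two anapests, so scans accepts it against LONG_LINE. If the syllabifier or the stress assigner had produced something else, the intermediate markup would show which step went wrong.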

In general, when designing complex systems, I think it is very helpful to 
think in terms of declarative, testable architectures.

Jonathan
