One of the difficulties in factoring out functionally dependent entities from prose is that the block of prose may itself not be worth reusing. That is, the prose may be a one-shot document whose original intent is simply to present information, not to act as a reliable container for access by clients with a variety of intents.
One thing I've done is to try to identify those concepts which are best understood, are most firmly established, and which serve as the focus of the stakeholders' activities and communications. I then design a profile document for each of these high-level concepts; the profiles provide context for making pointers and for generating identifiers. They are designed to provide some elements which are rigidly structured, and other elements which are prose with mixed content. In at least one case, this allowed me (with a stylesheet) to resolve most cross references within the document itself, minimizing calls to scan external documents. Also, depending upon the nature of your data and your validation techniques, you may be able to use the mixed content prose as the source of the definitive information, rather than just as glue.
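To make the profile idea a little more concrete, here is a minimal Python sketch of resolving cross references inside a single profile document. It stands in for the stylesheet I mentioned, and every element and attribute name in it (profile, facts, fact, prose, xref, idref) is a hypothetical choice for illustration, not a real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile: rigidly structured <facts> plus mixed-content <prose>.
PROFILE = """\
<profile id="Miami">
  <facts>
    <fact id="state">Florida</fact>
    <fact id="population">1,234,000</fact>
  </facts>
  <prose>
    <para>The city of Miami, <xref idref="state"/> (pop. <xref idref="population"/>) is a sprawling city.</para>
  </prose>
</profile>
"""


def resolve_internal_refs(root):
    """Replace each <xref idref="..."/> with the text of the matching <fact>."""
    facts = {fact.get("id"): (fact.text or "") for fact in root.iter("fact")}
    for parent in list(root.iter()):
        previous = None  # last child of `parent` that was kept in place
        for child in list(parent):
            if child.tag == "xref" and child.get("idref") in facts:
                # Splice the fact text (plus the xref's trailing text) into
                # whatever text node precedes the xref, then drop the xref.
                text = facts[child.get("idref")] + (child.tail or "")
                if previous is None:
                    parent.text = (parent.text or "") + text
                else:
                    previous.tail = (previous.tail or "") + text
                parent.remove(child)
            else:
                previous = child  # unresolved references stay untouched
    return root


resolved = resolve_internal_refs(ET.fromstring(PROFILE))
print(resolved.find("prose/para").text)
```

Because the lookup table is built from the same document, no external file has to be scanned; a reference whose idref has no matching fact is simply left where it is.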
It is certainly something a good CMS can help with, but I've also used DSSSL and XSLT/XPath for doing just this sort of thing with reasonable results. You might also want to check out DITA by Michael Priestley et al. of IBM, which I think intends to facilitate topical reuse.
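As a rough illustration of the glue step itself, the following Python sketch swaps each ref element in a prose paragraph for the fragment its href names, which is roughly what a stylesheet would do with the ref/href glue in the quoted message below. The fragment contents (Name, State, Spring, Neap) and the in-memory FRAGMENTS store are assumptions made up for the example; the original fragments are not shown in full, and a real setup would load them from files or a CMS.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment store standing in for Miami.xml and MiamiBeachTides.xml.
# The child elements are invented; the tide values are left elided.
FRAGMENTS = {
    "Miami.xml": '<City id="Miami"><Name>Miami</Name><State>Florida</State></City>',
    "MiamiBeachTides.xml": (
        '<TideData id="MiamiBeachTides"><Spring>...</Spring><Neap>...</Neap></TideData>'
    ),
}

# Glue prose modeled on the quoted example.
GLUE = (
    '<para>The <ref href="Miami.xml"/> is a sprawling city with many attractions. '
    'The tides are <ref href="MiamiBeachTides.xml"/></para>'
)


def transclude(root, load):
    """Replace every <ref href="..."/> with the fragment returned by load(href)."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "ref":
                continue
            fragment = ET.fromstring(load(child.get("href")))
            fragment.tail = child.tail  # keep the prose that followed the ref
            parent[index] = fragment    # one-for-one swap, so indexes stay valid
    return root


merged = transclude(ET.fromstring(GLUE), FRAGMENTS.__getitem__)
print(ET.tostring(merged, encoding="unicode"))
```

The sketch does a single pass, so a ref inside a loaded fragment is not followed; whether nested inclusion is wanted is a design decision I have left open here.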
Roger L. Costello wrote:
> Hi Folks,
> I am working with some people who wish to migrate from an
> all-prose format to a prose-plus-reusable-XML-fragments format.
> They have some data in prose that is useable in many contexts. They
> want to break out that reusable data into XML fragments. However,
> they want to continue to provide the prose style.
> For example, consider this prose data:
> <para>The city of Miami, Florida (pop. 1,234,000) is a sprawling city
> with many attractions. Miami Beach is a popular attraction. The
> spring tide is ... The neap tide is ... </para>
> Examining this prose we can extract reusable info about the city of
> Miami:
> <City id="Miami">
> We can also extract reusable info about tide data on Miami Beach:
> <TideData id="MiamiBeachTides">
> The problem now is to create a framework which allows the prose
> to bring-together the independent, reusable XML components.
> Conceptually, what is desired is a "glue framework" like this:
> <para>The <ref href="Miami.xml"/> is a sprawling city with
> many attractions. Miami Beach is a popular attraction. The
> tides are <ref href="MiamiBeachTides.xml"/></para>
> Thus, the prose is "glueing" together the XML fragments.
> Is this a problem that you have experience with? What "glue
> framework" have you used? What strategy did you use to merge
> the XML fragments with the prose? Is there a standard way
> of combining semi-structured data with structured data?
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> The list archives are at http://lists.xml.org/archives/xml-dev/