Here are three random things which may be useful
to consider.
1) The first is that DSSSL allows you to have external
functions. So even though DSSSL itself has no way to query the pagination
system, it does let you plug in your own queries or functions, and you can
do all sorts of tricks with these. I don't know to what extent Jade supports
this, though. One trouble with stream-based SGML processors is that they often
buffer their output (or sit in a pipe), so unless you can flush the output
buffers, your SGML processor may be left stranded while it waits for feedback
from a downstream program.
A DSSSL system built on top of a general purpose
Scheme would be most likely to cope with feedback from layout engines.
Tony Graham of the DSSSL list would be a good contact in this
regard.
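To make the buffering problem in point 1 concrete, here is a minimal
sketch in Python rather than DSSSL. Everything in it is an assumption made
for illustration: the "layout engine" is a hypothetical stand-in that
answers each line with a page number, and paginate() is an invented
helper. The only point it demonstrates is the flush after every write.

```python
import subprocess
import sys

# Hypothetical stand-in for a downstream layout engine: it reads one line
# of text at a time and answers with the "page" that line would land on
# (two lines per page, purely for illustration).
LAYOUT_ENGINE = r"""
import sys
n = 0
while True:
    line = sys.stdin.readline()
    if not line:
        break
    n += 1
    sys.stdout.write("page %d\n" % ((n + 1) // 2))
    sys.stdout.flush()   # the engine has to flush its answers as well
"""

def paginate(lines):
    """Send each line downstream and read the feedback before continuing."""
    engine = subprocess.Popen(
        [sys.executable, "-c", LAYOUT_ENGINE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    answers = []
    for line in lines:
        engine.stdin.write(line + "\n")
        # Without this flush the line may sit in our output buffer while
        # we block on readline(), and both processes wait forever.
        engine.stdin.flush()
        answers.append(engine.stdout.readline().strip())
    engine.stdin.close()
    engine.wait()
    engine.stdout.close()
    return answers

if __name__ == "__main__":
    print(paginate(["first", "second", "third"]))
```

If either flush is removed, the exchange can deadlock as soon as the
output no longer fills a buffer on its own, which is the stranding
described above.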
2) People often put pagination information in processing
instructions. Or the information can be kept in an external database with,
for example, HyTime locators. If you can decide in advance to only break pages
on paragraph boundaries, then you can piggyback the pagination information on
top of element markup.
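As an illustration of the processing-instruction approach in point 2, the
following sketch collects pagination PIs and the text offset at which each
one occurs. The PI target "page-break", its n="..." data, the sample
document and the helper page_breaks() are all made up for this example; a
real system would use whatever PI its formatter writes.

```python
from xml.dom import minidom

# Hypothetical sample: the PI name and its data are invented for this sketch.
SAMPLE = (
    '<doc><p>here is some<?page-break n="2"?> data</p>'
    '<p>of no interest.</p></doc>'
)

def page_breaks(xml_text, target="page-break"):
    """Return (character offset, PI data) for each pagination PI."""
    breaks = []
    offset = 0  # characters of text seen so far, in document order

    def walk(node):
        nonlocal offset
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                offset += len(child.data)
            elif child.nodeType == child.PROCESSING_INSTRUCTION_NODE:
                if child.target == target:
                    breaks.append((offset, child.data))
            else:
                walk(child)  # elements (and anything else) are descended into

    walk(minidom.parseString(xml_text))
    return breaks

if __name__ == "__main__":
    print(page_breaks(SAMPLE))
```

Because the PIs sit outside the element structure, a page break can fall
in the middle of a paragraph without disturbing the markup around it.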
3) If you find you have many of these concurrent structures,
you may opt for "point markup", which is rather extreme, and would be
an interesting challenge for some stream-based processors. In point markup, your
main text is just marked up using:
<!DOCTYPE document
[
<!ELEMENT text ( #PCDATA | point )*>
<!ELEMENT point EMPTY>
<!ATTLIST point id ID #REQUIRED >
Then you have separate element trees for each
kind of structure: these trees probably contain no character data of their own,
just IDREFs to the start and end of their range. In this way you can
represent concurrent, overlapping hierarchies in SGML. For example:
<!ELEMENT document (tree+, text)>
<!ELEMENT tree (start, tree*, end)>
<!ELEMENT ( start | end ) EMPTY >
<!ATTLIST tree name NMTOKEN #IMPLIED >
<!ATTLIST ( start | end ) refid IDREF #REQUIRED >
]>
<document>
<tree name="pages">
<start refid="x1"/>
<tree name="page1">
<start refid="x1"/>
<end refid="x4"/>
</tree>
<tree name="page2">
<start refid="x4"/>
<end refid="x5"/>
</tree>
<end refid="x5"/>
</tree>
<tree name="p">
<start refid="x1"/>
<tree name="b">
<start refid="x2"/>
<end refid="x3"/>
</tree>
<end refid="x5"/>
</tree>
<text><point id="x1"/>here is <point id="x2"/>some<point id="x3"/> data <point
id="x4"/>of no interest.<point id="x5"/></text>
</document>
This structure has the advantage of neatness, and provides a
lot of modeling power for just one extra level of indirection.
If you use HREF rather than REFID, you can use external point
markup too.
The effect, of course, is to have concurrently
<pages><page1>here is some data </page1><page2>of no interest.</page2></pages>
and
<p>here is <b>some</b> data of no interest.</p>
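To show how a processor might resolve the point markup back into these two
readings, here is a rough sketch in Python. It is not part of any SGML
tool: the helpers point_offsets(), render() and views() are invented for
illustration, the DTD is left out so that a plain XML parser accepts the
instance, and the p and b ranges are taken to be x1..x5 and x2..x3, which
is what the two readings just above require.

```python
import xml.etree.ElementTree as ET

# The example instance without its DTD (assumption: an XML parser is used).
DOC = """<document>
<tree name="pages"><start refid="x1"/>
  <tree name="page1"><start refid="x1"/><end refid="x4"/></tree>
  <tree name="page2"><start refid="x4"/><end refid="x5"/></tree>
<end refid="x5"/></tree>
<tree name="p"><start refid="x1"/>
  <tree name="b"><start refid="x2"/><end refid="x3"/></tree>
<end refid="x5"/></tree>
<text><point id="x1"/>here is <point id="x2"/>some<point id="x3"/> data <point
id="x4"/>of no interest.<point id="x5"/></text>
</document>"""

def point_offsets(text_elem):
    """Map each point id to its character offset in the flattened text."""
    offsets, chars = {}, text_elem.text or ""
    for point in text_elem:
        offsets[point.get("id")] = len(chars)
        chars += point.tail or ""
    return offsets, chars

def render(tree, offsets, chars):
    """Rebuild one hierarchy as nested tags around the shared text."""
    start = offsets[tree.find("start").get("refid")]
    end = offsets[tree.find("end").get("refid")]
    out, pos = [], start
    for child in tree.findall("tree"):
        child_start = offsets[child.find("start").get("refid")]
        out.append(chars[pos:child_start])      # text before the child range
        out.append(render(child, offsets, chars))
        pos = offsets[child.find("end").get("refid")]
    out.append(chars[pos:end])                  # text after the last child
    name = tree.get("name")
    return "<%s>%s</%s>" % (name, "".join(out), name)

def views(xml_text):
    """Return one serialized reading per top-level tree."""
    root = ET.fromstring(xml_text)
    offsets, chars = point_offsets(root.find("text"))
    return [render(t, offsets, chars) for t in root.findall("tree")]

if __name__ == "__main__":
    for view in views(DOC):
        print(view)
```

The sketch assumes that child ranges are listed in text order and nest
properly inside their parent within any one tree; different trees are
free to overlap each other, which is the whole point.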
Rick Jelliffe
Author, "The XML & SGML Cookbook", out in May
from Prentice Hall.