I've released an initial version of Ool, a set of Java SAX filters for
working with out-of-line markup in XML.
http://simonstl.com/projects/ool/
Ool can separate XML documents into files containing their markup and
their (element) text, using ool:text elements to indicate which part of
the text file goes where in the document. It can also reconstruct
documents from the separate markup and text files. (The part that is
still incomplete would optimize the representation of where the text
goes, but I've not had time...)
Ool is not (yet) a means for realizing the hypertext dreams of Ted
"Embedded Markup Considered Harmful" Nelson, but it may be a framework
for experimentation on such things.
The filter for recombining text and markup is also handy for textual
inclusions whether or not they came from the separator - effectively
it's a text-only includer which uses character locations for start and
end-points.
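
To make the recombining step concrete, here is a minimal sketch of a
text-only includer written as a SAX filter. It is not Ool's actual code.
The namespace URI, the "start" and "end" attribute names, the exclusive
end offset, and the include() helper are all assumptions made for
illustration; the real filter may represent locations differently.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class TextIncluder extends XMLFilterImpl {
    // Hypothetical namespace URI; the one Ool really uses is not stated here.
    static final String OOL_NS = "http://example.org/ool";

    private final char[] text;   // full contents of the separate text file

    TextIncluder(XMLReader parent, String text) {
        super(parent);
        this.text = text.toCharArray();
    }

    private static boolean isOolText(String uri, String local) {
        return OOL_NS.equals(uri) && "text".equals(local);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (!isOolText(uri, local)) {
            super.startElement(uri, local, qName, atts);
            return;
        }
        // Assumed attributes: character offsets into the text, end exclusive.
        int start = Integer.parseInt(atts.getValue("", "start"));
        int end = Integer.parseInt(atts.getValue("", "end"));
        if (start < 0 || start > end || end > text.length) {
            throw new SAXException("text range out of bounds: " + start + "-" + end);
        }
        // Replace the placeholder element with the characters it points at.
        super.characters(text, start, end - start);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (!isOolText(uri, local)) {
            super.endElement(uri, local, qName);
        }
    }

    // Hypothetical helper: recombine markup and text, returning a simple event trace.
    static String include(String markupXml, String text) {
        final StringBuilder out = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            TextIncluder filter =
                new TextIncluder(factory.newSAXParser().getXMLReader(), text);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String q, Attributes a) {
                    out.append('<').append(q).append('>');
                }
                @Override
                public void endElement(String u, String l, String q) {
                    out.append("</").append(q).append('>');
                }
                @Override
                public void characters(char[] ch, int s, int len) {
                    out.append(ch, s, len);
                }
            });
            filter.parse(new InputSource(new StringReader(markupXml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

Because the filter only looks at the placeholder elements and a range of
characters, the text can come from anywhere, which is what makes it usable
as a general includer and not only as the inverse of the separator.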
As a bit of a bonus, there's also a SAX filter which abolishes mixed
content through the handy expedient of wrapping the odd bits of text
(those which have element siblings) in elements conveniently named
"mix:ed". This may prove useful for constraining mixed content using
DTDs and W3C XML Schema in ways which are not presently possible. (RELAX
NG needs no such assistance.)
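
For readers who want to see the wrapping idea in code, the following is a
rough sketch of such a filter, not the one shipped with Ool. The namespace
URI bound to the "mix" prefix, the choice to leave whitespace-only text
unwrapped, and the wrap() helper are assumptions of this sketch. Text is
buffered per element and wrapped only once an element sibling is known to
exist.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class MixedContentWrapper extends XMLFilterImpl {
    // Hypothetical namespace URI for the "mix" prefix.
    static final String MIX_NS = "http://example.org/mixed";

    // One frame per open element: pending text and whether a child element was seen.
    private static final class Frame {
        final StringBuilder text = new StringBuilder();
        boolean hasChildElement;
    }

    private final Deque<Frame> stack = new ArrayDeque<>();

    MixedContentWrapper(XMLReader parent) { super(parent); }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        Frame parent = stack.peek();
        if (parent != null) {
            parent.hasChildElement = true;   // pending text now has an element sibling
            flush(parent);
        }
        stack.push(new Frame());
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        Frame current = stack.peek();
        if (current != null) current.text.append(ch, start, length);
        else super.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        flush(stack.pop());
        super.endElement(uri, local, qName);
    }

    // Emit buffered text, wrapped in mix:ed when the element also has element children.
    private void flush(Frame f) throws SAXException {
        if (f.text.length() == 0) return;
        String s = f.text.toString();
        f.text.setLength(0);
        // Assumption of this sketch: whitespace-only text is passed through unwrapped.
        boolean wrap = f.hasChildElement && !s.trim().isEmpty();
        if (wrap) {
            super.startPrefixMapping("mix", MIX_NS);
            super.startElement(MIX_NS, "ed", "mix:ed", new AttributesImpl());
        }
        super.characters(s.toCharArray(), 0, s.length());
        if (wrap) {
            super.endElement(MIX_NS, "ed", "mix:ed");
            super.endPrefixMapping("mix");
        }
    }

    // Hypothetical helper: run the filter over a string and return a simple event trace.
    static String wrap(String xml) {
        final StringBuilder out = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            MixedContentWrapper filter =
                new MixedContentWrapper(factory.newSAXParser().getXMLReader());
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String q, Attributes a) {
                    out.append('<').append(q).append('>');
                }
                @Override
                public void endElement(String u, String l, String q) {
                    out.append("</").append(q).append('>');
                }
                @Override
                public void characters(char[] ch, int s, int len) {
                    out.append(ch, s, len);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

With this sketch, wrap("<p>Hello <b>bold</b> world</p>") gives
<p><mix:ed>Hello </mix:ed><b>bold</b><mix:ed> world</mix:ed></p>, while the
text inside b is left alone because it has no element siblings.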
I do plan to work on this further, and port it to MOE (where many of the
optimizations should be much simpler), but it'll be a little while. I'm
editing too much to get time to focus on programming.
--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com