   Re: possibility of an RTF, LaTex XML conversion process (fwd)

  • From: "Don Park" <donpark@quake.net>
  • To: <xml-dev@ic.ac.uk>
  • Date: Tue, 14 Apr 1998 15:30:52 -0700

>Amen!  I am up-converting a technical book in LaTeX that has literally
>thousands of format directives, each of which must be replaced by
>a descriptor showing the author's intent.  I used Perl to do some
>automatically, but about half needed decisions by a content expert.

My recommendation would be to do a dumb translation of LaTeX into XML.  By
doing so, you defer all the critical decisions that, if made prematurely,
could cause information loss and taint.
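
To make "dumb translation" concrete, here is a minimal Python sketch.  The
element names (latex, cmd, arg, group) and the function latex_to_xml are
made up for illustration, and the tokenizer covers only commands, control
symbols, and brace groups (no comments, optional arguments, math mode, or
verbatim).  The point is that it records what the source says without
deciding what any command means.

```python
import re
import xml.etree.ElementTree as ET

# Control sequence (\name or \<symbol>), a brace, or a run of plain text.
# A lone trailing backslash is kept as text so nothing is silently dropped.
TOKEN = re.compile(r"\\([A-Za-z]+|.)|([{}])|([^\\{}]+|\\)", re.DOTALL)


def _append_text(parent, text):
    # ElementTree keeps text on the parent or as the tail of its last child.
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def latex_to_xml(source):
    """Mirror LaTeX syntax as XML without interpreting any command.

    Hypothetical vocabulary: <cmd name="..."> for a control sequence,
    <arg> for each {...} directly following it, <group> for bare braces.
    """
    root = ET.Element("latex")
    stack = [(root, None)]   # (open element, command that owns it if it is an <arg>)
    pending = None           # command that may still take another {...} argument
    for match in TOKEN.finditer(source):
        name, brace, text = match.groups()
        if name is not None:
            pending = ET.SubElement(stack[-1][0], "cmd", name=name)
        elif brace == "{":
            if pending is not None:
                stack.append((ET.SubElement(pending, "arg"), pending))
            else:
                stack.append((ET.SubElement(stack[-1][0], "group"), None))
            pending = None
        elif brace == "}":
            if len(stack) > 1:
                _, owner = stack.pop()
                pending = owner   # allows \frac{a}{b}: a second <arg> on the same <cmd>
            else:
                _append_text(root, "}")   # unbalanced brace: keep it as text
                pending = None
        else:
            _append_text(stack[-1][0], text)
            pending = None
    return root
```

For example, latex_to_xml(r"\textbf{Hello} world") serializes to
<latex><cmd name="textbf"><arg>Hello</arg></cmd> world</latex>.  Whether
\textbf marked a defined term or plain emphasis is left open for later.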

Once you have the XML-ized LaTeX document, you have a core document from
which to create more application-oriented XML documents.  For example, if
you are interested in duplicating the layout of the original LaTeX document,
you could extract the layout information and create a PGML document.  If you
are interested in an indexable XML document, you can extract the contents and
structural elements and massage them into an easily indexable format.
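
As a rough sketch of the indexable case, assume the core document uses a
hypothetical vocabulary in which every LaTeX command is a
<cmd name="..."> element with <arg> children.  The function name
to_index_doc, the output elements (index, entry), and the choice of which
commands count as structural are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed set of commands treated as structure; everything else is content.
STRUCTURAL = {"chapter", "section", "subsection"}


def to_index_doc(latex_root):
    """Build a flat <index> of <entry heading="..."> elements holding plain text."""
    index = ET.Element("index")
    entries = [ET.SubElement(index, "entry", heading="")]

    def add(text):
        if text:
            entries[-1].text = (entries[-1].text or "") + text

    def walk(node):
        add(node.text)
        for child in node:
            if child.tag == "cmd" and child.get("name") in STRUCTURAL:
                # A heading command starts a new entry; its argument is the heading.
                heading = "".join(child.itertext()).strip()
                entries.append(ET.SubElement(
                    index, "entry", heading=heading, level=child.get("name")))
            else:
                walk(child)   # formatting commands are dropped, their text is kept
            add(child.tail)

    walk(latex_root)

    for entry in entries:
        entry.text = " ".join((entry.text or "").split())   # normalize whitespace
        if not entry.text and not entry.get("heading"):
            index.remove(entry)   # drop the empty preamble entry
    return index


core = ET.fromstring(
    '<latex><cmd name="section"><arg>Intro</arg></cmd> Some '
    '<cmd name="emph"><arg>important</arg></cmd> text.</latex>'
)
index_doc = to_index_doc(core)
```

The core document is not modified, so a layout-oriented extraction can be run
against the same source later.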

At a later point, you can inject elements representing the author's intent as
well as some other content expert's interpretation (each such element should
have an attribute indicating the point of view).
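
One way this injection might look, again as a sketch: the wrapper element
intent, its attributes meaning and pov, and the function annotate are
hypothetical names, and the input is assumed to use <cmd name="..."> elements
for LaTeX commands.  The original command stays in place inside the wrapper,
so the interpretation is added rather than substituted.

```python
import xml.etree.ElementTree as ET


def annotate(root, cmd_name, meaning, pov):
    """Wrap each <cmd name=cmd_name> in <intent meaning=... pov=...>.

    Returns the number of commands wrapped.
    """
    count = 0
    # Snapshot the tree first so the wrappers added below are not revisited.
    for parent in list(root.iter()):
        for i, child in enumerate(list(parent)):
            if child.tag == "cmd" and child.get("name") == cmd_name:
                wrapper = ET.Element("intent", meaning=meaning, pov=pov)
                # The text after the command now follows the wrapper instead.
                wrapper.tail, child.tail = child.tail, None
                parent[i] = wrapper
                wrapper.append(child)
                count += 1
    return count


doc = ET.fromstring(
    '<latex>See <cmd name="textit"><arg>entropy</arg></cmd> here.</latex>'
)
wrapped = annotate(doc, "textit", "term-definition", "author")
```

A second pass with a different pov value (say, "reviewer") would nest another
intent element around the same command, which keeps the two interpretations
distinguishable.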


Don Park
