Re: Doing large scale XML processing/transformation?
- From: "Betty L. Harvey" <firstname.lastname@example.org>
- To: Martin Skøtt <email@example.com>
- Date: Sat, 25 Aug 2001 19:40:40 -0400 (EDT)
Look at Omnimark. It is designed to work with structured
information. Most organizations doing massive conversions of
information to or from SGML use Omnimark. It is an easy
language to work with, and it runs on Linux. The URL for
Omnimark is http://www.omnimark.com. Omnimark has a proven track
record in conversions.
Hope this helps.
Betty Harvey | Phone: 410-787-9200 FAX: 9830
Electronic Commerce Connection, Inc. |
firstname.lastname@example.org | Washington,DC SGML/XML Users Grp
URL: http://www.eccnet.com | http://www.eccnet.com/xmlug/
On Sun, 26 Aug 2001, Martin Skøtt wrote:
> On Sun, Aug 26, 2001 at 12:58:39AM +0200, wrote:
> > I have a large amount (120GB) of data stored in various homemade SGML formats
> > and a few non-SGML formats. I need a way of transforming these into multiple
> > XML documents conforming to different DTDs.
> > Doing these transformations takes quite some time, so I'm investigating various
> > methods of doing this while still keeping the investment at a reasonable
> > level. I have this idea of a way to solve the problem:
> > I imagine doing it in a distributed fashion along the lines of what SETI@home
> > and distributed.net are doing, but without the community bit :-) I would then
> > set up a "cluster" of cheap off-the-shelf PCs, probably running Linux, and let
> > them work through my data. Another very important thing is some method of
> > reusing code in order to keep the development time and expenses minimal. I
> > am thinking of splitting the conversion job into multiple steps and then letting
> > each step be an individual program. I think the idea of Unix-style pipes is
> > the best analogy to what I'm thinking of. This would allow me to tailor a line
> > of conversion steps suitable for the individual source and target formats.
> Oops, I missed part of the message. Here is the rest, sorry :-)
> <rest of message>
> This would allow me to tailor a line of conversion steps suitable for the
> individual source and target formats reusing the most common bits of code.
> The language is pretty much irrelevant, although I would prefer something more
> like Perl or Python than C or C++, but that's just for ease of development :-)
> Does anyone know a tool like this? I could start doing this myself, but I would
> prefer an already finished product so I could get working on the
> conversions right away.
> Most of the sites I have been looking at seem pretty keen on XSLT, but I kind
> of have this feeling that I need more than just XSLT for this job ;-)
> </rest of message>
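
The pipe analogy in the quoted message can be sketched in Python, one of the languages Martin says he would prefer. This is only a minimal illustration, not a finished tool and not a description of how Omnimark works. Each step is a generator that consumes a stream of lines and yields a transformed stream, and the steps are chained like programs in a shell pipe. The step names (strip_blank_lines, lowercase_tag_names), the make_pipeline helper, and the sample input are all hypothetical, and the regular expression only stands in for a real SGML parser.

```python
import re
from functools import reduce
from typing import Callable, Iterable, Iterator

# A step takes a stream of text lines and yields a transformed stream,
# much like one program in a Unix pipe.
Step = Callable[[Iterable[str]], Iterator[str]]


def strip_blank_lines(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical step: drop lines that contain only whitespace."""
    for line in lines:
        if line.strip():
            yield line


def lowercase_tag_names(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical step: lowercase tag names such as <TITLE> or </TITLE>.

    A regex is enough for this toy input; real SGML needs a real parser.
    """
    tag = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")
    for line in lines:
        yield tag.sub(lambda m: "<" + m.group(1) + m.group(2).lower(), line)


def make_pipeline(*steps: Step) -> Step:
    """Chain steps left to right, like `step1 | step2 | step3` in a shell."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        # Feed the output stream of each step into the next one.
        return iter(reduce(lambda stream, step: step(stream), steps, lines))
    return run


# Assemble a conversion line for one (made-up) source format.
convert = make_pipeline(strip_blank_lines, lowercase_tag_names)
source = ["<DOC>\n", "\n", "<TITLE>Report</TITLE>\n", "</DOC>\n"]
result = list(convert(source))
```

Because each step only sees a stream of lines, the same steps can be reordered or reused for different source and target formats, which is the kind of code reuse the original post asks for. Since the steps are lazy generators, a large file is processed line by line rather than loaded whole.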