- To: Betty Harvey <harvey@eccnet.com>
- Subject: Re: [xml-dev] text to XML
- From: Nicolas Toper <ntoper@jouve.fr>
- Date: Fri, 30 Jan 2004 19:04:00 +0100
- Cc: xml-dev@lists.xml.org
- In-reply-to: <Pine.LNX.4.44.0401301253590.29953-100000@eccnet.eccnet.com>
- Organization: Jouve
- References: <Pine.LNX.4.44.0401301253590.29953-100000@eccnet.eccnet.com>
- Reply-to: ntoper@jouve.fr
- User-agent: KMail/1.4.1
Well, it's more of a proof of concept than a commercial application. I used it
at my company, before it closed, to update a magazine website. We only had the
(text) PDF files.
Basically, I used some machine learning technology plus an expert system to
extract the semantics of the text and XMLize it. Then I put the XML on the
web server and the update was done, because I had some XSL behind it doing
the work for me :=)
It's written in Jython + some Java parts and works well.
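To give a rough idea of how a learned classifier can drive text-to-XML
conversion, here is a minimal sketch in plain Python. It is not the actual
tool: it swaps in a small naive Bayes classifier (with add-one smoothing) that
labels each line of text and wraps it in an element named after the label.
The labels (title, byline, para), the features, the training lines and the
<article> root element are all made up for illustration.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical hand-labelled lines; a real system would need far more data.
TRAINING = [
    ("XML Tools Roundup", "title"),
    ("Markup Trends This Year", "title"),
    ("By Jane Doe", "byline"),
    ("By John Smith", "byline"),
    ("This month we look at several tools that convert plain text "
     "into structured markup.", "para"),
    ("Many publishers still keep their archives only as text "
     "extracted from PDF files.", "para"),
]


def features(line):
    """Crude features: lowercase words plus two hints about the line's shape."""
    feats = re.findall(r"[a-z]+", line.lower())
    feats.append("SHORT" if len(line.split()) <= 6 else "LONG")
    if line.lower().startswith("by "):
        feats.append("STARTS_WITH_BY")
    return feats


class NaiveBayes:
    """Multinomial naive Bayes over line features, with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for line, label in examples:
            self.label_counts[label] += 1
            for feat in features(line):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)

    def classify(self, line):
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            # log P(label) + sum of log P(feature | label)
            score = math.log(count / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for feat in features(line):
                score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def to_xml(lines, model):
    """Wrap each line in an element named after its predicted label."""
    root = ET.Element("article")
    for line in lines:
        ET.SubElement(root, model.classify(line)).text = line
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    model = NaiveBayes()
    model.train(TRAINING)
    print(to_xml(["XML Markup Tools", "By Alice Martin"], model))
```

The classifier only decides which element each line becomes; an XSL
stylesheet can then turn the resulting XML into web pages, as described above.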
I'm now in the last month of my employment. I think I'll work on it while
I'm unemployed, but I'm also wondering whether people would be interested in
something like that.
nicolas
PS: If you want more information, please send mail to ntoper@yahoo.fr,
since this e-mail address won't be valid in a couple of days.
On Friday 30 January 2004 19:01, you wrote:
> On Fri, 30 Jan 2004, Nicolas Toper wrote:
> > Why don't you use something based on Bayesian filter technology?
>
> My spam filter (spamassassin) uses Bayesian logic for filtering e-mail but
> I am not sure how Bayesian logic can be used for conversion with free
> text. I would be interested to understand how Bayesian logic is
> used for conversion.
>
> Betty
>
> > > On Friday 30 January 2004 18:43, Betty Harvey wrote:
> > > I am surprised no one has mentioned Omnimark yet! I also use
> > > InfinityLoop to convert from Word or RTF to XML based on the styles,
> > > then XSLT for final conversion.
> > >
> > > Betty
> > >
> > > On Fri, 30 Jan 2004, Mike Fitzgerald wrote:
> > > > I've asked this question here before, but I need a refresh. I know
> > > > you can use xmlspy, xmlLinguist, and LTE.exe (xmlLinguist on the
> > > > command line) to convert text files (not CSV) to XML. Is anybody
> > > > using anything else?
> > > >
> > > > Mike