   FWD: Announcement - World Wide Web Wrapper Factory (W4F)

  • From: "John E. Simpson" <simpson@polaris.net>
  • To: xml-dev@ic.ac.uk, XML-L@listserv.heanet.ie
  • Date: Tue, 23 Mar 1999 19:48:55 -0500

I received this announcement via e-mail yesterday. It may (or may not :) be
of interest to xml-dev and xml-l subscribers. Contact information is at the
foot of the announcement.

[Disclaimer: I have no affiliation with the W4F product development group.
My correspondent, previously unknown to me, just happened on my website.
Apologies for the cross-posting to subscribers of both lists.]

>----- Looking at the Web through XML glasses, using W4F -----
>
>The World Wide Web Wrapper Factory (W4F) is a Java toolkit to
>generate wrappers for HTML data sources.
>
>Version 1.03 offers a built-in declarative mapping to XML.
>Using W4F, it is now easy to specify the translation
>of HTML pages into XML documents. Moreover, the specification
>yields the corresponding DTD for free.
>
>W4F consists of a retrieval language to identify Web sources, a
>declarative extraction language (HEL: HTML Extraction Language)
>to express robust extraction rules, and a mapping interface to
>export the extracted information into user-defined data
>structures (text, Java objects, XML, etc.).
>The wrappers are generated as Java classes that can be used as is 
>or integrated into higher-level applications.
>
>Version 1.03 provides improved visual support to make the
>creation of wrappers easier and faster. In particular,
>extraction from HTML can be done via a WYSIWYG interface.
>
>The W4F toolkit comes as a Java package and can be downloaded from 
>the W4F web site. It is free for non-commercial use.
>Various examples of running wrappers are also available for download
>from the web site.
>
>Web site:
>http://db.cis.upenn.edu/W4F
>
>Contacts:
>Arnaud Sahuguet
>Database Research Group, Univ. of Pennsylvania, PA, USA
>sahuguet@gradient.cis.upenn.edu
>http://www.cis.upenn.edu/~sahuguet
>
>Fabien Azavant
>École Nationale Supérieure des Télécommunications, Paris, France
>Fabien.Azavant@enst.fr
>http://www.stud.enst.fr/~azavant

==========================================================
John E. Simpson            | The secret of eternal youth
simpson@polaris.net        | is arrested development.
http://www.flixml.org      |  -- Alice Roosevelt Longworth

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)





 

