Publications
Authors:
  • Boris Chidlovskii, Jon Ragetli, Maarten de Rijke
Citation:
11th European Conference on Machine Learning, Barcelona, Spain, May 2000
Abstract:
To facilitate effective search on the World Wide Web, several so-called 'meta search engines' have been
developed which do not search the Web themselves, but use available search engines to find the required
information. By means of wrappers, meta search engines retrieve relevant information from the HTML pages
returned by search engines. In this paper we present an approach to create such wrappers automatically by
means of an incremental grammar induction algorithm. The algorithm is based on the notion of "string edit
distance". Our method performs well; it is quick, it can be used for several types of pages and it requires a
minimal amount of interaction with the user.
Year:
2000
Report number:
2000/202