Similarity thesauri

This code was used to produce the examples for my talk "Similarity thesauri and cross-language retrieval". More information about this talk is on my studies page; a handout is also available.

The example code was written in Python; it requires Python version 2.2 or higher.

This file contains the following classes:

This class provides the link between items and features.
This is the superclass for all objects that can be decomposed into tokens.
The simplest kind of document, consisting of just a string.
This class is derived from UserDict and implements the subsumption order on feature structures as operators <= and >=.
Documents can have additional properties, for example their language. Furthermore, they can be composed of other documents.
This is the common superclass of Item and Feature. The constructor takes a Properties object as its argument and either returns a previously constructed object with the same properties or constructs a new one.
This class was derived from IndexComp without any changes.
This class was derived from IndexComp without any changes.
This class provides the basic functions for IR systems, for example weighting methods and storage of items and features.
This class implements the construction of a similarity thesaurus as described in the handout.
A class implementing a cross-language similarity thesaurus. This changes only the output functions.
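
Two of the descriptions above lend themselves to a short illustration: the subsumption order on feature structures, and a constructor that hands back a previously constructed object when the properties are the same. The following is a minimal sketch in present-day Python 3, not the code from this page. The names FeatureStructure and Interned are hypothetical, and Item and Feature are empty stand-ins. Feature structures are assumed to be flat with hashable values, and `a <= b` is assumed to mean that every attribute-value pair of a also occurs in b.

```python
from collections import UserDict


class FeatureStructure(UserDict):
    """Hypothetical flat feature structure: a mapping from attributes to values."""

    def __le__(self, other):
        # Assumed meaning: self subsumes other, i.e. every
        # attribute-value pair of self also occurs in other.
        return all(key in other and other[key] == value
                   for key, value in self.items())

    def __ge__(self, other):
        # self is subsumed by other.
        return other <= self

    def key(self):
        # Hashable summary, used to find previously constructed objects.
        return frozenset(self.items())


class Interned:
    """Hypothetical base class: one instance per class and set of properties."""

    _instances = {}

    def __new__(cls, properties):
        key = (cls, properties.key())
        if key not in cls._instances:
            obj = super().__new__(cls)
            obj.properties = properties
            cls._instances[key] = obj
        return cls._instances[key]


class Item(Interned):
    pass


class Feature(Interned):
    pass


german = FeatureStructure(lang="de")
german_title = FeatureStructure(lang="de", part="title")

print(german <= german_title)    # the more general structure subsumes the more specific one
print(Item(german) is Item(FeatureStructure(lang="de")))    # equal properties, same object
```

Keeping the cache key as a pair of class and properties means that an Item and a Feature built from the same properties remain distinct objects.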

Most classes contain methods .asTeX and .asMP that produce TeX and MetaPost snippets describing the object.

This file contains the documents used to construct the examples in the handout.

This file constructs two similarity thesauri from those documents and writes the corresponding TeX and MetaPost snippets to files.
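
The general idea behind constructing a similarity thesaurus can be sketched as follows. This is not the code from this page, and not necessarily the construction described in the handout. It assumes the common formulation in which the usual roles are swapped, so that terms are the items and the documents they occur in are their features. It also assumes plain term frequencies with cosine normalization as the weighting. The names `build_thesaurus` and `docs`, and the toy documents, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_thesaurus(docs):
    """Return {term: {other_term: similarity}} for a dict {doc_id: text}."""
    # Represent each term by a vector over the documents it occurs in.
    vectors = defaultdict(Counter)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            vectors[token][doc_id] += 1

    # Normalize every term vector to unit length (assumed weighting).
    unit = {}
    for term, vec in vectors.items():
        norm = math.sqrt(sum(w * w for w in vec.values()))
        unit[term] = {d: w / norm for d, w in vec.items()}

    # The similarity of two terms is the dot product of their unit vectors.
    thesaurus = {}
    for a, va in unit.items():
        thesaurus[a] = {
            b: sum(w * vb.get(d, 0.0) for d, w in va.items())
            for b, vb in unit.items() if b != a
        }
    return thesaurus


# Toy documents, each mixing an English and a German word for the same thing.
docs = {
    "d1": "cat Katze",
    "d2": "dog Hund",
    "d3": "cat dog Katze Hund",
}
sims = build_thesaurus(docs)

# List the terms most similar to "cat", best first.
for term, score in sorted(sims["cat"].items(), key=lambda p: -p[1]):
    print(term, round(score, 3))
```

In this toy collection "cat" and "katze" occur in exactly the same documents, so "katze" scores higher against "cat" than "hund" does. A cross-language variant would presumably restrict the output to terms of the other language.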

Copyright © 1999--2004 Sebastian Marius Kirsch, all rights reserved.
Id: index.wml,v 1.3 2004/05/26 10:05:29 skirsch Exp