In this project, you need to implement a simple
Information Retrieval system able to index a collection of documents, perform
queries over it, and generate an output in the form of a ranked list of
documents. As part of this project, you will also conduct a first evaluation of
this system in order to assess its performance.
More specifically, the project will include the
implementation of the following components:
Indexing.
Search and ranking. You need to implement three different retrieval models: Vector Space Model, BM25, and one Language Model of your choice.
Evaluation. You will build a pipeline for evaluating your system and compare the three models that you have implemented. To generate the results of your experimental evaluation, you should use the TREC evaluation script (trec_eval-9.0.7.tar.gz), which you can find here. A description of the metrics computed by this script can be found here. Please follow the instructions in the accompanying file to compile and run the script.
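As a rough illustration of how the indexing and ranking components fit together, the following sketch builds an in-memory inverted index and ranks documents with BM25. It is not a reference solution: the function names (tokenize, build_index, bm25_rank) are hypothetical, the tokenizer is a naive whitespace split, the parameter values k1=1.2 and b=0.75 are commonly used defaults assumed here, and the IDF formula is one of several variants in use.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    # A fuller system would also strip punctuation, drop stop words, and stem.
    return text.lower().split()


def build_index(docs):
    """Build an inverted index {term: {doc_id: term_frequency}} plus document lengths."""
    index = defaultdict(dict)
    doc_len = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, doc_len


def bm25_rank(query, index, doc_len, k1=1.2, b=0.75):
    """Return [(doc_id, score), ...] sorted by descending BM25 score."""
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        df = len(postings)
        # IDF variant with +1 inside the log, which keeps the weight positive.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            length_norm = 1 - b + b * doc_len[doc_id] / avg_len
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy documents for illustration only; they are not taken from the collection.
    toy_docs = {
        "d1": "wing flow pressure",
        "d2": "heat transfer in boundary layer flow",
        "d3": "supersonic wing design",
    }
    toy_index, toy_len = build_index(toy_docs)
    for doc_id, score in bm25_rank("wing pressure", toy_index, toy_len):
        print(doc_id, round(score, 4))
```

The same index can serve the Vector Space Model and the Language Model, since both need only term frequencies, document frequencies, and document lengths; only the scoring function changes.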
In this project, you will be working with the Cranfield collection, a small collection of 1,400 abstracts and 226 queries in the aviation domain.
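To evaluate rankings for these queries with trec_eval, each model's results have to be written as a run file with one line per retrieved document, in the form "query_id Q0 doc_id rank score run_tag". The sketch below shows one possible way to produce such lines; the function name format_run, the run tag, and the shape of the results dictionary are assumptions made for illustration.

```python
def format_run(results, run_tag="bm25"):
    """Format rankings as TREC run-file text.

    results: {query_id: [(doc_id, score), ...]}, each list already sorted
    by descending score (hypothetical structure chosen for this sketch).
    """
    lines = []
    for query_id, ranking in results.items():
        for rank, (doc_id, score) in enumerate(ranking, start=1):
            # "Q0" is a fixed literal required by the run-file format.
            lines.append(f"{query_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    example = {"1": [("184", 2.5), ("29", 1.25)]}
    print(format_run(example, run_tag="vsm"), end="")
```

Writing the returned text to a file gives something trec_eval can read when invoked with the relevance judgments file as its first argument and the run file as its second. Using a different run tag per model makes it easier to keep the three result sets apart.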
The file contains
the collection with the following files: