Description

Project 1

In this project, you need to implement a simple Information Retrieval system that can index a collection of documents, run queries over it, and produce a ranked list of documents as output. As part of this project, you will also conduct a first evaluation of this system to assess its performance.

More specifically, the project will include the implementation of the following components:

 

Indexing.

Search and ranking. You need to implement three different retrieval models: Vector Space Model, BM25, and one Language Model of your choice.

Evaluation. You will build a pipeline for evaluating your system and compare the three models that you have implemented. To generate the results of your experimental evaluation, you should use the TREC evaluation script (trec_eval-9.0.7.tar.gz) that you can find here. A description of the metrics computed by this script can be found here. Please follow the instructions in the accompanying file to compile and run the script.
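
As a rough illustration of how the three components fit together, the sketch below builds an in-memory inverted index, ranks documents with BM25, and formats the ranking in the six-column results format that trec_eval reads (query id, the literal `Q0`, document id, rank, score, run tag). It is a minimal sketch under stated assumptions, not a reference solution: the function names, the regex tokenizer, the toy documents, and the parameter values `k1=1.2` and `b=0.75` are all illustrative choices that the project brief does not prescribe.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens, no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index from {doc_id: text}.

    Returns (postings, doc_len), where postings maps
    term -> {doc_id: term frequency} and doc_len maps doc_id -> token count.
    """
    postings = defaultdict(dict)
    doc_len = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            postings[term][doc_id] = tf
    return postings, doc_len


def bm25(query, postings, doc_len, k1=1.2, b=0.75):
    """Score documents with BM25 and return [(doc_id, score)], best first."""
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        plist = postings.get(term, {})
        if not plist:
            continue  # query term does not occur in the collection
        df = len(plist)
        # Smoothed IDF variant that is always positive.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in plist.items():
            norm = k1 * (1 - b + b * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    # Sort by descending score; break ties by doc_id for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def trec_run_lines(query_id, ranking, tag="bm25"):
    # trec_eval results format: query_id Q0 doc_id rank score run_tag
    return [
        f"{query_id} Q0 {doc_id} {rank} {score:.4f} {tag}"
        for rank, (doc_id, score) in enumerate(ranking, start=1)
    ]


# Toy documents, made up for illustration only.
toy_docs = {
    "1": "wing flutter in supersonic flow",
    "2": "heat transfer in boundary layer",
    "3": "supersonic boundary layer flow over a wing",
}
postings, doc_len = build_index(toy_docs)
ranking = bm25("supersonic wing", postings, doc_len)
run_lines = trec_run_lines("7", ranking)
```

The same `postings` and `doc_len` structures can feed the Vector Space Model and the chosen Language Model, so only the scoring function needs to change between runs. Writing one results file per model lets the evaluation script be run on each of them against the same relevance judgments.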

Data




 

In this project, you will be working with the Cranfield collection, a small collection of 1,400 abstracts and 226 queries on the aviation domain.
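
Reading the collection is the first step of indexing. The sketch below assumes the classic plain-text Cranfield layout, in which each record starts with `.I <id>` and the title and abstract follow the `.T` and `.W` markers; the copy distributed for this project may be laid out differently, so treat the markers, the function name `parse_cranfield`, and the sample text (made up for illustration) as assumptions to check against the actual files.

```python
def parse_cranfield(raw):
    """Split Cranfield-style text into {doc_id: title + abstract}.

    Assumes records of the form:
        .I <id>
        .T  (title lines)
        .A  (author lines, ignored)
        .B  (bibliography lines, ignored)
        .W  (abstract lines)
    """
    docs = {}
    doc_id, field, parts = None, None, []
    for line in raw.splitlines():
        if line.startswith(".I "):
            # A new record begins: store the previous one, if any.
            if doc_id is not None:
                docs[doc_id] = " ".join(parts)
            doc_id, field, parts = line[3:].strip(), None, []
        elif line.strip() in (".T", ".A", ".B", ".W"):
            field = line.strip()
        elif field in (".T", ".W") and line.strip():
            # Keep only title and abstract text.
            parts.append(line.strip())
    if doc_id is not None:
        docs[doc_id] = " ".join(parts)
    return docs


# Made-up sample in the assumed layout, for illustration only.
sample = """.I 1
.T
flow over a wing
.A
doe, j.
.B
example journal, 1958
.W
a study of supersonic flow
over a thin wing .
.I 2
.T
heat transfer
.A
roe, r.
.B
example report, 1960
.W
boundary layer heating .
"""
parsed = parse_cranfield(sample)
```

The resulting dictionary has the `{doc_id: text}` shape that an indexing routine can consume directly; a query file in the same layout could be read with the same function.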

The file contains the collection with the following files:

 

