Description

1 Overview 

Welcome to the Storm machine practice. The final goal of this assignment is to build a system that finds the top N words in a given corpus. We will build the assignment step by step on top of the system in the tutorial.


Exercise A - Build a Storm topology to count words from a spout that randomly generates sentences 


Exercise B - Modify the topology from Exercise A to read its input sentences from a file


Exercise C - Add a normalizer to the topology from Exercise B that converts input words to lowercase and removes common English words


Exercise D - Obtain and store the top N words, ranked by occurrence frequency, from the topology in Exercise C
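
The word-level logic behind the exercises above (splitting, normalizing, counting, and ranking) can be sketched in plain Java before it is wired into spouts and bolts. This is only an illustration under stated assumptions: it does not use Storm's API, the class and method names (TopWordsSketch, split, normalize, countWords, topN) are hypothetical, the stop-word list is a small made-up sample rather than the assignment's list, and breaking ties alphabetically is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java sketch of the per-word logic; every name here is hypothetical.
class TopWordsSketch {

    // Assumed sample of common English words; the real list is not given here.
    static final Set<String> COMMON_WORDS =
            Set.of("the", "a", "an", "and", "of", "to", "in");

    // Exercise A: split a sentence into words (assumes non-letters are delimiters).
    static List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        for (String w : sentence.split("[^A-Za-z]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Exercise C: lowercase a word; return null if it is a common word.
    static String normalize(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        return COMMON_WORDS.contains(lower) ? null : lower;
    }

    // Exercises A and C: keep a running count per normalized word.
    static Map<String, Integer> countWords(List<String> sentences) {
        Map<String, Integer> counts = new HashMap<>();
        for (String sentence : sentences) {
            for (String raw : split(sentence)) {
                String word = normalize(raw);
                if (word != null) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Exercise D: the n most frequent words; ties are broken alphabetically.
    static List<String> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
                "The cow jumped over the moon",
                "The MOON and the cow");
        System.out.println(topN(countWords(sentences), 2)); // prints [cow, moon]
    }
}
```

In a topology, each of these methods would roughly correspond to the work done by one bolt, with the input sentences supplied by the spout (random sentences in Exercise A, lines of a file in Exercise B).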

   

