This is a pattern-recognition task: take the full text of each news article (see News1.csv, column E, ‘content’) and have the Features Extraction Programme (Python) recognise and keep only the important key points.

Description

This is a pattern-recognition task: take the full text of each news article (see News1.csv, column E, ‘content’) and have the Features Extraction Programme (Python) recognise and keep only the important key points. My code (main.py) currently extracts only the following:

1) Criminal case, for example theft, homicide, gambling or smuggling (listed in “crime word dictionary.txt”).
2) Location (listed in location.txt).
3) Date: the exact date the crime happened, according to the news article.
4) Gender, usually for Malay names. A name containing ‘Bin’ means the suspect is male; a name containing ‘Bte’ means the suspect is female:
Kasim bin Omar = Male
Ahmad Razali bin Ismail = Male
Siti Noraliza bte Haji Abdullah = Female
5) Fine imposed on the accused, for example “$ 500 or one month jail” or “$ 20 000 Brunei Dollar and 12 months jail”.
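To make the gender, fine and crime-word rules above concrete, here is a minimal sketch using only the standard library. It is not the code in main.py, which is not shown; the function names (`infer_gender`, `extract_fines`, `find_crimes`) are hypothetical, and the crime words are passed in as a list because the format of “crime word dictionary.txt” is not given.

```python
import re

# Rule from the description: 'bin' marks a male name, 'bte' a female name.
GENDER_MARKERS = {"bin": "Male", "bte": "Female"}

# Digits after a dollar sign, allowing space- or comma-grouped thousands
# such as "$ 20 000" (an assumption based on the examples given).
FINE_RE = re.compile(r"\$\s*(\d{1,3}(?:[ ,]\d{3})+|\d+)")


def infer_gender(name):
    """Return 'Male', 'Female', or None if the name has no marker."""
    for token in name.lower().split():
        if token in GENDER_MARKERS:
            return GENDER_MARKERS[token]
    return None


def extract_fines(text):
    """Return every dollar amount found in the text as an integer."""
    return [int(re.sub(r"[ ,]", "", m)) for m in FINE_RE.findall(text)]


def find_crimes(text, crime_words):
    """Return the crime words that occur as whole words in the text."""
    lowered = text.lower()
    return [w for w in crime_words
            if re.search(r"\b" + re.escape(w.lower()) + r"\b", lowered)]
```

For example, `infer_gender("Siti Noraliza bte Haji Abdullah")` gives `"Female"` and `extract_fines("$ 20 000 Brunei Dollar and 12 months jail")` gives `[20000]`. Jail terms and dates would need their own patterns.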

1) As you can see, each article contains the word “Homicide”, but it is used with a different meaning in each. Article A only describes homicide. Article B only mentions it. Article C reports a real case in which a person committed homicide. I need to extract only the homicide case in Article C, where the crime really happened, and not Article A or Article B. For this you must use the file “News1.csv”, column E, ‘Content’. I need a pattern-recognition method that can be applied to all the content in News1.csv so that, if for example feature a1 occurs, only the articles in which the crime really happened are extracted. In other words, the features extraction programme (Python) needs to understand the situation in each article: if it reports a crime that was actually committed, extract its content; if it only mentions the crime and no crime was committed, do not extract it. This should work for all crime types (theft, smuggling, etc.). IMPORTANT: Using NLTK in the code would be fine.
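One simple way to approximate the "really happened" test is a rule: keep an article only if some sentence contains both the crime word and a word that signals an actual case. The sketch below assumes this rule and is a starting point, not a full solution. The cue list `INCIDENT_CUES` and the function names are hypothetical, and the sample sentences in the usage note are made up, not taken from News1.csv. It splits sentences with a regular expression so that it runs without extra downloads; NLTK's `sent_tokenize` could be used for that step instead.

```python
import re

# Hypothetical cue words suggesting a crime was actually committed or
# prosecuted; tune this list against the real articles.
INCIDENT_CUES = {"arrested", "charged", "convicted", "sentenced",
                 "jailed", "fined", "pleaded"}


def split_sentences(text):
    """Naive splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def reports_incident(text, crime_word):
    """True if one sentence holds both the crime word and an incident cue."""
    crime = crime_word.lower()
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if crime in words and words & INCIDENT_CUES:
            return True
    return False


def filter_rows(rows, crime_word, column="content"):
    """Keep the rows (dicts) whose text column reports a real incident.

    The column name is an assumption; match it to the CSV header.
    """
    return [row for row in rows if reports_incident(row[column], crime_word)]
```

With this rule, "Homicide is the killing of one person by another." and "The minister said homicide rates fell last year." are rejected, while "A man was charged with homicide on Monday." is kept. Rows read with `csv.DictReader` can be passed straight to `filter_rows`. A rule like this will miss cases phrased without any cue word, so it should be checked against the real articles.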

