1.0 OBJECTIVES OF THIS COURSEWORK

 

This assignment will help you to explore and analyse a set of data and reconstruct it into meaningful representations for decision making.

 

 

2.0 INDIVIDUAL ASSIGNMENT DESCRIPTION

 

A DATA ANALYSIS PROJECT USING HOURLY WEATHER DATA

 

This assignment requires you to explore an hourly weather data set and categorise it using different techniques, so that you can retrieve the information needed to support decision making. Your analysis should be deep and detailed, and it must go further than what has already been covered in this course.

 

You have to import the data, carry out the necessary pre-processing, and use the appropriate commands to convert the dataset into the desired format. You have to apply data visualization, exploration, and manipulation techniques in your project. It is very important to explain and justify the techniques that you choose. Outline the findings, analyse them, and justify them with appropriate graphs. A supporting document is also needed that presents the graphs and the code, using R programming concepts. Additional features must explore further concepts that can improve how effectively information is retrieved from the data.
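The steps named above (import, pre-processing, exploration, manipulation, visualization) could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the rows in `csv_text` are made up to match a subset of the column names in the brief, and in the real project the data would be read from the provided file instead. The choice of checks and of a boxplot is one possibility, not a required approach.

```r
# Made-up sample rows (an assumption for illustration only); they use a
# subset of the column names listed in the dataset description.
csv_text <- "origin,year,month,day,hour,temp,dewp,humid,wind_speed,precip
JFK,2013,1,1,0,39.02,26.06,59.37,10.36,0
JFK,2013,1,1,1,39.02,26.96,61.63,8.06,0
JFK,2013,1,1,1,39.02,26.96,61.63,8.06,0
LGA,2013,1,1,0,39.92,26.06,57.33,13.81,0
LGA,2013,1,1,1,NA,26.06,57.33,11.51,0"

# Import: read.csv() also accepts a file path in place of `text =`
weather <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Pre-processing: drop exact duplicate rows, then store origin as a factor
weather <- weather[!duplicated(weather), ]
weather$origin <- factor(weather$origin)

# Exploration: count missing values in each column
missing_per_column <- colSums(is.na(weather))

# Manipulation: add a Celsius column derived from the Fahrenheit values
weather$temp_c <- (weather$temp - 32) * 5 / 9

# Manipulation: mean temperature per station
# (the formula interface of aggregate() omits rows with missing values)
mean_temp <- aggregate(temp ~ origin, data = weather, FUN = mean)

# Visualization: distribution of hourly temperature for each station
boxplot(temp ~ origin, data = weather,
        xlab = "Weather station", ylab = "Temperature (F)",
        main = "Hourly temperature by station")
```

Each step would still need a written justification in the report, for example why duplicates are removed before any summary statistics are computed.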

 

The dataset provided for this assignment contains hourly meteorological data for LaGuardia Airport (LGA) and John F. Kennedy International Airport (JFK) in the United States. It has 15 columns and 17,412 rows. The columns and their descriptions are given in Table 1.


Table 1. Dataset columns description.

Column(s)                          Description

origin                             Weather station.
year, month, day, hour             Time of recording.
temp, dewp                         Temperature and dewpoint in F.
humid                              Relative humidity.
wind_dir, wind_speed, wind_gust    Wind direction (in degrees), speed and gust speed (in mph).
precip                             Precipitation, in inches.
pressure                           Sea level pressure in millibars.
visib                              Visibility in miles.
time_hour                          Date and hour of the recording as a POSIXct date.
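Table 1 lists the recording time both as separate columns (year, month, day, hour) and as a single POSIXct value (time_hour). As a minimal sketch of how the two relate, the snippet below rebuilds a POSIXct timestamp from the separate columns using base R's `ISOdatetime()`. The two rows and the `"UTC"` time zone are assumptions made for illustration; the brief does not state which time zone the recordings use.

```r
# Two made-up recordings (assumption), using the time columns from Table 1
records <- data.frame(
  year  = c(2013, 2013),
  month = c(1, 1),
  day   = c(1, 1),
  hour  = c(0, 13)
)

# Combine the separate columns into one POSIXct timestamp.
# tz = "UTC" is an assumed time zone, not one given in the brief.
records$time_hour <- ISOdatetime(records$year, records$month, records$day,
                                 records$hour, min = 0, sec = 0, tz = "UTC")

# Date-times support arithmetic, e.g. the gap between the two recordings
hours_apart <- as.numeric(difftime(records$time_hour[2],
                                   records$time_hour[1],
                                   units = "hours"))
```

Working with a single timestamp column tends to make time-based grouping and plotting simpler than handling four separate integer columns.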

 

 

 

3.0 GENERAL REQUIREMENTS

·         The submitted program should run to completion without errors.

·         Each entry from users should be validated to avoid logical errors.

·         No duplication is allowed in the dataset.

·         To score a Pass: you need to include 7 analysis examples covering data visualization, exploration, and manipulation topics. Please refer to the Marking Scheme for more details.

·         To score a Credit: you need to include 11 analysis examples covering data visualization, exploration, and manipulation. In addition, include at least 1 additional feature, beyond what is covered in the course, that can improve the results. Please refer to the Marking Scheme for more details.

·         To score a Distinction: you need to include at least 14 analysis examples covering data visualization, exploration, and manipulation. In addition, include at least 2 additional features, beyond what is covered in the course, that can improve the results. Please refer to the Marking Scheme for more details.

·         You should follow good programming practices such as comments, variable naming conventions, and indentation.

·         You are required to use the R programming language.

·         In a situation where a student:

