1.0 OBJECTIVES OF THIS COURSEWORK
This assignment will help you explore and analyse a set of data and reconstruct it into meaningful representations for decision making.
2.0 INDIVIDUAL ASSIGNMENT DESCRIPTION
A DATA ANALYSIS PROJECT USING HOURLY WEATHER DATA
In this assignment you will explore an hourly weather dataset and categorize it using different techniques, so that you can retrieve the information needed to support decision making. Your analysis should be deep and detailed, and it must go further than what has already been covered in this course.
You have to import the data, carry out the necessary pre-processing on the dataset, and use the appropriate commands to convert it into the desired format. You have to apply data visualization, exploration, and manipulation techniques in your project. It is very important to explain and justify the techniques that you choose. Outline the findings, analyse them, and justify them with appropriate graphs. A supporting document is also needed that presents the graphs and the code, using R programming concepts. Additional features must explore further concepts that can improve retrieval effects.
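As a rough illustration of what a single analysis example might look like, the sketch below combines one manipulation step, one exploration step, and one visualization step in base R. It is not a model answer. The data frame `weather` and its values are made up for illustration; only the column names `origin`, `month`, and `temp` follow Table 1. The derived column `temp_c` and the object `monthly_mean` are hypothetical names.

```r
# Toy stand-in for the assignment dataset. The values are made up for
# illustration; only the column names follow the brief.
weather <- data.frame(
  origin = c("JFK", "JFK", "LGA", "LGA", "LGA"),
  month  = c(1, 1, 1, 7, 7),
  temp   = c(39.0, 41.0, 38.0, 84.0, 86.0),  # degrees Fahrenheit
  stringsAsFactors = FALSE
)

# Manipulation: derive a Celsius column from the Fahrenheit readings.
weather$temp_c <- (weather$temp - 32) * 5 / 9

# Exploration: mean temperature per station and month.
monthly_mean <- aggregate(temp ~ origin + month, data = weather, FUN = mean)
print(monthly_mean)

# Visualization: compare the temperature distributions of the two stations.
boxplot(temp ~ origin, data = weather,
        main = "Hourly temperature by station (toy data)",
        xlab = "Weather station", ylab = "Temperature (F)")
```

In a real submission, each example of this kind would be accompanied by a short justification of why the technique was chosen and what the resulting graph shows.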
The dataset provided for this assignment contains hourly meteorological data for LaGuardia Airport (LGA) and John F. Kennedy International Airport (JFK) in the United States. It contains 15 columns and 17,412 rows. The columns and their descriptions are given in the table below.
Table 1. Dataset columns description.
Column(s) | Description
origin | Weather station.
year, month, day, hour | Time of recording.
temp, dewp | Temperature and dewpoint in °F.
humid | Relative humidity.
wind_dir, wind_speed, wind_gust | Wind direction (in degrees), speed and gust speed (in mph).
precip | Precipitation, in inches.
pressure | Sea level pressure in millibars.
visib | Visibility in miles.
time_hour | Date and hour of the recording as a POSIXct date.
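The pre-processing described earlier could begin along the lines of the following minimal sketch, which uses the column names from Table 1. The rows in `raw` are made-up values for illustration, the time zone passed to `ISOdatetime` is an assumption (the brief does not state one), and the names `clean`, `valid`, and `na_counts` are hypothetical.

```r
# Toy rows with made-up values; column names follow Table 1.
raw <- data.frame(
  origin = c("JFK", "JFK", "LGA", "LGA"),
  year   = c(2013, 2013, 2013, 2013),
  month  = c(1, 1, 1, 1),
  day    = c(1, 1, 1, 1),
  hour   = c(0, 0, 0, 1),
  temp   = c(37.0, 37.0, 39.9, 39.0),
  humid  = c(54.0, 54.0, 57.3, 59.4),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows (row 2 repeats row 1 in this toy data).
clean <- raw[!duplicated(raw), ]

# Rebuild the timestamp from its parts as a POSIXct value.
# The time zone is an assumption; the brief does not specify one.
clean$time_hour <- ISOdatetime(clean$year, clean$month, clean$day,
                               clean$hour, 0, 0, tz = "America/New_York")

# Basic validation: flag rows whose hour or humidity is out of range.
clean$valid <- clean$hour >= 0 & clean$hour <= 23 &
  clean$humid >= 0 & clean$humid <= 100

# Count missing values per column before any further analysis.
na_counts <- colSums(is.na(clean))
print(clean)
print(na_counts)
```

Checks like these address the duplication and validation requirements at the level of the data itself; which ranges count as plausible for each column is a judgment that should be justified in the supporting document.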
3.0 GENERAL REQUIREMENTS
· The submitted program should compile and execute without errors.
· Every entry from users should be validated to avoid logical errors.
· No duplication is allowed in the dataset.
· To score Pass: you need to include 7 analysis examples covering data visualization, exploration, and manipulation topics. Please refer to the Marking Scheme for more details.
· To score Credit: you need to include 11 analysis examples covering data visualization, exploration, and manipulation. In addition, include at least 1 additional feature, beyond what is covered in the course, that can improve the results. Please refer to the Marking Scheme for more details.
· To score Distinction: you need to include at least 14 analysis examples covering data visualization, exploration, and manipulation. In addition, include at least 2 additional features, beyond what is covered in the course, that can improve the results. Please refer to the Marking Scheme for more details.
· You should follow good programming practices such as comments, variable naming conventions, and indentation.
· You are required to use the R programming language.
· In a situation where a student: