This assignment assesses your skills in data exploration
and classification of a simple dataset using Matlab.
You will choose a dataset, describe its characteristics with
the help of diagrams, and build a classification model.
You will need to submit the written part (report from Tasks 2 and 4)
and the practical part (models from Task 3 as .mat files).
Task 1
Choose one dataset from the UCI Machine Learning repository (link) and
check its suitability with your TA by email. Once your TA has confirmed it,
familiarise yourself with the dataset. [5 marks]
Task 2
Describe your chosen
dataset (400-500 words). [25 marks] What is the data about? How many features
(attributes), instances, and classes does it have and what data types are
these? What are the maximum, minimum, and average values of the continuous
numerical features? Using Matlab’s plotting functions, illustrate the features
of your dataset using meaningful boxplots, histograms and grouped scatter plots
(remember, these plots allow you to analyse the individual distribution of
features, as well as the relationship between them). Explain what you can learn
about the dataset from the diagrams.
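The summary statistics and plots asked for in Task 2 could be produced along the following lines. This is only a sketch: it uses Fisher's iris data (the `fisheriris` sample shipped with the Statistics and Machine Learning Toolbox) as a stand-in for your own dataset, and the feature names and the choice of which features to plot are assumptions made for illustration.

```matlab
% Stand-in data: replace with your own confirmed dataset.
load fisheriris            % provides meas (150x4 double) and species (150x1 cell)
featureNames = {'SepalLength','SepalWidth','PetalLength','PetalWidth'};

% Basic characteristics: number of instances, features and classes
[nInstances, nFeatures] = size(meas);
classNames = unique(species);
nClasses = numel(classNames);
fprintf('%d instances, %d features, %d classes\n', ...
    nInstances, nFeatures, nClasses);

% Minimum, maximum and mean of each continuous numerical feature
stats = table(min(meas)', max(meas)', mean(meas)', ...
    'VariableNames', {'Min','Max','Mean'}, 'RowNames', featureNames);
disp(stats)

% Boxplot: distribution of one feature, split by class
figure; boxplot(meas(:,3), species);
xlabel('Class'); ylabel(featureNames{3});

% Histogram: distribution of a single feature (15 bins is an arbitrary choice)
figure; histogram(meas(:,1), 15);
xlabel(featureNames{1}); ylabel('Count');

% Grouped scatter plot: relationship between two features, coloured by class
figure; gscatter(meas(:,3), meas(:,4), species);
xlabel(featureNames{3}); ylabel(featureNames{4});
```

Labelling every axis, as above, also helps with the marks given for readability of figures.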
Task 3
Using Matlab, build three different classification models
for your dataset and evaluate them. [40 marks] For this task, you will build a
Decision Tree, a Naïve Bayes model, and a k-Nearest-Neighbour model using the
relevant Matlab functions. Use a 60-40 percent split to train and test the
performance of the models. Also save the models as .mat files and submit them
through Blackboard. Use the diary function to save all Matlab commands that you
used for this task and attach this to the end of your report.
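One possible workflow for Task 3 is sketched below, again using the `fisheriris` sample data as a stand-in. The output file names, the random seed, and the value k = 5 are assumptions chosen for illustration, not requirements of the assignment.

```matlab
diary('task3_commands.txt');    % hypothetical file name; starts logging commands

load fisheriris                 % stand-in data: meas (features), species (labels)
rng(1);                         % fix the seed so the split is repeatable

% 60-40 holdout split: 0.4 is the fraction held out for testing
cv = cvpartition(species, 'HoldOut', 0.4);
XTrain = meas(training(cv), :);  yTrain = species(training(cv));
XTest  = meas(test(cv), :);      yTest  = species(test(cv));

% Train the three classifiers on the training portion only
treeModel = fitctree(XTrain, yTrain);
nbModel   = fitcnb(XTrain, yTrain);
knnModel  = fitcknn(XTrain, yTrain, 'NumNeighbors', 5);  % k = 5 is arbitrary

% Report test accuracy for each model
models = {treeModel, nbModel, knnModel};
names  = {'Decision Tree', 'Naive Bayes', 'kNN'};
for i = 1:numel(models)
    yPred = predict(models{i}, XTest);
    acc = mean(strcmp(yPred, yTest));   % labels are cell arrays of char
    fprintf('%s test accuracy: %.3f\n', names{i}, acc);
end

% Save each trained model to its own .mat file (hypothetical file names)
save('treeModel.mat', 'treeModel');
save('nbModel.mat', 'nbModel');
save('knnModel.mat', 'knnModel');

diary off                       % stop logging
```

If your labels are numeric or categorical rather than a cell array of character vectors, compare predictions with `==` instead of `strcmp`.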
Task 4
Describe and analyse your classification results (300-400
words). [25 marks] Which models performed better, which performed worse,
and why? To answer these questions, you need to evaluate the
performance of your models. You are required to compute the corresponding
confusion matrices and their associated recall, precision and accuracy metrics
(refer to the lecture slides and see here for more info).
There will be 5 marks for the presentation of the assignment, including
spelling and grammar, layout and formatting, and readability of figures.
Good luck!
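The metrics named in Task 4 can be derived from a confusion matrix as sketched here. The label vectors are made up purely to keep the example self-contained; in practice you would pass the true test labels and a model's predictions. Per-class precision and recall are computed one-versus-rest, which is a common convention but should be checked against the lecture slides.

```matlab
% Hypothetical true and predicted labels for a three-class problem
yTrue = {'a';'a';'a';'b';'b';'c';'c';'c';'c';'b'};
yPred = {'a';'a';'b';'b';'b';'c';'c';'a';'c';'c'};

% Rows of C are true classes, columns are predicted classes
[C, order] = confusionmat(yTrue, yPred);

accuracy  = sum(diag(C)) / sum(C(:));   % correct predictions / all predictions
recall    = diag(C) ./ sum(C, 2);       % per class: TP / (TP + FN)
precision = diag(C) ./ sum(C, 1)';      % per class: TP / (TP + FP)

disp(C)
disp(table(order, precision, recall))
fprintf('Accuracy: %.2f\n', accuracy);
```

A class that is never predicted has a zero column sum, so its precision comes out as NaN; that case is worth mentioning in the report if it occurs.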
Task 5
Describe the mistakes you encountered while doing this
assignment (to show that the student did the work rather than getting help
from experts).
Do not use citations or referencing; the work must be done
entirely by the student.