ITEC 320 Homework 3: Clustering and Classification
Due: November 17
1) Using the data in the Excel file, Sales Data, perform K-Means Clustering. Use all the attributes as input except the Customer and Percent Gross Profit attributes. Review the clusters and create plots for various combinations of variables. Give a detailed interpretation of your results. Do you see any interesting patterns?
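For intuition about what K-Means does with the selected attributes, here is a minimal sketch in plain Python. It is not the course tool's implementation: the `kmeans` function, the sample rows, and the choice of k = 2 are all made up for illustration, and the real Sales Data attributes are not shown here.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Basic K-Means: assign points to the nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen rows
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the algorithm has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments

# Hypothetical two-attribute rows (NOT taken from the Sales Data file).
rows = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(rows, k=2)
print(centroids)
print(labels)
```

Comparing the centroid values attribute by attribute is one way to start interpreting what distinguishes each cluster.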
2) You are on an analytics team at the Really Big Financial Corporation. You market specialized financial products geared to potential customers at different income levels. You do not want to waste time and money marketing the products to individuals who are not a good fit based on their income level. You have downloaded and cleaned up a set of data from the US Census Bureau. The data includes demographic and other data from a Census survey.
· From the dataset “Census Data”, try to predict who makes more than $50,000 per year and who makes less.
· Use Training and Testing Sets.
· The Target attribute is “Income”, with values <=50K and >50K.
· Try both Decision Trees and K-NN.
· How accurate are your predictions?
· Note: This is a large dataset (over 32,000 rows), so K-NN runs a little slowly. It may take 30 seconds or so for K-NN to run.
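The train/test workflow in the bullets above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names (`train_test_split`, `knn_predict`, `accuracy`), the 70/30 split, k = 3, and the two made-up numeric attributes are not part of the assignment or the Census Data file. It also hints at why K-NN is slow on large data: every prediction measures the distance to every training row.

```python
import math
import random
from collections import Counter

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows, then hold out the last part as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train, features, k=3):
    """Majority vote among the k training rows closest to `features`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

# Made-up rows of ((age, hours per week), income label); not real census records.
data = [((25 + i, 30 + i), "<=50K") for i in range(10)]
data += [((45 + i, 50 + i), ">50K") for i in range(10)]

train, test = train_test_split(data)
acc = accuracy(train, test)
print(f"accuracy on held-out rows: {acc:.2f}")
```

The toy data is deliberately easy to separate; accuracy on the real data will depend on the attributes and parameters you choose.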
3) For the Titanic problem performed in the Lab, now try Support Vector Machine (SVM) classification. How does its accuracy compare to that of Decision Trees and K-NN?
· Caution: SVM only works with numeric input attributes, so use the dataset “Titanic passengers numeric”, where Sex is coded as 1 = Female, 0 = Male.
· Note: SVM does not like missing values, so you will need to use the Replace Missing Values Operator before the Select Attributes Operator to replace missing values for Age with the average.
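To make the two preparation steps above concrete, here is a small plain-Python sketch of what they amount to: coding Sex as 1/0 and filling missing Age values with the average of the known ages. The `prepare` function, the dictionary keys, and the three sample passengers are hypothetical; in the assignment the numeric dataset and the Replace Missing Values Operator do this work for you.

```python
from statistics import mean

def prepare(passengers):
    """Code sex as 1 (female) or 0 (male) and fill missing ages with the mean age."""
    known_ages = [p["age"] for p in passengers if p["age"] is not None]
    average_age = mean(known_ages)  # computed from non-missing values only
    return [
        {
            "sex": 1 if p["sex"] == "female" else 0,
            "age": p["age"] if p["age"] is not None else average_age,
        }
        for p in passengers
    ]

# Made-up passengers (not rows from the Titanic dataset); None marks a missing age.
sample = [
    {"sex": "female", "age": 30.0},
    {"sex": "male", "age": None},
    {"sex": "male", "age": 40.0},
]
cleaned = prepare(sample)
print(cleaned)
```

After this step every attribute is numeric and has no gaps, which is the form an SVM needs.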
For each of the above problems, please interpret your results and provide all supporting model output and diagrams (e.g., clusters, cluster diagrams, decision trees, performance matrix results). If you feel ambitious, try Naïve Bayes and Neural Networks as well.