Homework 3 – Part B

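You will be using a retail store transaction dataset of 5000 transactions for this part. Execute the following commands to read it into a format that the association rule algorithm can digest.

Set the working directory to the folder that contains 5000-out2.csv, then load the arules package and read the file.

#load arules, which provides the "transactions" class used below

library(arules)

trans_mat<-read.csv("5000-out2.csv",header=TRUE,sep=",")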


#convert it into a data matrix

a_matrix<-data.matrix(trans_mat)


#remove the transaction_ID column i.e., column 1

a_matrix<-a_matrix[,-1]


#use as.logical to convert the binary 0/1 values to TRUE/FALSE

basket2<-apply(a_matrix,2,as.logical)


#Now coerce it into a transactions object

basket<-as(basket2,"transactions")


Use the “basket” data with support = 0.03, confidence = 0.7, and minlen = 2 to extract association rules and save the results to “grules”. How many rules do you have? Which items are most likely to be bought together? (1 Point)

What if you change the minimum support to 0.04? Save the results as “grules2”. (1 Point)
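
One way to approach the two mining tasks above is apriori() from the arules package. The block below is only a sketch under stated assumptions, not the expected answer: so that it runs without the CSV file, it builds a five-transaction toy “basket” whose item names (apart from Hot.Coffee) are made up. With the real data, skip that step and reuse the “basket” object created earlier.

```r
library(arules)

# Toy stand-in for the real "basket" (hypothetical items, 5 transactions).
basket <- as(list(
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie", "Lemon.Cake"),
  c("Apple.Pie", "Lemon.Cake"),
  c("Hot.Coffee")
), "transactions")

# Mine rules with the thresholds given in the assignment.
grules <- apriori(basket,
                  parameter = list(support = 0.03, confidence = 0.7, minlen = 2))

# Same settings, but with the higher minimum support.
grules2 <- apriori(basket,
                   parameter = list(support = 0.04, confidence = 0.7, minlen = 2))

# Number of rules found at each support level.
length(grules)
length(grules2)

# Rules with the highest lift hint at items that tend to be bought together.
inspect(sort(grules, by = "lift"))
```

Raising the minimum support while keeping the other thresholds fixed can only remove rules, so length(grules2) is never larger than length(grules).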

Draw a scatterplot with data “basket”. Set support = 0.03 (1 Point)

Draw a scatterplot with data “basket” to show top 20 items (1 Point)
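
The two plotting tasks above do not name a function. Because they plot the “basket” transactions rather than a rule set, one possible reading is an item frequency plot: itemFrequencyPlot() in arules accepts both a support cutoff and a topN count. That reading is an assumption. If a scatterplot of rules is intended instead, the plot() method for rules in the arulesViz package is a common alternative. The toy “basket” below is a made-up stand-in so the sketch runs on its own.

```r
library(arules)

# Toy stand-in for the real "basket" (hypothetical items, 5 transactions).
basket <- as(list(
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie", "Lemon.Cake"),
  c("Apple.Pie", "Lemon.Cake"),
  c("Hot.Coffee")
), "transactions")

# Plot only the items whose relative support is at least 0.03.
itemFrequencyPlot(basket, support = 0.03)

# Plot the 20 most frequent items. The toy data has fewer than 20 items,
# so the count is capped at the number of items actually present.
n_top <- min(20, ncol(basket))
itemFrequencyPlot(basket, topN = n_top)
```

With the real 5000-transaction data the cap is only a safeguard, and topN = 20 can be passed directly if the dataset has at least 20 items.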

Show the first three rules from “grules”

Sort “grules” by lift and show top 5 association rules
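
The two inspection tasks above could be handled with head(), sort(), and inspect() from arules. This is a sketch only: the toy “basket” and the name top_lift are made up for illustration, and with the real data the existing “grules” object would be used directly.

```r
library(arules)

# Toy stand-in for the real "basket" (hypothetical items, 5 transactions).
basket <- as(list(
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie", "Lemon.Cake"),
  c("Apple.Pie", "Lemon.Cake"),
  c("Hot.Coffee")
), "transactions")

grules <- apriori(basket,
                  parameter = list(support = 0.03, confidence = 0.7, minlen = 2))

# First three rules, in the order apriori() returned them.
inspect(head(grules, 3))

# Sort by lift (descending by default) and keep at most the top five.
top_lift <- head(sort(grules, by = "lift"), 5)
inspect(top_lift)
```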

Find the subset of rules in “grules” that contain any “Hot.Coffee” items and save the results to “coffee”

Show the top 10 rules in the “coffee” subset
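
For the Hot.Coffee tasks above, subset() with the %in% operator on items is one option. The sketch below again uses a made-up toy “basket” so that it runs standalone. It also assumes that "top 10" means the first ten rules as stored; if a ranking is wanted, head() on a rule set also accepts a by argument such as by = "lift".

```r
library(arules)

# Toy stand-in for the real "basket" (hypothetical items, 5 transactions).
basket <- as(list(
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie"),
  c("Hot.Coffee", "Apple.Pie", "Lemon.Cake"),
  c("Apple.Pie", "Lemon.Cake"),
  c("Hot.Coffee")
), "transactions")

grules <- apriori(basket,
                  parameter = list(support = 0.03, confidence = 0.7, minlen = 2))

# Keep rules where Hot.Coffee appears on either side of the rule.
coffee <- subset(grules, items %in% "Hot.Coffee")

# Show at most the first ten of those rules.
inspect(head(coffee, 10))
```

Here %in% matches the item label exactly. arules also provides %pin% for partial matching, which may be useful if the real data contains several coffee-related labels.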

