Data exploration: what is the format of the data? How many individual accounts are there in this dataset? What are the top three (3) products/services that the bank customers have bought?


Description

Question 1

 Imagine you were the manager of Bank ABC. You would like to find out which financial products/services are most likely to be bought together by your customers, so that you and your team can better design recommendations and advertising campaigns for your financial services in the coming year. After consulting with a senior business analytics specialist, you decide to conduct a market basket analysis.

The IT department has prepared the data to your requirements. A BANK data set that contains service information for thousands of customers is ready for analysis. There are three variables in the data set, as shown in the table below.

Field        Type        Description

ACCOUNT      Nominal     Account number

SERVICE      Nominal     Type of product service

VISIT        Ordinal     Order of product purchase

The 13 products are represented in the data set using the following abbreviations:

ATM          automated teller machine debit card

AUTO         automobile installment loan

CCRD         credit card

CD           certificate of deposit

CKCRD        check/debit card

CKING        checking account

HMEQLC       home equity line of credit

IRA          individual retirement account

MMDA         money market deposit account

MTG          mortgage

PLOAN        personal/consumer installment loan

SVG          savings account

TRUST        personal trust account

 (a) Data exploration: What is the format of the data? How many individual accounts are there in this dataset? What are the top three (3) products/services that the bank customers have bought? Show how you get the results. Evaluate the suitability of using Association Rule Mining for this problem. (15 marks)
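
The exploration in part (a) can be illustrated with a minimal Python sketch. It assumes the data has been exported as rows of (ACCOUNT, SERVICE, VISIT). The rows, account labels, and variable names below are hypothetical and are not taken from the BANK data set, so the printed figures say nothing about the real answer.

```python
from collections import Counter

# Hypothetical rows in the (ACCOUNT, SERVICE, VISIT) layout described above.
# These values are made up for illustration only.
rows = [
    ("A1", "CKING", 1), ("A1", "SVG", 2), ("A1", "ATM", 3),
    ("A2", "CKING", 1), ("A2", "MMDA", 2),
    ("A3", "CKING", 1), ("A3", "SVG", 2), ("A3", "ATM", 3),
    ("A4", "SVG", 1), ("A4", "CKING", 2),
]

# One account spans several rows, one row per service bought, so the layout
# is transactional ("long") rather than one row per account.
n_accounts = len({account for account, _, _ in rows})

# Count each service at most once per account so that a duplicated row
# cannot inflate the ranking.
service_counts = Counter(service for _, service in {(a, s) for a, s, _ in rows})
top_three = service_counts.most_common(3)

print("Distinct accounts:", n_accounts)
print("Top three services:", top_three)
```

The same two figures can be cross-checked in whatever tool is used for the assignment by counting distinct ACCOUNT values and tabulating SERVICE frequencies.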

 (b) Construct an Apriori model on the dataset using IBM SPSS Modeler. The model details and interpretation of results should include the following:

 (i) Report the “Fields” setting of the Apriori node. Give a screenshot of the setting.

 (ii) Set the Minimum Support = 10%, Minimum Confidence = 60%, Maximum number of antecedents = 5. Report the number of rules generated, and give a screenshot of the rules.

 (iii) Observe the top 10 rules that have the highest Confidence values; what is the key pattern you see when interpreting these rules in general (no need to list out all the 10 rules)?

 (iv) Report two (2) other interesting rules and their implications.

 (25 marks)
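
To make the three thresholds in (ii) concrete, here is a small Python sketch of level-wise rule mining. It is not IBM SPSS Modeler's implementation. It assumes that the minimum support is applied to the antecedent and that each rule has a single consequent; the baskets and all names are hypothetical.

```python
# Hypothetical baskets: one set of services per account (not the BANK data).
baskets = [
    {"CKING", "SVG", "ATM"},
    {"CKING", "SVG"},
    {"CKING", "ATM"},
    {"CKING", "SVG", "ATM"},
    {"SVG", "CD"},
]

MIN_SUPPORT = 0.10      # assumed to apply to the antecedent
MIN_CONFIDENCE = 0.60
MAX_ANTECEDENTS = 5


def count(itemset):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if itemset <= basket)


def is_frequent(itemset):
    return count(itemset) / len(baskets) >= MIN_SUPPORT


all_items = sorted({item for basket in baskets for item in basket})


def frequent_antecedents():
    """Level-wise search: larger itemsets are built only from frequent smaller ones."""
    level = [frozenset([item]) for item in all_items if is_frequent(frozenset([item]))]
    found = list(level)
    size = 1
    while level and size < MAX_ANTECEDENTS:
        size += 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        level = [c for c in candidates if is_frequent(c)]
        found.extend(level)
    return found


# confidence(A -> c) = count(A and c together) / count(A)
rules = {}
for antecedent in frequent_antecedents():
    for consequent in all_items:
        if consequent in antecedent:
            continue
        confidence = count(antecedent | {consequent}) / count(antecedent)
        if confidence >= MIN_CONFIDENCE:
            rules[(antecedent, consequent)] = confidence

for (antecedent, consequent), confidence in sorted(rules.items(), key=lambda kv: -kv[1]):
    print(sorted(antecedent), "->", consequent, round(confidence, 2))
```

Raising the support or confidence threshold shrinks the rule list, which is why the number of rules reported in (ii) depends directly on these settings.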

 (c) Distinguish Sequence Pattern Mining from Association Rule Mining (ARM). Evaluate whether Sequence Pattern Mining can be used to study this dataset. (10 marks)
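
The distinction in part (c) comes down to whether purchase order is kept. The following Python sketch, using hypothetical rows, counts the same pairs of services twice: once ignoring order (the association view) and once using the VISIT field to record which service came first (the sequence view). It is an illustration of the idea only, not a full sequence mining algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (ACCOUNT, SERVICE, VISIT) rows; not taken from the BANK data.
rows = [
    ("A1", "CKING", 1), ("A1", "SVG", 2), ("A1", "ATM", 3),
    ("A2", "SVG", 1), ("A2", "CKING", 2),
    ("A3", "CKING", 1), ("A3", "SVG", 2),
]

# Rebuild each account's purchases in VISIT order.
by_account = defaultdict(list)
for account, service, visit in rows:
    by_account[account].append((visit, service))
sequences = {acct: [s for _, s in sorted(items)] for acct, items in by_account.items()}

unordered = Counter()  # association view: {A, B} bought by the same account
ordered = Counter()    # sequence view: A bought before B by the same account
for seq in sequences.values():
    for pair in combinations(sorted(set(seq)), 2):
        unordered[pair] += 1
    # combinations() keeps list order, so each pair is (earlier, later).
    for first, later in set(combinations(seq, 2)):
        if first != later:
            ordered[(first, later)] += 1

print(unordered[("CKING", "SVG")])                              # co-occurrence count
print(ordered[("CKING", "SVG")], ordered[("SVG", "CKING")])     # direction matters
```

Because the data set carries an ordinal VISIT field, the ordered counts can be computed at all; without it only the unordered view would be available.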

 

 

Question 2

 One common issue with most traditional Association Rule Mining (ARM) algorithms (e.g., Apriori, CARMA) is their inability to mine numerical data without first converting it into categorical data. Write a research essay to discuss this issue and critically review at least one important research article that attempts to address it. The review should include a description of the technique and a discussion of the advantages and possible limitations of the method proposed in the research article. Keep the essay length to two pages. (40 marks)
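
The conversion step that the question refers to can be sketched as follows. Equal-width binning is used here only as one common, simple choice; it is not the method of any particular research article, and the attribute name BALANCE and the values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to the index of an equal-width interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is only one meaningful bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric attribute that an ARM algorithm could not use directly.
balances = [100, 250, 900, 1500, 3000]
bin_indexes = equal_width_bins(balances, n_bins=3)

# Each value becomes a categorical item that can sit in a basket.
items = [f"BALANCE_bin{i}" for i in bin_indexes]
print(items)
```

In this example 900 and 1500 land in different bins while 100 and 900 share one, which shows how the choice of bin boundaries can shape the rules that are later found.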

