[Get it solved] The primary objective is to use classification techniques...


The primary objective is to use classification techniques learnt so far. Each loan is graded (A to G) based on the risk, with A being least risky and G being the highest risk category.

data mining

Description

We will revisit the Lending Club data for this week’s assignment. The company has existed since 2007 and has provided millions of personal loans since then. Lending Club announced its IPO in December 2014, after which the company came into the limelight for negative publicity. Lending Club officials were accused of taking aggressive risks by lending money to borrowers with risky creditworthiness. You are asked to study this phenomenon and determine whether the data provide clues about the authenticity of the claim that Lending Club behaved irresponsibly.

You are given a single combined file of “approved” loans data from six years, which are supposedly the periods before and after the controversy.

Step 1 (30 Points)

The first step is to create two new columns as follows:

a) Comb_Risk_One: Create a binary column by combining categories A and B (Low Risk) into one category and all the remaining categories in another (High Risk).

b) Comb_Risk_Two: Create a binary column by combining categories A, B and C (Low Risk) into one category and all the remaining categories in another (High Risk).
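The two recodings above can be sketched as follows. This is only an illustration in plain Python, whereas the assignment itself asks for R; the `grade` key and the three sample records are assumptions made for the example, not taken from the assignment file.

```python
# Hypothetical loan records; the real name of the A-G grade column may differ.
loans = [
    {"id": 1, "grade": "A"},
    {"id": 2, "grade": "C"},
    {"id": 3, "grade": "F"},
]

LOW_RISK_ONE = {"A", "B"}        # Comb_Risk_One: A and B count as Low Risk
LOW_RISK_TWO = {"A", "B", "C"}   # Comb_Risk_Two: A, B and C count as Low Risk


def add_risk_columns(rows):
    """Return copies of the rows with both binary risk columns added."""
    out = []
    for row in rows:
        new_row = dict(row)
        new_row["Comb_Risk_One"] = "Low" if row["grade"] in LOW_RISK_ONE else "High"
        new_row["Comb_Risk_Two"] = "Low" if row["grade"] in LOW_RISK_TWO else "High"
        out.append(new_row)
    return out


labelled = add_risk_columns(loans)
for row in labelled:
    print(row["grade"], row["Comb_Risk_One"], row["Comb_Risk_Two"])
```

Grade C is the only grade on which the two columns disagree: it is High Risk under Comb_Risk_One and Low Risk under Comb_Risk_Two.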

Now, split the file into two files: one containing the data for 2012, 2013, and 2014, and the other containing the data for 2015, 2016, and 2017.
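A minimal sketch of the split by period, again in plain Python for illustration only. The `issue_year` key is a hypothetical name; in the real file the year may need to be extracted from an issue-date column first.

```python
import csv

EARLY_YEARS = {2012, 2013, 2014}
LATE_YEARS = {2015, 2016, 2017}


def split_by_period(rows, year_key="issue_year"):
    """Split rows into the 2012-2014 and 2015-2017 groups.

    Rows from any other year fall into neither group.
    """
    early = [r for r in rows if r[year_key] in EARLY_YEARS]
    late = [r for r in rows if r[year_key] in LATE_YEARS]
    return early, late


def write_csv(rows, path):
    """Write a non-empty list of dicts to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Made-up records, used only to show the behaviour of the split.
loans = [
    {"id": 1, "issue_year": 2012, "grade": "A"},
    {"id": 2, "issue_year": 2014, "grade": "D"},
    {"id": 3, "issue_year": 2015, "grade": "B"},
    {"id": 4, "issue_year": 2017, "grade": "G"},
    {"id": 5, "issue_year": 2011, "grade": "C"},
]

early, late = split_by_period(loans)
print(len(early), len(late))
```

Each group could then be saved with a call such as `write_csv(early, "loans_2012_2014.csv")`, where the file name is a placeholder.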

Step 2 (70 Points)

The primary objective is to use the classification techniques learnt so far. Each loan is graded (A to G) based on risk, with A being the least risky and G being the highest-risk category. You are asked to predict the Low and High risk categories (for the two new response variables) using various modeling techniques such as Naïve Bayes, KNN, Logistic Regression, and CART. Make sure to look for the following:

a. Outliers based on the independent columns (predictors)

b. Multicollinearity

c. Scaling and standardization of the predictors

d. Train-test split for both files, with a comparison of the confusion matrices on the test sets.
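The four checks above can be illustrated with a few small helpers. This is a sketch in plain Python under stated assumptions, not the R workflow the assignment requires: the 1.5 × IQR rule, the pairwise Pearson correlation as a collinearity screen, the 70/30 split, and the `income` and `loan_amount` values are all choices made for the example.

```python
import random
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def pearson(x, y):
    """Pearson correlation; values near +1 or -1 hint at collinear predictors."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle reproducibly and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(train_values):
    """Return a z-score function using the mean and sd of the training data only."""
    mean = statistics.fmean(train_values)
    sd = statistics.stdev(train_values)
    return lambda v: (v - mean) / sd


# Made-up predictor values (in thousands), for illustration only.
income = [40, 45, 50, 55, 60, 65, 70, 500]
loan_amount = [2 * v for v in income]  # deliberately collinear with income

flags = iqr_outliers(income)
r = pearson(income, loan_amount)

rows = list(range(10))
train, test = train_test_split(rows)
scale = fit_scaler(train)
scaled_train = [scale(v) for v in train]
scaled_test = [scale(v) for v in test]  # test data reuses the training mean and sd
```

Fitting the scaler on the training rows and reusing it on the test rows keeps information from the test set out of the model inputs.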

Produce a “well documented and explained” R Markdown knit file analyzing the data with findings on the model with the highest classification ability. Also describe the features of the categories that are not classified correctly. Create a confusion matrix to answer the last question and run descriptive statistics on the misclassified categories. Provide any necessary EDA and visuals to enhance understanding of your analysis.
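The confusion matrix and the descriptive statistics on misclassified loans could look like the sketch below. It is written in plain Python for illustration, while the deliverable itself is an R Markdown file; the labels, predictions, and the `int_rate` values are made up for the example.

```python
from collections import Counter
import statistics


def confusion_matrix(actual, predicted, labels=("Low", "High")):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Made-up test-set labels, predictions, and one hypothetical feature.
actual = ["Low", "Low", "High", "High", "Low", "High"]
predicted = ["Low", "High", "High", "Low", "Low", "High"]
int_rate = [7.5, 13.9, 18.2, 12.4, 8.1, 21.0]

matrix = confusion_matrix(actual, predicted)
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Descriptive statistics on the misclassified loans only.
missed = [x for x, a, p in zip(int_rate, actual, predicted) if a != p]
missed_mean = statistics.fmean(missed)
missed_range = (min(missed), max(missed))

print(matrix)
print(round(accuracy, 3), round(missed_mean, 2), missed_range)
```

Comparing the summary of the misclassified rows with the summary of the correctly classified rows is one way to describe which kinds of loans the model gets wrong.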

