One strategy for handling multiple categories is to use logistic regression with a “one against the others” approach.

Introduction

Please complete the following tasks on the data in R. Generate a solution document in R Markdown and upload both the .Rmd document and a rendered .doc, .docx, or .pdf document. Please turn in your work on Canvas. Your solution document should contain your answers to the questions and display the requested plots. Also, please upload the .RData file you generate.

Each question part is worth 5 points.

Question 1: logistic regression

Q1, part 1

One strategy for handling multiple categories is to use logistic regression with a “one against the others” approach. For the data set “pew” below, please fit one model for “better today” versus the other responses to the “LIFE” question, treating “PPETHM” and “PPGENDER” as factor variables and the rest as quantitative. Use lrtest to compare it with a model for “better today” versus the other responses to the “LIFE” question that treats “PPETHM”, “PPGENDER”, “IDEO”, and “PPEDUCAT” as factors. Please use unweighted data.
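As a minimal sketch of the “one against the others” recoding, the toy example below collapses a three-category response into a 0/1 indicator and fits an ordinary binary logistic regression. The data frame `toy`, its category labels, and the predictor `x` are made up for illustration; the actual coding of “LIFE” in the pew data is not shown here, so the value that stands for “better today” has to be looked up in its labels.

```r
# Toy data (hypothetical): a three-category response and one numeric predictor.
set.seed(1)
toy <- data.frame(
  resp = sample(c("better", "worse", "same"), 200, replace = TRUE),
  x    = rnorm(200)
)

# One against the others: 1 for the target category, 0 for every other response.
toy$is_better <- as.integer(toy$resp == "better")

# Ordinary binary logistic regression on the collapsed response.
fit <- glm(is_better ~ x, family = binomial, data = toy)
summary(fit)$coefficients
```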

These are nested models. The coefficients for the levels of a factor with levels 1, 2, …, k are fitted separately, so they are less restricted than the contributions of the numeric version, which must be of the form 1·c, 2·c, …, k·c for a single slope c.
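The nesting argument can be illustrated on simulated data. The sketch below is not the pew analysis: the data frame `sim` and the variable `educ` are hypothetical stand-ins for a survey item coded 1 to 4. It fits the numeric and factor versions of the same predictor and computes the likelihood ratio test by hand, assuming the usual chi-squared reference distribution.

```r
# Simulated data (hypothetical): a binary outcome and a predictor coded 1..4,
# standing in for an ordinal survey item such as an education category.
set.seed(42)
n <- 500
educ <- sample(1:4, n, replace = TRUE)
y <- rbinom(n, 1, plogis(-0.5 + 0.3 * educ))
sim <- data.frame(y = y, educ = educ)

# Restricted model: one slope c, so level j contributes j * c to the linear predictor.
m_num <- glm(y ~ educ, family = binomial, data = sim)

# Less restricted model: a separate coefficient for each non-reference level.
m_fac <- glm(y ~ factor(educ), family = binomial, data = sim)

# Likelihood ratio statistic: twice the gain in log-likelihood.
lr_stat <- as.numeric(2 * (logLik(m_fac) - logLik(m_num)))
df_diff <- attr(logLik(m_fac), "df") - attr(logLik(m_num), "df")  # 4 - 2 = 2
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
c(statistic = lr_stat, df = df_diff, p_value = p_value)

# The same comparison via base R's analysis of deviance table;
# lmtest::lrtest(m_num, m_fac) should report the same statistic if lmtest is installed.
anova(m_num, m_fac, test = "Chisq")
```

Because the factor model can reproduce any fit of the numeric model, the statistic is never negative, and the degrees of freedom equal the number of extra coefficients the factor coding introduces.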

load("pew_data.RData")
pew<-dplyr::select(dat,PPINCIMP,PPGENDER,PPETHM,IDEO,PPEDUCAT,LIFE)
table(pew$PPINCIMP)

##
##   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
##  66  31  40  91  77 104 144 183 179 167 258 321 378 285 319 486 226 253 160 125
##  21
## 131

table(pew$PPGENDER)

##
##    1    2
## 1993 2031

table(pew$PPETHM)

##
##    1    2    3    4    5
## 2862  392  166  447  157

table(pew$IDEO)

##
##   -1    1    2    3    4    5
##  116  314 1095 1624  616  259

table(pew$PPEDUCAT)

##
##    1    2    3    4
##  303 1130 1147 1444

attributes(pew$PPINCIMP)$labels

##            Not asked              REFUSED     Less than $5,000
##                   -2                   -1                    1
##     $5,000 to $7,499     $7,500 to $9,999   $10,000 to $12,499
##                    2                    3                    4
##   $12,500 to $14,999   $15,000 to $19,999   $20,000 to $24,999
##                    5                    6                    7
##   $25,000 to $29,999   $30,000 to $34,999   $35,000 to $39,999
##                    8                    9                   10
##   $40,000 to $49,999   $50,000 to $59,999   $60,000 to $74,999
##                   11                   12                   13
##   $75,000 to $84,999   $85,000 to $99,999 $100,000 to $124,999
##                   14                   15                   16
## $125,000 to $149,999 $150,000 to $174,999 $175,000 to $199,999
##                   17                   18                   19
## $200,000 to $249,999     $250,000 or more
##                   20                   21

attributes(pew$PPGENDER)$labels

## Not asked   REFUSED      Male    Female
##        -2        -1         1         2

attributes(pew$PPETHM)$labels

##              Not asked                REFUSED    White, Non-Hispanic
##                     -2                     -1                      1
##    Black, Non-Hispanic    Other, Non-Hispanic               Hispanic
##                      2                      3                      4

