Homework 4
BIST 0551J: Applied Regression Analysis for Public Health Studies
October 27th, 2020
This is a real data set that was used in Leo Breiman and Jerome H. Friedman
(1985), “Estimating optimal transformations for multiple regression and
correlation”, JASA, 80, pp. 580-598. The problem is to predict the daily
maximum one-hour-average ozone reading in Los Angeles. The original data set
can be found in the R package mlbench under the name “Ozone”. There are 12
predictor variables in the original data set. I have removed the categorical
variables and the entries with missing information, which leaves you with
9 predictor variables and 203 observations.
Y    Daily maximum one-hour-average ozone reading
X1   500 millibar pressure height (m) measured at Vandenberg AFB
X2   Wind speed (mph) at Los Angeles International Airport (LAX)
X3   Humidity (%) at LAX
X4   Temperature (degrees F) measured at Sandburg, CA
X5   Temperature (degrees F) measured at El Monte, CA
X6   Inversion base height (feet) at LAX
X7   Pressure gradient (mm Hg) from LAX to Daggett, CA
X8   Inversion base temperature (degrees F) at LAX
X9   Visibility (miles) measured at LAX
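The data preparation described earlier (remove the categorical columns, then remove every entry with missing information) can be sketched as follows. This is only a minimal illustration in plain Python, not the procedure actually used to build the homework data set. The helper name `prepare`, the column name `month`, and every value in the toy table are hypothetical placeholders, and missing entries are assumed to be encoded as `None`.

```python
def prepare(rows, categorical):
    """Drop the named categorical columns, then drop rows with any missing value."""
    kept = []
    for row in rows:
        # Keep only the non-categorical columns of this row.
        numeric = {k: v for k, v in row.items() if k not in categorical}
        # Keep the row only if none of the remaining entries is missing.
        if all(v is not None for v in numeric.values()):
            kept.append(numeric)
    return kept


# Hypothetical toy rows for illustration only (NOT real Ozone data).
toy = [
    {"month": 1, "Y": 3.0, "X1": 5480.0, "X2": 8.0},
    {"month": 1, "Y": None, "X1": 5660.0, "X2": 6.0},   # missing response
    {"month": 2, "Y": 5.0, "X1": 5710.0, "X2": None},   # missing predictor
    {"month": 2, "Y": 4.0, "X1": 5700.0, "X2": 3.0},
]

clean = prepare(toy, categorical={"month"})
print(len(clean), sorted(clean[0]))
```

Applied to the original data, the same two steps are what reduce the 12 predictors to 9 and leave the 203 complete observations.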
You are to come up with the best model that predicts the daily maximum
one-hour-average ozone reading in Los Angeles, using any combination of the
tools you have learned in this class so far. It is natural that different
people may come up with different, almost equally useful models. You will be
assessed not against an ultimate truth but rather on how you approach the
problem, how correctly you implement the covered methods, and how you justify
your actions.
The projects must be e-mailed to cceohh@gmail.com
on October 28th. You may not give or receive any aid on this
take-home examination. Good luck!