ASSIGNMENT I
App-based bike-sharing systems have become popular in many
cities. They provide economical transportation in an environmentally friendly
manner, and a number of companies are now in the bike-sharing business.
Assume you are working for Lime, a bike-sharing company, and
plan to enter a new market. You have collected two years of bike-sharing
rental data for a major city on the east coast of the U.S. (the data is
actual rental data from one city). The data for the two years is split into
two data sets, training.csv and test.csv. The fields are
as follows:
- ID: record ID
- season: season (1: spring, 2: summer, 3: fall, 4: winter)
- mnth: month (1 to 12)
- day: day of the month (1 to 28, 29, 30 or 31)
- hr: hour (0 to 23)
- holiday: whether the day is a holiday or not
- weekday: day of the week
- workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0
- weathersit:
  - 1: Clear, Few clouds, Partly cloudy
  - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
  - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
  - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
- temp: normalized temperature in Celsius; the values are divided by 41 (max)
- atemp: normalized feeling temperature in Celsius; the values are divided by 50 (max)
- hum: normalized humidity; the values are divided by 100 (max)
- windspeed: normalized wind speed; the values are divided by 67 (max)
- casual: count of casual users
- registered: count of registered users
- cnt: count of total rental bikes, including both casual and registered
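As a small illustration of the scaling described in the field list, the sketch below multiplies the four normalized fields back by their stated maxima. It takes the description at face value (each value was divided by the listed maximum); the names SCALE and denormalize are hypothetical, and Python is used here only for illustration.

```python
# Maxima as stated in the field list above (assumed to be the only scaling applied).
SCALE = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}

def denormalize(row):
    """Return a copy of row with the scaled fields multiplied by their stated maxima."""
    out = dict(row)
    for field, maximum in SCALE.items():
        if field in out:
            out[field] = out[field] * maximum
    return out
```

For example, a record with temp = 0.5 would map back to 20.5 under this assumption. The regression itself can be fitted on the normalized values directly.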
Instead of looking at the data on an hourly basis, we will look
at it in six-hour blocks. You have to bin the data (put it into buckets)
into the following groups. For each day you have to create the buckets
EM (early morning) = Hr 0,1,2,3,4,5
MN (morning to noon) = Hr 6,7,8,9,10,11
AN (afternoon) = Hr 12,13,14,15,16,17
EN (evening/night) = Hr 18,19,20,21,22,23
You will have to add the casual, registered and cnt values for the six hours
in each bucket, but take the average of temp, atemp, hum and windspeed for
the six hours. For weathersit, take the average and use the Round()
function to round it to the nearest whole number.
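The binning rules above can be sketched as follows. This is a minimal illustration, not a prescribed solution: it assumes each hourly record is a dict keyed by the field names listed earlier, that records are in chronological order (so the hours of one bucket are adjacent), and that the day-level fields are constant within a day. The names bucket_key and bin_six_hourly are hypothetical, and Python is used only for illustration.

```python
from itertools import groupby
from statistics import mean

BUCKETS = ("EM", "MN", "AN", "EN")  # index = hr // 6
DAY_FIELDS = ("season", "mnth", "day", "holiday", "weekday", "workingday")
SUM_FIELDS = ("casual", "registered", "cnt")
MEAN_FIELDS = ("temp", "atemp", "hum", "windspeed")

def bucket_key(row):
    """Identify the (month, day, six-hour bucket) a record belongs to."""
    return (row["mnth"], row["day"], BUCKETS[row["hr"] // 6])

def bin_six_hourly(rows):
    """Collapse chronologically ordered hourly records into six-hour buckets."""
    binned = []
    # groupby only merges adjacent records, hence the ordering assumption.
    for (_, _, bucket), group in groupby(rows, key=bucket_key):
        hours = list(group)
        # Day-level fields are copied from the first hour in the bucket.
        out = {f: hours[0][f] for f in DAY_FIELDS}
        out["bucket"] = bucket
        for f in SUM_FIELDS:
            out[f] = sum(h[f] for h in hours)
        for f in MEAN_FIELDS:
            # Averages over the hours present, which may be fewer than six.
            out[f] = mean(h[f] for h in hours)
        out["weathersit"] = round(mean(h["weathersit"] for h in hours))
        binned.append(out)
    return binned
```

Two points are worth checking against your own tool. Python's built-in round() sends exact halves to the nearest even integer (an average of 2.5 becomes 2), which may differ from other Round() implementations. Also, if some hours are missing from a bucket, the sums and averages here are taken over the hours that are present.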
MLR Model
Develop an MLR (multiple linear regression) model for predicting total
rental bikes (cnt) as a function of the independent variables using the
training data set. Use regsubsets with Mallows's Cp as the selection criterion to
select the best model.
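For reference, Mallows's Cp for a candidate model is commonly written as below. The symbols are introduced here for illustration and do not come from the assignment: n is the number of observations, p the number of estimated coefficients in the candidate model (including the intercept), SSE_p its residual sum of squares, and sigma-hat squared the mean squared error of the full model with all candidate predictors.

```latex
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p
```

Under this definition, a candidate model with little bias has a Cp value close to p, so models with small Cp are usually preferred.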
1. Copy/paste the MLR model here