Chapter 10 – Regression Analysis: Estimating Relationships
-Dependent (or response or target) variable vs. explanatory (or independent or predictor) variable
-Difference between simple and multiple regression: simple regression has one explanatory variable, multiple regression has two or more
-Creating a scatterplot in Excel
-Keep track of which variable is your X and which is your Y
-What outliers are and how to deal with them
-Correlation
-What does correlation tell us? The strength and direction of the linear relationship.
-Ranges from -1 to 1
-Finding correlation in Excel: =CORREL() or the regression output
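To make the correlation calculation concrete, here is a minimal Python sketch of the sample (Pearson) correlation, which is the quantity the notes compute with =CORREL(). The function name and the data are hypothetical, chosen only for illustration.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: X and Y values for five observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_correlation(xs, ys)
print(round(r, 4))  # about 0.77: a fairly strong positive relationship
```

The sign of the result gives the direction of the relationship and its distance from 0 gives the strength.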
-Simple regression
-Method: least squares estimation, which minimizes the sum of the squared residuals.
-This is the regression line Excel provides
-Finding the regression line in Excel
-Formulas: =SLOPE() and =INTERCEPT()
-Arguments: known Y’s first, then known X’s
-Coefficients on the regression output
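The least squares idea can be sketched in a few lines of Python. This is an illustration only, with hypothetical function names and made-up data. It computes the slope and intercept from the standard closed-form formulas and then checks that nudging the slope gives a larger sum of squared residuals.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, with Y listed first as in Excel's argument order
known_ys = [2, 4, 5, 4, 5]
known_xs = [1, 2, 3, 4, 5]

slope, intercept = least_squares_line(known_xs, known_ys)
best_sse = sum_squared_residuals(known_xs, known_ys, slope, intercept)
# A line with a slightly different slope fits worse in the least squares sense
other_sse = sum_squared_residuals(known_xs, known_ys, slope + 0.1, intercept)
print(slope, intercept, best_sse, other_sse)
```

The fitted slope and intercept here play the same role as the values returned by =SLOPE() and =INTERCEPT().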
-Percentage of variation explained: R^2
-The percentage of the variation in the dependent variable that is explained by the regression.
-Remember: the coefficient for X in simple regression is the expected change in Y for each one-unit increase in X (a decrease if the coefficient is negative).
-Finding R^2 in Excel: =RSQ(), square the correlation, or look on the regression output
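As a rough check that the routes to R^2 agree, the Python sketch below computes it two ways for simple regression: as 1 minus the ratio of unexplained to total variation, and as the squared correlation. The helper name and data are hypothetical.

```python
def r_squared_two_ways(xs, ys):
    """Return (R^2 from residuals, squared correlation) for simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in Y
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: sum of squared residuals around the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    from_residuals = 1 - sse / syy
    from_correlation = sxy ** 2 / (sxx * syy)
    return from_residuals, from_correlation


# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2_resid, r2_corr = r_squared_two_ways(xs, ys)
print(round(r2_resid, 4), round(r2_corr, 4))  # both about 0.6
```

With this data the regression explains about 60% of the variation in Y.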
-Multiple regression
-Each coefficient in multiple regression is the expected change in Y when that particular X increases by one unit and all the other Xs in the equation remain constant.
-The output reads much the same way as for simple regression.
-Use the Adjusted R^2 with multiple regression, since plain R^2 never decreases when another X is added.
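The following Python sketch shows one way a multiple regression could be fit by hand, by solving the normal equations with Gaussian elimination, and how Adjusted R^2 is derived from R^2. It is an illustration under simplifying assumptions (a small, well-conditioned data set), not how Excel computes its output. All function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, ys):
    """Least squares fit; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared_and_adjusted(rows, ys, coefs):
    """Return (R^2, Adjusted R^2) for a fitted equation."""
    n = len(ys)
    k = len(coefs) - 1  # number of explanatory variables
    preds = [coefs[0] + sum(b * x for b, x in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(ys) / n
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - sse / sst
    adjusted = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adjusted


# Hypothetical data: Y is roughly 1 + 2*X1 + 3*X2 plus a little noise
rows = [[1, 3], [2, 1], [3, 2], [4, 3], [5, 1], [6, 2]]
ys = [12.1, 7.9, 13.1, 17.9, 14.1, 18.9]

coefs = fit_multiple_regression(rows, ys)
r2, adj_r2 = r_squared_and_adjusted(rows, ys, coefs)
print([round(c, 3) for c in coefs], round(r2, 4), round(adj_r2, 4))
```

The coefficient on X1 comes out close to 2: holding X2 constant, a one-unit increase in X1 is associated with about a 2-unit increase in Y. Adjusted R^2 is slightly below R^2 because it penalizes the number of explanatory variables.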
-Dummy Variables
-A variable that is either 1 or 0, indicating whether the observation is in a particular category.
-You will need one fewer dummy variable than you have categories; the omitted category is the reference category.
-Ex. Gender: one dummy variable where a 1 means the person is female
-Ex. Quarterly observations: three dummy variables, each equal to 1 when the observation was taken during that quarter; the remaining quarter is the reference.
-The coefficient of a dummy variable is the amount that being in that category adds to the outcome relative to the reference category, with the other Xs held constant.
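A short Python sketch of dummy coding follows, using hypothetical helper names and made-up data. It builds one 0/1 column per category except the reference. It then shows that, in a regression on a single dummy, the coefficient equals the difference between the two group means.

```python
def make_dummies(values, categories):
    """One 0/1 column per category, skipping the first (reference) category."""
    return {c: [1 if v == c else 0 for v in values] for c in categories[1:]}


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Four quarters need only three dummies; Q1 is the reference here
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2"]
quarter_dummies = make_dummies(quarters, ["Q1", "Q2", "Q3", "Q4"])
print(quarter_dummies)

# Hypothetical salaries (in thousands); "M" is the reference, 1 means female
genders = ["M", "F", "F", "M", "F"]
salaries = [50, 58, 62, 54, 60]
is_female = make_dummies(genders, ["M", "F"])["F"]

coef, intercept = least_squares_line(is_female, salaries)
# intercept is the mean salary of the reference group (52);
# coef is how much being female adds relative to it (60 - 52 = 8)
print(round(coef, 4), round(intercept, 4))
```

Which category serves as the reference is a modeling choice; changing it changes the coefficients but not the fitted values.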
Chapter 11 – Regression Analysis: Statistical Inference
-Regression assumptions
- 1. There is a population regression line. It joins the means of the dependent variable for all values of the explanatory variables. For any fixed values of the explanatory variables, the mean of the errors is zero.
- 2. For any values of the explanatory variables, the variance (or standard deviation) of the dependent variable is a constant, the same for all such values.
-A fan-shaped scatterplot means the data violate this assumption (the variance is not constant)
- 3. For any values of the explanatory variables, the dependent variable is normally distributed.
- 4. The errors are probabilistically independent.
-Check using the runs test
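One common form of the runs test can be sketched in Python as below. It counts runs of same-signed residuals and compares the count with what randomness would predict, using a normal approximation. This is an assumption about which version of the test is intended; the function name and data are hypothetical.

```python
import math


def runs_test_z(residuals):
    """Return (number of runs, z-score) for the signs of the residuals."""
    signs = [r > 0 for r in residuals if r != 0]  # residuals equal to 0 are dropped
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative residuals")
    # A new run starts whenever the sign changes
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n_pos + n_neg
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    return runs, (runs - expected) / math.sqrt(variance)


# Hypothetical residuals in time order: positives clustered, then negatives
clustered = [0.9, 0.4, 0.7, 0.2, -0.3, -0.8, -0.5, -0.6]
runs, z = runs_test_z(clustered)
print(runs, round(z, 2))
```

A z-score far below 0 means too few runs (neighboring errors tend to share a sign), and a z-score far above 0 means too many. Either one casts doubt on independence. The normal approximation is rough for samples this small.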
-Regression coefficients
-t-test and p-value
-All coefficients have two values associated with them: a t-stat and a p-value
-The null hypothesis for the t-test is “The coefficient for this variable is 0” (bad: the variable does not help explain Y)
-The alternative hypothesis for the t-test is “The coefficient for this variable is not 0” (good: the variable helps explain Y)
-The p-value is the probability of getting a t-stat at least as extreme as the one observed if the null hypothesis were true, that is, if the actual coefficient were 0. It is not the probability that the null hypothesis is true.
-If the p-value is small, below .05, we can reject the null hypothesis and adopt the alternative
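To show where the t-stat and p-value on the regression output come from, here is a Python sketch for the slope in simple regression. It uses only the standard library, so the two-sided p-value is obtained by numerically integrating the t density with Simpson's rule. That is a choice made for this illustration, not how Excel computes it. Function names and data are hypothetical.

```python
import math


def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1 - 2 * central_half)


def slope_t_test(xs, ys):
    """Return (slope, t-stat, p-value) for H0: the slope is 0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err_of_estimate = math.sqrt(sse / (n - 2))
    se_slope = std_err_of_estimate / math.sqrt(sxx)
    t_stat = slope / se_slope
    return slope, t_stat, two_sided_p_value(t_stat, n - 2)


# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, t_stat, p_value = slope_t_test(xs, ys)
print(round(slope, 3), round(t_stat, 3), round(p_value, 3))
```

For this small data set the p-value is above .05, so the null hypothesis that the coefficient is 0 would not be rejected, even though the estimated slope is positive.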