2018-10-05

Multiple Regression

  • Instead of one regressor, we can have multiple ones.
  • For example, how do horsepower (hp) and weight (wt) relate to miles per gallon (mpg)? \[ mpg_i = b_0 + b_1 hp_i + b_2 wt_i + e_i \]

  • Well, how?

Multiple Regression Visualized

Multiple Regression: \(K\) regressors

  • Multiple means there can be more than 2 regressors as well.
  • In fact, there could be \(K\) of them. \[ \begin{aligned} \widehat{y}_i &= b_0 + b_1 x_{1i} + b_2 x_{2i} + \dots + b_K x_{Ki}\\ e_i &= y_i - \widehat{y}_i \end{aligned} \]
  • The partial effect of regressor \(j\) on the prediction is the partial derivative \[ \frac{\partial \widehat{y}}{\partial x_j} = b_j \]
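
To make the \(K\)-regressor notation concrete, here is a minimal sketch with \(K = 3\) on the built-in mtcars data. The choice of regressors (hp, wt, qsec) is an assumption made for illustration only; it rebuilds \(\widehat{y}_i\) and \(e_i\) by hand from the definitions above and compares them with what lm() stores.

```r
# Sketch: K = 3 regressors; the regressor choice is illustrative, not from the slides
fit <- lm(mpg ~ hp + wt + qsec, data = mtcars)

X     <- model.matrix(fit)       # N x (K + 1) matrix, first column is the constant
b     <- coef(fit)               # b_0, b_1, ..., b_K
y_hat <- as.vector(X %*% b)      # b_0 + b_1 x_1i + ... + b_K x_Ki
e     <- mtcars$mpg - y_hat      # e_i = y_i - y_hat_i

# Both comparisons should return TRUE
all.equal(y_hat, unname(fitted(fit)))
all.equal(e, unname(resid(fit)))
```

Each entry of b after the first is the partial effect \(b_j\) of the corresponding regressor.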

Ceteris Paribus

All Else Equal (Ceteris Paribus)

  • How does \(mpg\) change if we change \(hp\)?
  • This question only makes sense if \(wt\) does not change.
  • Holding the other regressors constant corresponds to taking a partial derivative. \[ \frac{\partial mpg_i}{\partial hp_i} = b_1 \]
  • Try it in R:

    lm(formula = mpg ~ hp + wt, data = mtcars)
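
One way to see the ceteris paribus interpretation is to predict mpg for two cars that differ only in hp. The sketch below uses two hypothetical cars (the values 100, 101 and 3 are made up for illustration) and checks that the difference in predictions equals the coefficient on hp.

```r
fit <- lm(mpg ~ hp + wt, data = mtcars)

# Two hypothetical cars: same weight, hp differs by one unit
two_cars <- data.frame(hp = c(100, 101), wt = c(3, 3))
pred     <- predict(fit, newdata = two_cars)

# Change in predicted mpg when hp rises by 1 and wt is held fixed
diff_pred <- unname(pred[2] - pred[1])

# Should return TRUE: the difference is exactly the coefficient on hp
all.equal(diff_pred, unname(coef(fit)["hp"]))
```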

Multicollinearity

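  • Multiple regression requires that the regressors are linearly independent: no regressor may be an exact linear combination of the others (including the constant).
  • Imagine we added wt_plus, defined as wt + 1, to the regression above.
  • wt_plus wouldn’t add any information: it is just wt plus the constant, hence it’s redundant.
  • If we only have one x, this translates into \(Var(x)\neq0\).
  • This is the rank condition. A necessary requirement for it is that \(N\geq K+1\).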
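
The wt_plus example can be tried directly. In this sketch (the object names cars_plus and fit_collinear are made up for illustration), lm() cannot separate wt_plus from wt and the constant, so it reports NA for the redundant coefficient, and the regressor matrix has fewer independent columns than it has columns.

```r
# Add a regressor that is an exact linear combination of wt and the constant
cars_plus     <- transform(mtcars, wt_plus = wt + 1)
fit_collinear <- lm(mpg ~ wt + wt_plus, data = cars_plus)

# The coefficient on wt_plus is NA: lm() drops the redundant column
coef(fit_collinear)

# Rank of the regressor matrix versus its number of columns
X <- model.matrix(~ wt + wt_plus, data = cars_plus)
qr(X)$rank < ncol(X)   # rank deficiency means the rank condition fails
```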

California Test Scores 2

  • Let’s go back to the Caschool dataset.
  • Could testscr also depend on average income in the school area? \[ \text{testscr}_i = b_0 + b_1 \text{str}_i + b_2 \text{avginc}_i + e_i \]
  • We simply add avginc to our formula:

    library("Ecdat") # reload the data
    fit_multivariate <- lm(formula = testscr ~ str + avginc, data = Caschool)
    summary(fit_multivariate)

Visualizing California Test Scores 2

  • Let’s again look at the plot

Interactions

Interactions

  • So far, the effect of each regressor was the same whatever the values of the other regressors.
  • An interaction term relaxes this: it allows the effect of one variable to depend on the value of another one.
  • Is the effect of str dependent on avginc? Is the effect stronger in richer areas?
  • Go back to the book
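
As a rough sketch of what an interaction looks like in R, the example below uses mtcars as a stand-in (an assumption made to keep it self-contained; for the question above the analogous formula would be testscr ~ str * avginc). With the model \(mpg_i = b_0 + b_1 hp_i + b_2 wt_i + b_3 hp_i wt_i + e_i\), the partial effect of hp is \(b_1 + b_3 wt\), so it changes with weight. The helper effect_hp is hypothetical.

```r
# hp * wt expands to hp + wt + hp:wt (both main effects plus the interaction)
fit_inter <- lm(mpg ~ hp * wt, data = mtcars)
b <- coef(fit_inter)

# Partial effect of hp at a given weight: b_hp + b_{hp:wt} * wt
effect_hp <- function(wt) unname(b["hp"] + b["hp:wt"] * wt)

effect_hp(2)   # effect of one more hp for a light car
effect_hp(4)   # effect of one more hp for a heavy car
```

The two values differ whenever the estimated interaction coefficient is nonzero, which is exactly what "the effect of one variable depends on another" means.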

Tutorial - CPS1988

Tutorial Time!

    library(ScPoEconometrics)
    runTutorial("lm-example")