Course Description
This course introduces students to applied, intermediate-level quantitative and econometric analysis, with a focus on practical applications in fields such as economics, finance, public policy, business, and marketing. It centers on applied regression analysis and is intended to give students hands-on experience with real data and real analysis. The course will help students become both sophisticated consumers of relatively advanced statistical techniques and better practitioners in conducting their own empirical analyses. By learning econometric methods and applications, students will develop the capacity to build the kind of predictive models that enhance decision making under uncertainty in real-world contexts. These tools and skills will also enable students to perform analyses that, under some circumstances, support valid causal inferences about the effect of a program or intervention on an outcome of interest.
The course begins with a recap of simple and multiple linear regression, and then moves to techniques for analyzing real-world quantitative data: incorporating categorical as well as quantitative variables in regression analysis, and considering interactions between independent variables. We will consider model specification in practice: how to choose our independent variables, and how to choose an appropriate functional form. Students will learn how to model nonlinear functional relationships using OLS through transformations of the data. We then consider the important assumptions that must hold for us to obtain credible coefficient estimates on our predictors of interest, how to diagnose departures from these assumptions, and practical correction strategies. We follow this with select topics of special interest, including modeling binary dependent variables and the analysis of pooled cross-sectional and panel data.
Lectures will include examples in STATA, a statistical package widely used in social science and business programs. All course exercises, however, will be designed and presented in both STATA and R. Each lesson will include an instructional component and an exercise that gives you an opportunity to apply the methods and techniques using actual data. Basic instruction (i.e., sample syntax) will be provided in both STATA and R for all exercises.
What am I going to get from this course?
Answer questions such as these:
- What is the role of non-traditional factors in predicting creditworthiness?
- What is the effect of a country's resource abundance on economic growth?
- What are the key financial determinants of loan application approval, all else being equal?
- What is the impact of air pollution levels on median neighborhood housing prices?
- What is the estimated gender-wage gap, all else being equal (and how does the wage gap vary by level of education)?
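As a rough illustration of the last question, the sketch below shows how a gender-wage gap that varies with education can be read off a log-linear wage model with an interaction term. The coefficient values and the names `B_FEMALE` and `B_INTER` are hypothetical, invented for this example rather than taken from course data, and Python is used here only for illustration; the course itself supplies STATA and R syntax.

```python
import math

# Hypothetical coefficient estimates from a log-linear wage model of the form
#   ln(wage) = b0 + b_female*female + b_educ*educ + b_inter*(female*educ) + ...
# These numbers are made up for illustration; they are not course results.
B_FEMALE = -0.30   # gap in ln(wage) when educ = 0
B_INTER = 0.01     # change in the gap per additional year of education


def wage_gap_log_points(educ, b_female=B_FEMALE, b_inter=B_INTER):
    """Female-male difference in ln(wage) at a given level of education."""
    return b_female + b_inter * educ


def wage_gap_percent(educ, b_female=B_FEMALE, b_inter=B_INTER):
    """Exact percentage wage difference implied by the log-point gap."""
    return 100.0 * (math.exp(wage_gap_log_points(educ, b_female, b_inter)) - 1.0)


# Compare the implied gap at two education levels, all else being equal.
for years in (12, 16):
    print(years, round(wage_gap_percent(years), 1))
```

With a positive interaction coefficient, as assumed here, the estimated gap narrows as education rises; a negative one would imply the opposite.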
Curriculum
Module 1: The Fundamentals of Applied Regression Analysis
01:13:20
Lecture 1
Lesson 1: Linear Regression Recap
16:46
Understand why randomized experiments are the gold standard for obtaining credible estimates of a treatment effect. Understand the importance of controlling for confounding characteristics when estimating a “treatment effect” using multiple linear regression with observational data.
Quiz 1
Lesson 1 Exercise & Answer Key
Lecture 2
Lesson 2: Multiple Regression in Practice (Part 1)
19:58
Understand and be able to interpret coefficient estimates on binary and quantitative explanatory variables. Understand and be able to interpret coefficient estimates on an interaction between binary and quantitative explanatory variables.
Lecture 3
Lesson 2: Multiple Regression in Practice (Part 2)
14:51
Understand and be able to interpret coefficient estimates on categorical explanatory variables with more than two categories.
Quiz 2
Lesson 2 Exercise and Answer Key
Lecture 4
Lesson 2: Multiple Regression in Practice (Part 3)
21:45
Understand and be able to interpret coefficient estimates on an interaction between two quantitative explanatory variables.
Module 2: Model Specification in Theory and Practice
02:31:29
Lecture 5
Lesson 3: Model Specification I (Part 1)
12:41
Understand the concept of the population regression function and the key properties of OLS slope estimators when the Classical Assumptions are fulfilled.
Lecture 6
Lesson 3: Model Specification I (Part 2)
14:52
Understand the meaning and consequences of omitted variable bias. Be able to anticipate the nature of the bias associated with omitted variable(s) in linear regression models.
Lecture 7
Lesson 3: Model Specification I (Part 3)
27:32
Understand how careful selection of the right explanatory variables can reduce the bias in the coefficient estimates on particular explanatory variables of interest. Understand best practices in choosing between alternative model specifications.
Lecture 8
Lesson 3: Model Specification I (Part 4)
05:21
Understand the meaning and practical consequences of including an irrelevant explanatory variable.
Lecture 9
Lesson 4: Model Specification II (Part 1)
11:14
Understand when a linear functional form is appropriate in regression analysis. Be able to detect nonlinear functional forms using graphical and numerical summaries.
Lecture 10
Lesson 4: Model Specification II (Part 2)
22:04
Understand and be able to implement natural logarithmic transformations of quantitative variables. Understand and be able to interpret coefficient estimates from linear-log models, and when a linear-log model is appropriate.
Lecture 11
Lesson 4: Model Specification II (Part 3A)
11:14
Understand and be able to interpret coefficient estimates from log-linear models, and when a log-linear model is appropriate.
Lecture 12
Lesson 4: Model Specification II (Part 3B)
10:55
Case study to understand how to interpret coefficient estimates on interactions in log-linear models.
Lecture 13
Lesson 4: Model Specification II (Part 4)
23:07
Understand and be able to interpret coefficient estimates from log-log models, and when a log-log model is appropriate.
Lecture 14
Lesson 4: Model Specification II (Part 5)
12:29
Understand and be able to interpret coefficient estimates from polynomial models, and when a polynomial model is appropriate.
Quiz 4
Lesson 4 Exercises
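To make the log-log case from Lesson 4 concrete, here is a minimal sketch, assuming a simple one-regressor model and synthetic data generated as y = 3 * x^0.5 with no noise. The helper `simple_ols` and the data are hypothetical and written in plain Python for illustration; the course exercises use STATA and R. In a log-log model the slope is read as an elasticity: a 1% increase in x is associated with roughly a slope-percent change in y.

```python
import math


def simple_ols(x, y):
    """Return (intercept, slope) from a one-regressor least squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic data with a constant elasticity of 0.5 (an assumption of this sketch).
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * xi ** 0.5 for xi in x]

# Transform both variables with the natural log, then fit a straight line.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]
intercept, slope = simple_ols(log_x, log_y)

print("estimated elasticity:", round(slope, 6))
print("implied scale factor:", round(math.exp(intercept), 6))
```

Because the synthetic relationship is exactly linear in logs, the fit recovers the elasticity used to generate the data; with real data the slope would be an estimate with sampling error.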
Module 3: Classical Assumptions: Detection and Correction
02:13:14
Lecture 15
Lesson 5: Multicollinearity (Part 1)
14:15
Understand the meaning and practical consequences of multicollinearity.
Lecture 16
Lesson 5: Multicollinearity (Part 2)
18:10
Understand how to detect severe multicollinearity in multiple regression models and possible remedies when it is present.
Lecture 17
Lesson 6: Heteroskedasticity (Part 1)
16:37
Understand the meaning and practical consequences of heteroskedasticity.
Lecture 18
Lesson 6: Heteroskedasticity (Part 2)
19:31
Understand how to detect heteroskedasticity in multiple regression models using graphical methods and formal tests.
Lecture 19
Lesson 6: Heteroskedasticity (Part 3)
08:18
Understand and be able to estimate and interpret robust standard errors as a correction strategy in the presence of heteroskedasticity.
Lecture 20
Lesson 7: Serial Correlation (Part 1)
18:38
Understand the meaning and practical consequences of first-order serial correlation in time-series data.
Lecture 21
Lesson 7: Serial Correlation (Part 2)
15:47
Understand how to detect first-order serial correlation using a formal test.
Lecture 22
Lesson 7: Serial Correlation (Part 3)
16:28
Understand and be able to estimate and interpret estimated GLS as a correction strategy in the presence of first-order serial correlation.
Lecture 23
Lesson 7: Serial Correlation (Part 4)
05:30
Understand and be able to estimate and interpret robust standard errors as a correction strategy in the presence of first-order serial correlation.
Module 4: Practical Applications I: Binary Choice Models
01:35:17
Lecture 24
Lesson 8: Binary Dependent Variable Models (Part 1)
27:13
Understand and be able to estimate and interpret a linear probability model (LPM) when modeling a binary dependent variable. Understand the strengths and limitations of LPMs.
Lecture 25
Lesson 8: Binary Dependent Variable Models (Part 2)
29:05
Understand and be able to estimate and interpret a binary logistic regression model, and understand the advantages of the logit model over the LPM. Understand and be able to interpret exponentiated logit coefficient estimates as odds ratios. Understand and be able to estimate and interpret predicted probability changes of a successful outcome (holding all else at specific values).
Lecture 26
Lesson 8: Binary Dependent Variable Models (Part 3)
15:18
Understand and be able to compare and contrast the results of LPMs and logit models.
Lecture 27
Lesson 8: Binary Dependent Variable Models (Part 4)
23:41
Understand best practices in choosing a model specification in binary response models and how to interpret resulting output.
Quiz 8
Lesson 8 Exercises
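The two logit interpretations named in Lesson 8, odds ratios and predicted probability changes, can be sketched as follows. The coefficients `B0`, `B_X`, and `B_D` are hypothetical values chosen for illustration, not estimates from the course data, and plain Python stands in for the STATA and R syntax the course provides.

```python
import math


def logistic(z):
    """Logistic function mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical logit coefficients: intercept, a quantitative regressor x,
# and a binary regressor d. These are assumptions of this sketch.
B0 = -1.0
B_X = 0.05
B_D = 0.8


def predicted_probability(x, d):
    """Predicted probability of a successful outcome at given x and d."""
    return logistic(B0 + B_X * x + B_D * d)


# Odds ratio for d: the exponentiated logit coefficient.
odds_ratio_d = math.exp(B_D)

# Predicted probability change when d moves from 0 to 1, holding x at 20.
p0 = predicted_probability(20.0, 0)
p1 = predicted_probability(20.0, 1)
change = p1 - p0

# The ratio of the two odds reproduces the odds ratio, whatever x is held at.
implied_odds_ratio = (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))

print("odds ratio:", round(odds_ratio_d, 4))
print("probability change at x = 20:", round(change, 4))
```

The odds ratio is the same at every value of x, while the predicted probability change depends on where x is held. That is one practical difference from an LPM, where the coefficient on d is itself the constant probability change.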
Module 5: Practical Applications II: Analyzing Pooled Cross-Sectional and Panel Data
01:52:46
Lecture 28
Lesson 9: Pooled Cross-Sectional Data (Part 1)
18:07
Understand and be able to estimate and interpret coefficient estimates from models using pooled cross-sectional data with two or more time periods.
Lecture 29
Lesson 9: Pooled Cross-Sectional Data (Part 2)
36:21
Understand and be able to estimate and interpret coefficient estimates from Difference-in-Differences (DID) models that use two periods of pooled cross-sectional data. Understand the key assumptions associated with DID models, and when this analysis strategy is appropriate.
Lecture 30
Lesson 10: Panel Data Analysis (Part 1)
13:35
Understand the problem of unobservable heterogeneity and the limitations of pooled OLS as a strategy for analyzing panel data.
Lecture 31
Lesson 10: Panel Data Analysis (Part 2)
13:29
Understand and be able to estimate and interpret first-difference models when you have two periods of panel data. Understand how first-differences can overcome the problems associated with using pooled OLS to analyze panel data (if certain assumptions hold).
Lecture 32
Lesson 10: Panel Data Analysis (Part 3)
19:24
Understand and be able to estimate and interpret deviation-from-means (fixed effects) and least squares dummy variable models when you have panel data. Understand the concept of unit and time fixed effects and how bias may be reduced when estimating fixed effects models (if certain assumptions hold).
Lecture 33
Lesson 10: Panel Data Analysis (Part 4)
11:50
Understand and be able to interpret the results from a case study on the impact of seat belt usage on traffic fatalities at the state level using a fixed effects approach.
Quiz 10
Lesson 10 Exercise
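As an illustration of the first-difference idea from Lesson 10, the sketch below builds a tiny synthetic two-period panel in which each unit has an unobserved fixed effect that is correlated with the regressor. All numbers, and the helper `simple_ols`, are hypothetical and written in plain Python; the course exercises themselves use STATA and R. The true slope is set to 2 and the period-2 shift to 1, so the two estimators can be compared against known values.

```python
def simple_ols(x, y):
    """Return (intercept, slope) from a one-regressor least squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic panel: four units observed in two periods (assumed values).
unit_effect = [10.0, 20.0, 30.0, 40.0]   # unobserved, correlated with x
x_period1 = [1.0, 2.0, 3.0, 4.0]
x_period2 = [2.0, 4.0, 5.0, 8.0]
TRUE_SLOPE = 2.0
PERIOD2_SHIFT = 1.0

y_period1 = [a + TRUE_SLOPE * x for a, x in zip(unit_effect, x_period1)]
y_period2 = [a + TRUE_SLOPE * x + PERIOD2_SHIFT
             for a, x in zip(unit_effect, x_period2)]

# Pooled OLS ignores the unit effects and stacks both periods together.
_, pooled_slope = simple_ols(x_period1 + x_period2, y_period1 + y_period2)

# First differences subtract period 1 from period 2 within each unit,
# which removes anything that is constant within a unit over time.
dx = [x2 - x1 for x1, x2 in zip(x_period1, x_period2)]
dy = [y2 - y1 for y1, y2 in zip(y_period1, y_period2)]
fd_intercept, fd_slope = simple_ols(dx, dy)

print("pooled OLS slope:", round(pooled_slope, 3))
print("first-difference slope:", round(fd_slope, 3))
```

In this constructed example the pooled slope overstates the true effect because the unit effects rise with x, while the first-difference slope recovers it and the intercept picks up the common period shift. With real data this relies on the assumptions the lesson discusses, such as the unobserved heterogeneity being constant over time.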