Chapter 8 Projects

This chapter contains several empirical projects.

8.1 Opportunity Atlas

In this empirical project you will have a close look at what is commonly known as the American Dream - the idea that in a society with few barriers, everybody can achieve upward social mobility and be better off than their parents, if only they work hard enough.13

The Opportunity Atlas is a website maintained by the US Census Bureau and fed with data from recent high-quality research on upward social mobility at the Census Tract level. Census Tracts are geographic areas which contain on average less than 4000 residents, and which cover the entire United States. The Atlas has been widely used in recent newspaper reporting, and you should start this project by reading the corresponding piece from the New York Times Upshot series.

This project focuses on developing your descriptive data analysis skills. In particular you will

  1. use mapping tools to visualize geospatial data, and
  2. compute simple descriptive statistics and report regression results to shed light on the relationship between two or more random variables.

In order to achieve those goals, you will interact with the Atlas website at https://www.opportunityatlas.org, and you will use the raw data behind the website to compute additional statistics.

8.1.1 Instructions

  1. Your submission will be a statistical report addressing the below questions, produced from a single R markdown file (ending in .Rmd). You can create a simple Rmd template in Rstudio by clicking top left on the “new file” symbol, then selecting R Markdown. A quick but effective guide on using Markdown is available here.
  2. Please submit two files: the .Rmd source file (which creates the report), and the report itself, in either HTML or PDF format (Rstudio dropdown menu knit)
  3. You can do this exercise in teams of up to 3 people: make sure to edit the author field in the Rmd correspondingly.
  4. Submit by sending a private message with the Rmd as attachment on slack.

8.1.2 Questions

  1. Referring to the NYT article:

    1. Why does the Seattle Housing Authority give away vouchers to pay for rent in the area between 100th and 115th Streets, east of Meridian, west of 35th Avenue?

    2. The data here shows how easy it is for children of poor parents to escape poverty themselves. In explaining what makes a good neighborhood, how much of the variation we see is actually explained by things like school boundary lines and poverty levels alone? In other words, how important is the school you go to in isoluation of other factors, if you want to escape the low rank of your parents in the income distribution?

    3. Some select census tracts have received up to 500 million USD since 1990 in place-based neighborhood improvement policies. Do we know if and to what extent those investments were efficacious?

  2. Go to https://www.opportunityatlas.org and look at the census tracts around where your home is, if you grew up in the United States. If you grew up outside the US, randomly choose one of the State Capitals from this list and select any census tract at random by zooming in. Make sure to immediately post your choice of city in the corresponding slack channel (e.g. “florian-atlas” for my group) to avoid having multiple reports on the same city. Let’s call the chosen census tract your home.

    1. Create Figure 1 in your report, which should display the map shown by https://www.opportunityatlas.org for the census tracts around your home (you can download that map as an image bottom left). Notice that you can include a figure in an .Rmd like follows. Remember that all R-code chunks in your .Rmd are inside a block delimited by tripple backticks ```.

      ```{r your-chunk-label,echo = FALSE, out.width = "75%"}
      # your-chunk-label: name you give that code chunk (optional)
      # echo = TRUE/FALSE: display the code in output?
      # out.width = "75%": scale image to 75% of page width
      # all options: https://yihui.name/knitr/options/
      knitr::include_graphics("path_to_your_grahpic.png") 
      ```
    2. Your accompanying text should describe what is shown in that map, in particular what data are being visualized. Examine next the patterns for a number of different groups (e.g., lowest income children, high income children) and outcomes (e.g., earnings in adulthood, incarceration rates). Only choose one or two of these to include in your report.

  3. To answer the next question, read the Opportunity Atlas Manual: What period do the data you are analyzing come from? Are you concerned that the neighborhoods you are studying may have changed for kids now growing up there? What evidence do the authors of the manual provide suggesting that such changes are or are not important? What type of data could you use to test whether your neighborhood has changed in recent years?

  4. Now turn to the atlas.Rds data set, which you can load as shown below. How does average upward mobility, pooling races and genders, for children with parents at the 25th percentile (kfr_pooled_p25) in your home Census tract compare to mean (population-weighted, using count_pooled) upward mobility in your state and in the U.S. overall? Do kids where you grew up have better or worse chances of climbing the income ladder than the average child in America? Hint: The Opportunity Atlas website will give you the tract, county, and state FIPS codes for your home address. For example, searching for “Lynwood Road, Verona, New Jersey” will display Tract 34013021000, Verona, NJ. The first two digits refer to the state code, the next three digits refer to the county code, and the last 6 digits refer to the tract code. In R, listing this observation can be done as follows:

    # load data
    d = readRDS(system.file(package = "ScPoEconometrics","datasets","atlas.Rds"))
    # subset data.frame
    subset(d, select = kfr_pooled_p25, subset = state == 24 & county == 003 & tract == 706500)
  5. What is the standard deviation of upward mobility (population-weighted) in your home county, and what does this number tell you? Is it larger or smaller than the standard deviation across tracts in your state (i.e. compare your county to all other counties in your state)? Across tracts in the entire country? What do you learn from these comparisons? Notice that you can compute a weighted standard deviation for vector x using weights weight_variable in R like this

    # you need the Hmisc package installed. if not:
    install.packages("Hmisc")
    sqrt(Hmisc::wtd.var(x, weights = weight_variable))
  6. Now let’s turn to downward mobility: repeat questions 3. and 4. looking at children who start with parents at the 75th and 100th percentiles. How do the patterns differ?

  7. Using a linear regression, estimate the relationship between outcomes of children at the 25th and 75th percentile for the Census tracts in your home county. Generate a scatter plot to visualize this regression. Do areas where children from low-income families do well generally have better outcomes for those from high-income families, too?

    # explain lm
    # explain scatter plot
  8. Next, examine whether the patterns you have looked at above are similar by race. If there is not enough racial heterogeneity in the area of interest (i.e., data is missing for most racial groups), then choose a different area to examine.

  9. Using the Census tracts in your home county, can you identify any covariates which help explain some of the patterns you have identified above? Some examples of covariates you might examine include housing prices, income inequality, fraction of children with single parents, job density, etc. For 2 or 3 of these, report estimated correlation coefficients along with their 95% confidence intervals.

  10. Open question: formulate a hypothesis for why you see the variation in upward mobility for children who grew up in the Census tracts near your home and provide correlational evidence testing that hypothesis. For this question, many covariates have been provided to you in the atlas.Rds file, which are described in subsection 8.1.3.

8.1.3 Detailed Data Description

The data consist of \(n = 73,278\) U.S. Census tracts. For more details on the construction of the variables included in this data set, please see Chetty, Raj, John Friedman, Nathaniel Hendren, Maggie R. Jones, and Sonya R. Porter. 2018. “The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility.” NBER Working Paper No. 25147

Variable name Label Obs.
tract Tract FIPS Code (6-digit) 2010 73,278
county County FIPS Code (3-digit) 73,278
state State FIPS Code (2-digit) 73,278
cz Commuting Zone Identifier (1990 Definition) 72,473
hhinc_mean2000 Mean Household Income 2000 72,302
mean_commutetime2000 Average Commute Time of Working Adults in 2000 72,313
frac_coll_plus2010 Fraction of Residents with a College Degree or More in 2010 72,993
foreign_share2010 Share of Population Born Outside the U.S. 72,279
med_hhinc2016 Median Household Income in 2016 72,763
med_hhinc1990 Median Household Income in 1999 72,313
popdensity2000 Population Density (per square mile) in 2000 72,469
poor_share2010 Poverty Rate 2010 72,933
poor_share2000 Poverty Rate 2000 72,315
poor_share1990 Poverty Rate 1990 72,323
share_black2010 Share black 2010 73,111
share_hisp2010 Share Hispanic 2010 73,111
share_asian2010 Share Asian 2010 71,945
share_black2000 Share black 2000 72,368
share_white2000 Share white 2000 72,368
share_hisp2000 Share Hispanic 2000 72,368
share_asian2000 Share Asian 2000 71,050
gsmn_math_g3_2013 Average School District Level Standardized Test Scores in 3rd Grade in 2013 72,090
rent_twobed2015 Average Rent for Two-Bedroom Apartment in 2015 56,607
singleparent_share2010 Share of Single-Headed Households with Children 2010 72,564
singleparent_share1990 Share of Single-Headed Households with Children 1990 72,196
singleparent_share2000 Share of Single-Headed Households with Children 2000 72,285
traveltime15_2010 Share of Working Adults w/ Commute Time of 15 Minutes Or Less in 2010 72,939
emp2000 Employment Rate 2000 72,344
mail_return_rate2010 Census Form Rate Return Rate 2010 72,547
ln_wage_growth_hs_grad Log wage growth for HS Grad., 2005-2014 51,635
jobs_total_5mi_2015 Number of Primary Jobs within 5 Miles in 2015 72,311
jobs_highpay_5mi_2015 Number of High-Paying (>USD40,000 annually) Jobs within 5 Miles in 2015 72,311
nonwhite_share2010 Share of People who are not white 2010 73,111
popdensity2010 Population Density (per square mile) in 2010 73,194
ann_avg_job_growth_2004_2013 Average Annual Job Growth Rate 2004-2013 70,664
job_density_2013 Job Density (in square miles) in 2013 72,463
kfr_pooled_p25 Household income ($) at age 31-37 for children with parents at the 25th percentile of the national income distribution 72,011
kfr_pooled_p75 Household income ($) at age 31-37 for children with parents at the 75th percentile of the national income distribution 72,012
kfr_pooled_p100 Household income ($) at age 31-37 for children with parents at the 100th percentile of the national income distribution 71,968
kfr_natam_p25 Household income ($) at age 31-37 for Native American children with parents at the 25th percentile of the national income distribution 1,733
kfr_natam_p75 Household income ($) at age 31-37 for Native American children with parents at the 75th percentile of the national income distribution 1,728
kfr_natam_p100 Household income ($) at age 31-37 for Native American children with parents at the 100th percentile of the national income distribution 1,594
kfr_asian_p25 Household income ($) at age 31-37 for Asian children with parents at the 25th percentile of the national income distribution 15,434
kfr_asian_p75 Household income ($) at age 31-37 for Asian children with parents at the 75th percentile of the national income distribution 15,360
kfr_asian_p100 Household income ($) at age 31-37 for Asian children with parents at the 100th percentile of the national income distribution 13,480
kfr_black_p25 Household income ($) at age 31-37 for Black children with parents at the 25th percentile of the national income distribution 34,086
kfr_black_p75 Household income ($) at age 31-37 for Black children with parents at the 75th percentile of the national income distribution 34,049
kfr_black_p100 Household income ($) at age 31-37 for Black children with parents at the 100th percentile of the national income distribution 32,536
kfr_hisp_p25 Household income ($) at age 31-37 for Hispanic children with parents at the 25th percentile of the national income distribution 37,611
kfr_hisp_p75 Household income ($) at age 31-37 for Hispanic children with parents at the 75th percentile of the national income distribution 37,579
kfr_hisp_p100 Household income ($) at age 31-37 for Hispanic children with parents at the 100th percentile of the national income distribution 35,987
kfr_white_p25 Household income ($) at age 31-37 for white children with parents at the 25th percentile of the national income distribution 67,978
kfr_white_p75 Household income ($) at age 31-37 for white children with parents at the 75th percentile of the national income distribution 67,968
kfr_white_p100 Household income ($) at age 31-37 for white children with parents at the 100th percentile of the national income distribution 67,627
count_pooled Count of all children 72,451
count_white Count of White children 72,451
count_black Count of Black children 72,451
count_asian Count of Asian children 72,451
count_hisp Count of Hispanic children 72,451
count_natam Count of Native American children 72,451

8.2 Trade Exercise