
Cross-section data from the High School and Beyond survey conducted by the Department of Education in 1980, with a follow-up in 1986. The survey included students from approximately 1,100 high schools.

Usage

data("CollegeDistance")

Format

A data frame containing 4,739 observations on 14 variables.

gender

factor indicating gender.

ethnicity

factor indicating ethnicity (African-American, Hispanic or other).

score

base year composite test score. These are achievement tests given to high school seniors in the sample.

fcollege

factor. Is the father a college graduate?

mcollege

factor. Is the mother a college graduate?

home

factor. Does the family own their home?

urban

factor. Is the school in an urban area?

unemp

county unemployment rate in 1980.

wage

state hourly wage in manufacturing in 1980.

distance

distance from a 4-year college (in units of 10 miles).

tuition

average state 4-year college tuition (in thousands of USD).

education

number of years of education.

income

factor. Is the family income above USD 25,000 per year?

region

factor indicating region (West or other).
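
The scaled units of distance and tuition are easy to misread, so a minimal sketch of converting them to miles and USD may help. The data frame toy and the columns distance_miles and tuition_usd are hypothetical names used only for illustration, and the values are made up rather than taken from the data.

```r
## Toy rows with the same column names and units as CollegeDistance
## (made-up values, for illustration only).
toy <- data.frame(
  distance = c(0.2, 1.0, 2.5),   # in units of 10 miles
  tuition  = c(0.45, 0.90, 1.15) # in thousands of USD
)

## Rescale to miles and USD; the new column names are hypothetical.
toy <- transform(
  toy,
  distance_miles = distance * 10,
  tuition_usd    = tuition * 1000
)
toy
```

The same transform() call should apply unchanged to CollegeDistance itself, since it only relies on the distance and tuition columns.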

Details

Rouse (1995) computed years of education by assigning 12 years to all members of the senior class. Each additional year of secondary education counted as one year. Students with vocational degrees were assigned 13 years, those with AA degrees 14 years, those with BA degrees 16 years, those with some graduate education 17 years, and those with a graduate degree 18 years.
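
The degree-based part of this coding can be sketched as a simple lookup. This is only an illustration of the assignment rule described above, not Rouse's actual procedure: the attainment labels and the object names years_by_attainment, attainment and education_years are hypothetical, and the sketch does not cover the year-by-year counting for students without a degree.

```r
## Years of education assigned to each attainment level
## (labels are hypothetical; the years follow the rule in the text).
years_by_attainment <- c(
  high_school   = 12,
  vocational    = 13,
  associate     = 14,
  bachelor      = 16,
  some_graduate = 17,
  graduate      = 18
)

## Made-up attainment records for four students.
attainment <- c("high_school", "associate", "bachelor", "graduate")

## Look up the assigned years by name and drop the names.
education_years <- unname(years_by_attainment[attainment])
education_years
```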

Stock and Watson (2007) provide separate data files for the students from Western states and for the remaining students. CollegeDistance includes both data sets; subsets are easily obtained (see also the examples).

Source

Online complements to Stock and Watson (2007).

References

Rouse, C.E. (1995). Democratization or Diversion? The Effect of Community Colleges on Educational Attainment. Journal of Business & Economic Statistics, 13, 217–224.

Stock, J.H. and Watson, M.W. (2007). Introduction to Econometrics, 2nd ed. Boston: Addison Wesley.

Examples

## exclude students from Western states
data("CollegeDistance")
cd <- subset(CollegeDistance, region != "west")
summary(cd)
#>     gender        ethnicity        score       fcollege   mcollege    home     
#>  male  :1726   other   :2496   Min.   :28.95   no :3029   no :3267   no : 686  
#>  female:2070   afam    : 731   1st Qu.:43.87   yes: 767   yes: 529   yes:3110  
#>                hispanic: 569   Median :51.39                                   
#>                                Mean   :51.00                                   
#>                                3rd Qu.:57.96                                   
#>                                Max.   :71.36                                   
#>  urban          unemp             wage           distance         tuition      
#>  no :2870   Min.   : 1.400   Min.   : 6.590   Min.   : 0.000   Min.   :0.4342  
#>  yes: 926   1st Qu.: 5.600   1st Qu.: 8.260   1st Qu.: 0.400   1st Qu.:0.6732  
#>             Median : 7.200   Median : 9.920   Median : 1.000   Median :0.9030  
#>             Mean   : 7.655   Mean   : 9.556   Mean   : 1.725   Mean   :0.9131  
#>             3rd Qu.: 9.100   3rd Qu.:10.280   3rd Qu.: 2.500   3rd Qu.:1.1524  
#>             Max.   :24.900   Max.   :12.150   Max.   :16.000   Max.   :1.4042  
#>    education      income       region    
#>  Min.   :12.00   low :2709   other:3796  
#>  1st Qu.:12.00   high:1087   west :   0  
#>  Median :13.00                           
#>  Mean   :13.83                           
#>  3rd Qu.:16.00                           
#>  Max.   :18.00
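
The summary above still lists the empty level west for region, because subset() keeps unused factor levels. As a hedged sketch, the complementary subset of students from Western states can be extracted in the same way, with droplevels() removing the unused level. The object name west is arbitrary, and the code assumes the data set is available from the AER package.

```r
## keep only students from Western states and drop unused factor levels
data("CollegeDistance", package = "AER")
west <- droplevels(subset(CollegeDistance, region == "west"))
nrow(west)
levels(west$region)
```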