Linear Regression Analysis of US Census Data

Project Description


The goal is to predict a person's income from their age, gender, class of worker, and education level. We validated the model with numerical measures of fit, in particular R-squared and RMSE. The variables PINCP (income), AGEP (age), SEX (gender), and SCHL (educational attainment) were selected from the PUMS dataset. The beta coefficient for age is 0.012, which means that for each additional year of age the estimated income increases by a factor of 10^0.012 = 1.028, or about 2.8%. The beta coefficient for a bachelor's degree is 0.39, which means that people with a bachelor's degree are estimated to make 10^0.39 = 2.45 times as much as those who did not complete high school. These factors are powers of 10 because the coefficients are expressed on a base-10 logarithmic scale of income.
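The arithmetic behind these interpretations, and the two validation metrics, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `multiplicative_factor`, `r_squared`, and `rmse` are hypothetical and not taken from the project file, and the sketch assumes the response was the base-10 logarithm of income, as the powers of 10 above suggest.

```python
import math


def multiplicative_factor(beta, base=10.0):
    """Factor applied to predicted income for a one-unit increase in a
    predictor, assuming the model was fit to log-base-`base` income."""
    return base ** beta


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


# Coefficients quoted in the project description.
age_factor = multiplicative_factor(0.012)
bachelor_factor = multiplicative_factor(0.39)

print(round(age_factor, 3))       # 1.028
print(round(bachelor_factor, 2))  # 2.45
```

If the model is fit on the log scale, `r_squared` and `rmse` would typically be computed on log10 income as well, so an RMSE from this sketch would be in log10 units rather than dollars.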


Public Use Microdata Sample Data

Data Description

Dataset PUMS

PUMS stands for Public Use Microdata Sample, a dataset drawn from the American Community Survey (ACS), an ongoing survey that provides information on a yearly basis about the United States and its people. It contains detailed population and housing information such as class of worker, education level, gender, and income.