Are You Still Uninsured?

CDC 2021 Team - Rolling

Universal health insurance coverage is important. Health insurance gives people financial protection against high health expenses and improves their access to health services. However, a substantial share of the U.S. population is still uninsured. In our project, we analyze state-level data on the uninsured population to better understand this issue and its related factors, with the aim of giving policymakers insight into improving health insurance coverage.

Model Building Process
1. Group Variables Into the Following Groups: Income, Age, Gender, Marriage Status, Child in Family, Ethnicity, SNAP Recipient, Disability, Full Time Worker in Family, Employment, Education, and Spoken Language
2. Run Linear Regression Within Groups - Select Independent Variables
3. Run Linear Regression Within Groups - Select Independent Variable
4. Run Linear Regression Between Every Potential Variable and the Uninsured Percentage as a Reference for Dropping Variables
To eliminate highly collinear variables, we set a threshold of 0.7 and filter out the ones exceeding it. Between two collinear variables, we use the R² value from the linear regression between each variable and the Uninsured Percentage to judge how significant that variable is, and we drop the one with the lower R².
5. After Dropping the Variables Less Related to the Uninsured Percentage, We Determine Variables to Use in the Model. Specifically, We Drop Child in Family and Full Time Worker in Family
6. Determine the Correlations Between the Uninsured Population and the Selected Variables
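The collinearity filter described in step 4 can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: it assumes the 0.7 threshold applies to the absolute Pearson correlation between two variables, and the variable names and values are hypothetical toy data. It relies on the fact that, for a one-variable linear regression, R² equals the squared correlation with the target.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def prune_collinear(features, target, threshold=0.7):
    """Drop one variable from every pair whose |correlation| exceeds threshold.

    features: dict mapping variable name -> list of values (one per state)
    target:   list of uninsured percentages in the same state order
    """
    # R^2 of each single-variable regression against the target.
    r2 = {name: pearson(vals, target) ** 2 for name, vals in features.items()}
    kept = set(features)
    for a, b in combinations(features, 2):
        if a not in kept or b not in kept:
            continue  # one of the pair was already dropped
        if abs(pearson(features[a], features[b])) > threshold:
            # Keep the variable that explains the target better.
            kept.discard(a if r2[a] < r2[b] else b)
    return sorted(kept), r2


# Hypothetical toy data for five states (not from the project dataset).
features = {
    "low_income": [10, 12, 15, 20, 25],
    "snap": [5, 6, 8, 10, 13],
    "age_65_plus": [14, 18, 13, 17, 15],
}
target = [6, 7, 9, 12, 14]

kept, r2 = prune_collinear(features, target)
print(kept)
```

In this toy data, low_income and snap are strongly correlated with each other, so only the one with the higher R² against the target survives.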
[Figures: Linear Regression plots for Income, Age, Gender, Married, Origin, SNAP Recipient, Disability, Employment, Education, and Spoken Language]
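Each of the regressions listed above pairs a single variable with the uninsured percentage. A minimal sketch of one such fit, using ordinary least squares written out by hand, might look like this. The function name and the numbers are hypothetical and are not taken from the project data.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r2).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical example: percent with a college degree vs. percent uninsured.
education_pct = [10, 20, 30, 40]
uninsured_pct = [12, 9, 8, 5]

slope, intercept, r2 = fit_line(education_pct, uninsured_pct)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```

A negative slope here would indicate that states with a higher value of the variable tend to have a lower uninsured percentage, and R² indicates how much of the state-to-state variation the single variable accounts for.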