-
Create a scatter plot of U.S. population vs. year using the U.S. census data from
Population, Housing Units, Area Measurements, and Density: 1790-1990 that you previously assembled. Add a
linear trend line (line of best fit) and extend it 30 years ahead. Note the following:
- Coefficient of determination (r²)
- Linear regression equation
- Population predicted for 2000
- Population predicted for 2010
- Population predicted for 2020
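If you want to check the spreadsheet's trend line by hand, the least-squares line and r² can be computed directly. The following is a minimal Python sketch, not a required part of the activity. The function name `linear_fit` and the sample numbers are illustrative placeholders (they are not census figures), so substitute the table you assembled.

```python
def linear_fit(xs, ys):
    """Least-squares line y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    # r^2 = 1 - (sum of squared residuals) / (total sum of squares)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return m, b, 1 - ss_res / ss_tot

# Made-up placeholder values (NOT census figures): replace with your own table.
years = [1790, 1800, 1810, 1820]
population = [4.0, 5.0, 7.0, 10.0]

m, b, r2 = linear_fit(years, population)
print("slope:", m, "intercept:", b, "r^2:", r2)
for year in (2000, 2010, 2020):
    print(year, m * year + b)  # predicted population from the fitted line
```

The slope and intercept printed here correspond to the linear regression equation y = mx + b that your spreadsheet displays.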
-
Create another similar scatter plot using the same data. Add an
exponential trend line and forecast the line for 30 years ahead. Note the following:
- Coefficient of determination (r²)
- Exponential regression equation
- Population predicted for 2000
- Population predicted for 2010
- Population predicted for 2020
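One common way to fit an exponential trend line is to fit a straight line to the natural logarithm of the population and then convert back. The Python sketch below takes that approach. It is an assumption that your spreadsheet does the same, so small differences in r² are possible. The name `exponential_fit`, the choice of measuring time as years since 1790, and the sample numbers are all illustrative (the numbers are not census figures).

```python
import math

def exponential_fit(ts, ys):
    """Fit y = a * exp(k * t) by a least-squares line through (t, ln y).

    Returns (a, k, r_squared), with r_squared measured on ln(y).
    """
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mean_t = sum(ts) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(ts, logs))
    k = stl / stt
    ln_a = mean_l - k * mean_t
    ss_res = sum((l - (ln_a + k * t)) ** 2 for t, l in zip(ts, logs))
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    return math.exp(ln_a), k, 1 - ss_res / ss_tot

# Made-up placeholder values (NOT census figures); t = years since 1790.
t_values = [0, 10, 20, 30]
population = [4.0, 5.0, 7.0, 10.0]

a, k, r2 = exponential_fit(t_values, population)
print("a:", a, "k:", k, "r^2 (on ln y):", r2)
for year in (2000, 2010, 2020):
    print(year, a * math.exp(k * (year - 1790)))  # predicted population
```

Measuring time from 1790 rather than from year 0 keeps the coefficient `a` at a readable size; it does not change the predictions.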
-
Create another similar scatter plot using the same data. Add a
2nd degree polynomial trend line (quadratic) and forecast the line for 30 years ahead. Note the following:
- Coefficient of determination (r²)
- Quadratic regression equation
- Population predicted for 2000
- Population predicted for 2010
- Population predicted for 2020
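A quadratic trend line can also be checked by hand by solving the three least-squares "normal equations" for the coefficients. The Python sketch below does this with Cramer's rule. The names `det3` and `quadratic_fit`, the choice of measuring time as years since 1800, and the sample numbers are illustrative assumptions (the numbers are not census figures).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def quadratic_fit(ts, ys):
    """Least-squares fit of y = a*t**2 + b*t + c; returns (a, b, c, r_squared)."""
    s = [sum(t ** k for t in ts) for k in range(5)]                  # sums of t^k
    r = [sum(t ** k * y for t, y in zip(ts, ys)) for k in range(3)]  # sums of t^k * y
    normal = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [r[2], r[1], r[0]]
    d = det3(normal)
    coeffs = []
    for col in range(3):  # Cramer's rule: swap one column for the right-hand side
        m = [row[:] for row in normal]
        for i in range(3):
            m[i][col] = rhs[i]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * t * t + b * t + c)) ** 2 for t, y in zip(ts, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, c, 1 - ss_res / ss_tot

# Made-up placeholder values (NOT census figures); t = years since 1800.
t_values = [-10, 0, 10, 20]
population = [4.0, 5.0, 7.0, 10.0]

a, b, c, r2 = quadratic_fit(t_values, population)
print("a:", a, "b:", b, "c:", c, "r^2:", r2)
for year in (2000, 2010, 2020):
    t = year - 1800
    print(year, a * t * t + b * t + c)  # predicted population
```

Shifting the years so that t is small keeps the sums of t⁴ manageable and reduces rounding error; your spreadsheet's equation in terms of the calendar year describes the same curve.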
-
Based on the coefficients of determination (r²), which function best models population growth during this period?
-
Check your population predictions against the
IDB Summary Demographic Data for the U.S. for the years 2000, 2010, and 2020. Which function most accurately predicts the populations for these years?
-
Using either the equation or graphical interpretation for the function you think best models population growth during this period, predict the population for today's date.
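If you use the equation, today's date has to be expressed as a decimal year before you substitute it for x. A small Python sketch of one way to do that follows; the helper name `decimal_year` is hypothetical, and treating each day as an equal fraction of the year is a simplifying assumption.

```python
from datetime import date

def decimal_year(d):
    """Convert a date to a fractional year (days elapsed / days in that year)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days

x_today = decimal_year(date.today())
print(x_today)  # substitute this value for the year in your regression equation
```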
-
Check your prediction for today's population against the
U.S. Population Clock. How accurate was your prediction?
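One simple way to put a number on "how accurate" is percent error. The Python sketch below is only a suggestion; the function name `percent_error` and the example numbers are made up for illustration.

```python
def percent_error(predicted, actual):
    """Size of the prediction error as a percentage of the actual value."""
    return 100 * abs(predicted - actual) / abs(actual)

# Made-up numbers for illustration only (NOT real population figures).
print(percent_error(95.0, 100.0))
```

The same calculation works for comparing your 2000, 2010, and 2020 predictions with the published figures.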
-
Optional: Plot the residuals for the linear, exponential, and quadratic models you created. Residuals (the differences between the observed data and the model's predictions) help you judge whether a model fits the data well. A scatter plot whose residuals stay uniformly close to the x-axis, with no obvious pattern, indicates a good fit.
Residual = observed value - predicted value
For example, for the year 1900 the residual is the population given by the U.S. Census Bureau's data minus the population your model predicts for that year. For each of the three models, find the residual for each decade's population and then plot the year on the x-axis and the corresponding residual on the y-axis. You will have three scatter plots. The plot whose residuals lie closest to the horizontal line y = 0 probably belongs to the best model for the data.
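The residual calculation described above can be sketched in a few lines of Python. The function name `residuals`, the sample values, and the example line are all hypothetical (the values are not census figures); pass in your own data and whichever fitted equation you are testing.

```python
def residuals(xs, ys, model):
    """Return observed minus predicted for each (x, y) pair."""
    return [y - model(x) for x, y in zip(xs, ys)]

# Made-up values and a hypothetical fitted line, for illustration only.
years = [1790, 1800, 1810, 1820]
population = [4.0, 5.0, 7.0, 10.0]

def example_line(year):
    return 0.2 * (year - 1790) + 3.5

for year, r in zip(years, residuals(years, population, example_line)):
    print(year, round(r, 3))  # plot these pairs: year on x, residual on y
```

Running the same function once per model gives you the three sets of points for the three residual plots.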
-
Optional: Look at your scatter plot of the population data again. Could two or more exponential models be combined to model the population growth accurately? Try plotting the residuals for different periods of time (e.g., 1790-1880 and 1880-1990) to see whether a combination of different exponential models gives a good fit for this data.
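One way to try this is to fit a separate exponential to each period and collect the residuals from whichever model covers each year. The Python sketch below assumes the exponentials are fitted by a least-squares line through the logarithm of the data; the names `exp_fit` and `piecewise_residuals`, the split point, and the sample values are illustrative (the values are not census figures).

```python
import math

def exp_fit(ts, ys):
    """Fit y = a * exp(k * t) via a least-squares line through (t, ln y)."""
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mean_t = sum(ts) / n
    mean_l = sum(logs) / n
    k = (sum((t - mean_t) * (l - mean_l) for t, l in zip(ts, logs))
         / sum((t - mean_t) ** 2 for t in ts))
    return math.exp(mean_l - k * mean_t), k

def piecewise_residuals(ts, ys, split):
    """Fit one exponential to t <= split and another to t >= split.

    The split point is used in both fits; its residual is taken
    from the earlier model.
    """
    early = [(t, y) for t, y in zip(ts, ys) if t <= split]
    late = [(t, y) for t, y in zip(ts, ys) if t >= split]
    a1, k1 = exp_fit(*zip(*early))
    a2, k2 = exp_fit(*zip(*late))
    out = []
    for t, y in zip(ts, ys):
        a, k = (a1, k1) if t <= split else (a2, k2)
        out.append(y - a * math.exp(k * t))
    return out

# Made-up placeholder values (NOT census figures); t = decades since the start.
t_values = [0, 1, 2, 3, 4, 5]
population = [4.0, 5.5, 7.5, 10.0, 11.5, 13.5]

for t, r in zip(t_values, piecewise_residuals(t_values, population, split=3)):
    print(t, round(r, 3))
```

If the combined residuals sit closer to zero than those of the single exponential model, that suggests the growth rate changed between the two periods.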