Lesson 5: Describing Trends in Scatter Plots

Learning Goal

Let’s look for associations between variables.

Learning Targets

  • I can draw a line to fit data in a scatter plot.

  • I can say whether data in a scatter plot has a positive or negative association (or neither).

Lesson Terms

  • negative association
  • outlier
  • positive association

Warm Up: Which One Doesn’t Belong: Scatter Plots

Problem 1

Which one doesn’t belong?

  1. A scatter plot with a horizontal axis of 0 to 12 and a vertical axis of 0 to 80. A line of best fit goes up and to the right.
  2. A scatter plot with a horizontal axis of 0 to 12 and a vertical axis of -50 to 50. A line of best fit goes down and to the right.
  3. A scatter plot with a horizontal axis of 0 to 12 and a vertical axis of 0 to 100. A line of best fit goes down and to the right.
  4. A scatter plot with a horizontal axis of 0 to 12 and a vertical axis of 0 to 80. There is no line of best fit, and the dots do not show a strong correlation.

Activity 1: Fitting Lines

Problem 1

Experiment with finding lines to fit the data. Drag the points to move the line. You can close the expressions list by clicking on the double arrow.

  1. Here is a scatter plot. Experiment with different lines to fit the data. Pick the line that you think best fits the data. Compare it with a partner’s.

  2. Here is a different scatter plot. Experiment with drawing lines to fit the data. Pick the line that you think best fits the data. Compare it with a partner’s.

  3. In your own words, describe what makes a line fit a data set well.

Print Version

Your teacher will give you a piece of pasta and a straightedge.

  1. Here are two copies of the same scatter plot. Experiment with drawing lines to fit the data. Pick the line that you think best fits the data. Compare it with a partner’s.

    A scatter plot of 20 data points, shown twice. The horizontal axis runs from 0 to 12 and is labeled in increments of 3. The vertical axis runs from 0 to 40 and is labeled in increments of 10. The points trend linearly downward and to the right. Approximate coordinates: (1, 38), (1.5, 35), (2.3, 33), (2.5, 31.5), (3.2, 32), (4, 31), (4.4, 27), (4.5, 25), (5.8, 26), (6, 26), (6, 22), (6.3, 18), (7.5, 14), (7.6, 21), (8.5, 17), (9, 14), (9.3, 8), (9.5, 12), (10.4, 4), (10.5, 7).
  2. Here are two copies of another scatter plot. Experiment with drawing lines to fit the data. Pick the line that you think best fits the data. Compare it with a partner’s.

    A scatter plot of 20 data points, shown twice. The horizontal axis runs from 0 to 12 and is labeled in increments of 3. The vertical axis runs from 0 to 40 and is labeled in increments of 10. The points trend linearly upward and to the right. Approximate coordinates: (1.5, 1), (2, 5), (3, 1), (3, 7), (3, 8), (4, 4), (4.4, 7.5), (4.6, 10), (5, 2), (5.5, 13), (6.2, 22), (7.4, 18), (7.5, 11), (7.5, 9), (8.8, 29), (8.9, 13), (9.1, 33), (9.7, 28), (10, 16), (10, 17).
  3. In your own words, describe what makes a line fit a data set well.
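One possible way to make "fits well" concrete is to measure how far the points sit from a line. The sketch below uses the approximate coordinates of the first scatter plot and compares two candidate lines by their average vertical distance to the points. The two lines and the function name `mean_vertical_distance` are assumptions made for illustration; they are not part of the lesson.

```python
# Approximate points read from the first scatter plot in this activity.
points = [
    (1, 38), (1.5, 35), (2.3, 33), (2.5, 31.5), (3.2, 32),
    (4, 31), (4.4, 27), (4.5, 25), (5.8, 26), (6, 26),
    (6, 22), (6.3, 18), (7.5, 14), (7.6, 21), (8.5, 17),
    (9, 14), (9.3, 8), (9.5, 12), (10.4, 4), (10.5, 7),
]


def mean_vertical_distance(points, slope, intercept):
    """Average vertical gap between the points and the line y = slope * x + intercept."""
    gaps = [abs(y - (slope * x + intercept)) for x, y in points]
    return sum(gaps) / len(gaps)


# Two hypothetical hand-drawn lines, written as (slope, intercept).
line_a = (-3.5, 42)
line_b = (-2.0, 40)

for name, (slope, intercept) in [("A", line_a), ("B", line_b)]:
    distance = mean_vertical_distance(points, slope, intercept)
    print(f"Line {name}: average vertical distance = {distance:.2f}")
```

A smaller average suggests a line that stays closer to the points. This measure alone does not show whether the points are balanced above and below the line, which is also worth checking by eye.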

Activity 2: Good Fit Bad Fit

The scatter plots both show the year and price for the same 17 used cars. However, each scatter plot shows a different model for the relationship between year and price.

Diagram A shows a line that passes through or close to most of the points. Diagram B shows a line that lies above most of the points.

Problem 1

Look at Diagram A.

  1. For how many cars does the model in Diagram A make a good prediction of its price?

  2. For how many cars does the model underestimate the price?

  3. For how many cars does it overestimate the price?

Problem 2

Look at Diagram B.

  1. For how many cars does the model in Diagram B make a good prediction of its price?

  2. For how many cars does the model underestimate the price?

  3. For how many cars does it overestimate the price?

Problem 3

For how many cars does the prediction made by the model in Diagram A differ from the actual price by more than $3,000? What about the model in Diagram B?

Problem 4

Which model does a better job of predicting the price of a used car from its year?
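The counting in Problems 1 through 3 can also be sketched in code. A model underestimates when its predicted price is below the actual price, and overestimates when its prediction is above. In the sketch below, the five cars, the model, and the $1,000 cutoff for a "good" prediction are all invented for illustration. The lesson's 17 cars appear only in the diagrams, and the lesson does not define a cutoff.

```python
def count_predictions(cars, predict, tolerance):
    """Count good predictions, underestimates, and overestimates for a model."""
    counts = {"good": 0, "underestimate": 0, "overestimate": 0}
    for year, actual_price in cars:
        error = predict(year) - actual_price  # positive means the model is too high
        if abs(error) <= tolerance:
            counts["good"] += 1
        elif error < 0:
            counts["underestimate"] += 1
        else:
            counts["overestimate"] += 1
    return counts


# Hypothetical (year, price in dollars) pairs, not the cars from the lesson.
cars = [(2005, 6000), (2008, 9500), (2010, 17000), (2012, 13000), (2014, 20000)]


def example_model(year):
    """A made-up linear model of price in terms of year."""
    return 1500 * (year - 2000) - 1000


counts = count_predictions(cars, example_model, tolerance=1000)
print(counts)
```

Changing `tolerance` to 3000 gives a count in the spirit of Problem 3: the number of cars not counted as "good" is then the number whose prediction is off by more than $3,000.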

Activity 3: Practice Fitting Lines

Problem 1

Is this line a good fit for the data? Explain your reasoning.

A scatter plot with a line of best fit that has 10 points on or close to the line and 10 points significantly below the line.

Problem 2

Draw a line that fits the data better.

The same graph as in Problem 1, but without the line of best fit.

Problem 3

Is this line a good fit for the data? Explain your reasoning.

A scatter plot with a line of best fit. The data is in 3 groups. The line is below the first group, touches or sits slightly above the second group, and is above the third group.

Problem 4

Draw a line that fits the data better.

The same graph as in Problem 3, but without the line of best fit.

Are you ready for more?

Problem 1

These scatter plots were created by multiplying the x-coordinate by 3 and then adding a random number between two values to get the y-coordinate. The first scatter plot added a random number between -0.5 and 0.5 to the y-coordinate. The second scatter plot added a random number between -2 and 2 to the y-coordinate. The third scatter plot added a random number between -10 and 10 to the y-coordinate.

  1. For each scatter plot, draw a line that fits the data.

    A scatter plot with data showing a positive relationship. The data is close together in a straight line.
    A scatter plot with data without any association.
    A scatter plot with data more random but still with a positive relationship.
  2. Explain why some were easier to do than others.
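Data like this can be generated with a short program. The sketch below follows the recipe described in this problem: multiply x by 3, then add a random number within a chosen range. The number of points, the range of x-values (0 to 10), the fixed seed, and the function name `make_points` are assumptions, since the lesson does not state them.

```python
import random


def make_points(spread, n=20, seed=0):
    """Return n points where y is 3 * x plus a random number between -spread and spread."""
    rng = random.Random(seed)  # fixed seed so the same points come out every run
    points = []
    for _ in range(n):
        x = rng.uniform(0, 10)  # assumed range of x-values
        y = 3 * x + rng.uniform(-spread, spread)
        points.append((x, y))
    return points


for spread in (0.5, 2, 10):
    points = make_points(spread)
    # How far does the farthest point sit from the line y = 3x?
    largest_gap = max(abs(y - 3 * x) for x, y in points)
    print(f"spread {spread}: largest gap from y = 3x is {largest_gap:.2f}")
```

No point can be farther from the line y = 3x than the chosen spread, so a larger spread lets the points scatter more widely around the line.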

Lesson Summary

When a linear function fits data well, we say there is a linear association between the variables. For example, the relationship between height and weight for 25 dogs can be modeled by the linear function whose graph is shown in the scatter plot.

A scatter plot and line of best fit of dog height (in) (horizontal from 6-30) vs dog weight (pounds) (vertical from 0-112). The data trends up and towards the right.

Because the model fits the data well and because the slope of the line is positive, we say that there is a positive association between dog height and dog weight.

What do you think the association between the weight of a car and its fuel efficiency is?

A scatter plot of weight (kg) vs fuel efficiency (mpg). The line of best fit slopes down and to the right.

Because the slope of a line that fits the data well is negative, we say that there is a negative association between the fuel efficiency and weight of a car.
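The link between the slope of a fitted line and the kind of association can be sketched in code. The example below uses least squares, one standard way to choose a line; the lesson itself only asks for lines fit by eye, so treat this as a swapped-in technique. The data are the approximate coordinates of the two scatter plots from Activity 1, and the function names are assumptions. The check looks only at the sign of the slope and does not test whether the line fits well, which still has to be judged from the plot.

```python
def least_squares_slope(points):
    """Slope of the least-squares line; assumes the x-values are not all equal."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


def describe_association(points):
    """Label the association using only the sign of the fitted slope."""
    slope = least_squares_slope(points)
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "neither"


# Approximate points from the first scatter plot in Activity 1.
falling = [
    (1, 38), (1.5, 35), (2.3, 33), (2.5, 31.5), (3.2, 32),
    (4, 31), (4.4, 27), (4.5, 25), (5.8, 26), (6, 26),
    (6, 22), (6.3, 18), (7.5, 14), (7.6, 21), (8.5, 17),
    (9, 14), (9.3, 8), (9.5, 12), (10.4, 4), (10.5, 7),
]

# Approximate points from the second scatter plot in Activity 1.
rising = [
    (1.5, 1), (2, 5), (3, 1), (3, 7), (3, 8),
    (4, 4), (4.4, 7.5), (4.6, 10), (5, 2), (5.5, 13),
    (6.2, 22), (7.4, 18), (7.5, 11), (7.5, 9), (8.8, 29),
    (8.9, 13), (9.1, 33), (9.7, 28), (10, 16), (10, 17),
]

print("First plot:", describe_association(falling))
print("Second plot:", describe_association(rising))
```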