Lesson 4: Fitting a Line to Data

Learning Goal

Let’s look at the scatter plots as a whole.

Learning Targets

  • I can pick out outliers on a scatter plot.

  • I can use a model to predict values for data.

Lesson Terms

  • outlier

Warm Up: Predict This

Problem 1

Here is a scatter plot that shows weights and fuel efficiencies of 20 different types of cars.

If a car weighs 1,750 kg, would you expect its fuel efficiency to be closer to 22 mpg or to 28 mpg? Explain your reasoning.

A scatter plot with 20 data points. The horizontal axis is labeled “weight, in kilograms” and the numbers 1,000 through 2,500, in increments of 250, are indicated. The vertical axis is labeled “fuel efficiency, in miles per gallon” and the numbers 14 through 32, in increments of 2, are indicated. The graph shows the trend of the 20 data points moving linearly downward and to the right. The approximate coordinates of 11 selected data points are as follows: (1130, 28), (1240, 30), (1400, 25), (1490, 23), (1550, 25), (1590, 26), (1650, 19), (1740, 21), (1775, 20), (1950, 19), (2200, 16).
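One way to check a prediction like this is to fit a line to the points and evaluate it at 1,750 kg. The sketch below is only an illustration: it uses the 11 approximate points listed in the plot description (not all 20 cars), and it assumes a least-squares line, which the lesson itself does not specify. The name `fit_line` is made up for this example.

```python
# Approximate (weight_kg, mpg) pairs read from the plot description.
# Only 11 of the 20 cars are listed, so this is a partial data set.
cars = [
    (1130, 28), (1240, 30), (1400, 25), (1490, 23), (1550, 25), (1590, 26),
    (1650, 19), (1740, 21), (1775, 20), (1950, 19), (2200, 16),
]

def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(cars)

# Evaluate the fitted line at the weight asked about in the problem.
predicted_mpg = slope * 1750 + intercept
closer_to = 22 if abs(predicted_mpg - 22) < abs(predicted_mpg - 28) else 28
print(f"predicted: {predicted_mpg:.1f} mpg, closer to {closer_to} mpg")
```

With these 11 points the slope is negative (heavier cars get fewer miles per gallon) and the prediction lands near 21 mpg, which is closer to 22 than to 28. A line drawn by eye through the plot should lead to the same conclusion.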

Activity 1: Shine Bright

Problem 1

The prices and sizes of 20 different diamonds are shown in the table and scatter plot.

Table columns: weight (carats), actual price (dollars), predicted price (dollars)

The function described by the equation is a model of the relationship between a diamond’s weight and its price.

The scatter plot shows the prices and weights of the 20 diamonds together with the graph of the model.

This model predicts the price of a diamond from its weight. These predicted prices are shown in the third column of the table.

  1. Two diamonds with a weight of 1.5 carats have different prices. What are their prices? How can you see this in the table? How can you see this in the graph?

  2. The model predicts that when the weight is 1.5 carats, the price will be $7,189. How can you see this in the graph? How can you see this using the equation?

  3. One of the diamonds weighs 1.9 carats. What does the model predict for its price? How does that compare to the actual price?

  4. Find a diamond where the model makes a very good prediction of the actual price. How can you see this in the table? In the graph?

  5. Find a diamond where the model’s prediction is not very close to the actual price. How can you see this in the table? In the graph?

Print Version

Here is a table that shows weights and prices of 20 different diamonds.

Table columns: weight (carats), actual price (dollars), predicted price (dollars)

The scatter plot shows the prices and weights of the 20 diamonds together with the graph of a linear model.

A scatter plot of weight (carats) (horizontal from 0.9-2.1 in 0.12 increments) and price (dollars) (vertical from 2000-11000). A line of best fit goes up and to the right.

The function described by the equation is a model of the relationship between a diamond’s weight and its price.

This model predicts the price of a diamond from its weight. These predicted prices are shown in the third column of the table.

  1. Two diamonds that both weigh 1.5 carats have different prices. What are their prices? How can you see this in the table? How can you see this in the graph?

  2. The model predicts that when the weight is 1.5 carats, the price will be $7,189. How can you see this in the graph? How can you see this using the equation?

  3. One of the diamonds weighs 1.9 carats. What does the model predict for its price? How does that compare to the actual price?

  4. Find a diamond for which the model makes a very good prediction of the actual price. How can you see this in the table? In the graph?

  5. Find a diamond for which the model’s prediction is not very close to the actual price. How can you see this in the table? In the graph?

Activity 2: The Agony of the Feet

Problem 1

The scatter plot shows widths and lengths of 20 different left feet. Use the double arrows to show or hide the expressions list.

  1. Estimate the widths of the longest foot and the shortest foot.

  2. Estimate the lengths of the widest foot and the narrowest foot.

  3. Click on the gray circle next to the words “The Line” in the expressions list.

    The graph of a linear model should appear. Find the data point that seems weird when compared to the model. What length and width does that point represent?

Print Version

Here is a scatter plot that shows lengths and widths of 20 different left feet.

A scatter plot of foot length (cm) (horizontal from 20-32) and foot width (cm) (vertical from 7-12).

  1. Estimate the widths of the longest foot and the shortest foot.

  2. Estimate the lengths of the widest foot and the narrowest foot.

  3. Here is the same scatter plot together with the graph of a model for the relationship between foot length and width.

    The same scatter plot with a line of best fit. There is a data point significantly off the line at approximately (24.3, 7.5).

    Circle the data point that seems weird when compared to the model. What length and width does that point represent?

Lesson Summary

Sometimes, we can use a linear function as a model of the relationship between two variables. For example, here is a scatter plot that shows heights and weights of 25 dogs together with the graph of a linear function which is a model for the relationship between a dog’s height and its weight.

A scatter plot and line of best fit of dog height (in) (horizontal from 6-30) vs dog weight (pounds) (vertical from 0-112). The data trends up and towards the right.

We can see that the model does a good job of predicting the weight given the height for some dogs. These correspond to points on or near the line. The model doesn’t do a very good job of predicting the weight given the height for the dogs whose points are far from the line.

For example, there is a dog that is about 20 inches tall and weighs a little more than 16 pounds. The model predicts that the weight would be about 48 pounds. We say that the model overpredicts the weight of this dog. There is also a dog that is 27 inches tall and weighs about 110 pounds. The model predicts that its weight will be a little less than 80 pounds. We say the model underpredicts the weight of this dog.
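The two dogs above can be compared with a small calculation: subtract the predicted weight from the actual weight and look at the sign. The sketch below uses the rounded values 16 and 80 pounds, since the text only says "a little more than 16" and "a little less than 80". The helper name `describe` is made up for this example.

```python
# (height_in, actual_lb, predicted_lb) for the two dogs in the summary.
# 16 and 80 are rounded stand-ins for "a little more than 16" and
# "a little less than 80".
dogs = [(20, 16, 48), (27, 110, 80)]

def describe(actual, predicted):
    """Classify a prediction by the sign of actual - predicted."""
    error = actual - predicted
    if error < 0:
        return "overpredicts", error   # model's value is too high
    if error > 0:
        return "underpredicts", error  # model's value is too low
    return "matches", error

results = [describe(actual, predicted) for _, actual, predicted in dogs]
for (height, _, _), (label, error) in zip(dogs, results):
    print(f"{height} in dog: model {label} (actual - predicted = {error} lb)")
```

A negative difference means the point is below the line (the model overpredicts), and a positive difference means the point is above the line (the model underpredicts).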

Sometimes a data point is far away from the other points or doesn’t fit a trend that all the other points fit. We call such a point an outlier.
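As a rough illustration of spotting a point that doesn't fit the trend, the sketch below measures how far each point is from a model line and flags the ones beyond a cutoff. Almost everything in it is an assumption made for the example: the model's slope and intercept, the cutoff of 1 cm, and every foot measurement except (24.3, 7.5), which is the point described in the foot activity. The lesson gives no numeric rule for outliers, so this is one possible check, not a definition.

```python
# Hypothetical model: the slope and intercept are invented for illustration.
def predicted_width(length_cm):
    return 0.3 * length_cm + 1.5

# (length_cm, width_cm). Only (24.3, 7.5) comes from the lesson; the other
# points are made up so that they lie near the hypothetical line.
feet = [(22.0, 8.2), (24.3, 7.5), (25.0, 9.1), (27.5, 9.6), (30.0, 10.6)]

THRESHOLD_CM = 1.0  # assumed cutoff for "far from the line"

# Keep the points whose vertical distance from the line exceeds the cutoff.
far_from_line = [
    (length, width)
    for length, width in feet
    if abs(width - predicted_width(length)) > THRESHOLD_CM
]
print(far_from_line)
```

Only the point (24.3, 7.5) is flagged here, because its width is more than 1 cm below what the hypothetical line predicts for that length.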