This lesson plan uses candy, sports, and air pollution to explore scatterplots and the correlation between variables. It is adapted from Section 3.1 in __The Practice of Statistics Sixth Edition__ by Starnes and Tabor.

**Author**

Margaret Borden

- Introduction
- Curriculum Alignment
- Objectives
- Time & Location
- Teacher Materials
- Student Materials
- Student Prior Knowledge
- Teacher Preparations
- Activities
- Assessment
- Critical Vocabulary
- Author Information

## Introduction

Students will build scatterplots for a variety of data sets: candy versus hand span, diamonds versus the SAT, payroll versus wins, height versus amount of sleep, sugar versus calories, and deaths versus air pollution and heat. They will learn how to describe the plots (the explanatory and response variables, and the direction and strength of the association) and how to find the correlation coefficient.

## Curriculum Alignment

S-ID.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.

## Objectives

The objectives list what students are expected to learn after completing the lesson plan.

- Students will be able to distinguish between explanatory and response variables for quantitative data.
- Students will be able to make a scatterplot to display the relationship between two quantitative variables.
- Students will be able to describe the direction, form, and strength of a relationship displayed in a scatterplot and identify unusual features.
- Students will be able to interpret the correlation.

## Time & Location

Classroom

One 90-minute period

## Teacher Materials

__The Practice of Statistics Sixth Edition__ by Starnes and Tabor (other editions can be used, but the page numbers in this plan refer to this edition)

Bowl full of wrapped candies

Whiteboard

Whiteboard markers

Projector

Computer

Doc Cam (for the discussions, showing student work, if desired)

## Student Materials

Graph paper

Notebook paper

Graphing Calculators

Pencils

Rulers

## Student Prior Knowledge

Students should be able to plot coordinate points on a graph.

Students should be able to choose appropriate scales for graph axes.

## Teacher Preparations

Teacher needs to have a bowl of wrapped candies out in a prominent location, and rulers need to be readily available to students.

On the board, teacher should have the instructions from the “Candy Grab” activity on page 152.

Teacher should also have a table with the headings “Hand span (cm)” and “Number of candies” written on the board beside the instructions, as shown in the activity.

Students can be grouped however you like; I tend to prefer groups of two.

Teacher should either have enough books for each group to use, enough photocopies for each group to use, a projection of the tables/scatterplots/diagrams for students to look at in class, or an accessible electronic copy for students to use.

Have the *Guess the Correlation* applet loaded (www.rossmanchance.com/applets) [activity described on p.161]

## Activities

10 min: Candy Grab activity (p.152, the first page of Chapter 3)

7 min: Show https://www.cbsnews.com/news/pollution-makes-air-in-parts-of-california-dangerous-to-breathe/, and provide http://www.latimes.com/nation/la-na-utah-smog-2017-story.html, https://www.aging.ca.gov/data_and_statistics/facts_about_elderly/, and https://blissair.com/what-is-pm-2-5.htm. Tell students that we have a 13-year data set that shows us:

- the daily death counts (all, all over 65, all between 65 and 74, all over 75, each of those options minus accidents, each of these options for circulatory deaths, each of these options for respiratory deaths)
- the PM2.5 daily average (and the PM2.5davg from 1 day before, 2 days before, and 3 days before)
- the ozone daily average
- the minimum temperature and maximum temperature
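The "1 day before, 2 days before, and 3 days before" PM2.5 columns can be confusing at first. The following is a minimal sketch of the idea, not the way the actual data set was built: the readings are made-up illustration values, and `lagged` is a hypothetical helper name. A lag-k column simply pairs each day with the value recorded k days earlier.

```python
def lagged(values, k):
    """Pair each day with the value from k days earlier.

    Days that have no earlier reading get None.
    """
    values = list(values)
    return [values[i - k] if i - k >= 0 else None for i in range(len(values))]


# Made-up PM2.5 daily averages for five consecutive days (illustration only).
pm25_davg = [12.0, 15.5, 30.2, 22.1, 18.4]

pm25_lag1 = lagged(pm25_davg, 1)  # value from 1 day before
pm25_lag2 = lagged(pm25_davg, 2)  # value from 2 days before

for day, (today, lag1, lag2) in enumerate(zip(pm25_davg, pm25_lag1, pm25_lag2), start=1):
    print(day, today, lag1, lag2)
```

Students could use columns like these to ask whether today's death count is more closely related to today's pollution or to the pollution from a few days ago.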

5 min: Open up https://padlet.com/mcleak/deathandpollution (password: apstats) or create your own padlet, and have students type in ANY curiosity, question, wonder, concern, worry, research, or learning desire that pops into their heads. Encourage them to like and comment on other people’s post-its.

5 min: Hand each group one of the South Coast data sets and have them create a scatterplot using that information.

5 min: Take notes on explanatory and response variables (p.153). Do this however you like to give notes to your students: they could read from the book or a handout, or you could use a PowerPoint projection, a discussion, notes written on the board, or guided notes. I would focus on what the words “explanatory” and “response” mean in everyday life, so that students can make sense of the fact that the explanatory variable *explains* how the response variable is *responding* and can see where those names came from in the first place.

2 min: Label the explanatory and response variables for all of the scatterplots we’ve already done, plus the example on p.154.

5 min: Discuss as a class which variable is explanatory and which is response in each scatterplot (have someone from each group describe to the whole class how they thought about it, and try to include the different ideas or thought processes you noticed while walking around). I advise using the random number generator on your calculator to pick who speaks: it gets students used to randomization early on, and because it is unpredictable and perceived as fair, it encourages them to work hard to be ready to answer. (I have my tables in groups of four, so I generate numbers 1-4.)
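If a calculator is not handy, the same random pick can be done on the classroom computer. This is a minimal sketch in Python, assuming tables of four with seats numbered 1-4; `pick_speaker` is a hypothetical helper name, not part of the textbook activity.

```python
import random


def pick_speaker(seats_per_table=4):
    """Return a seat number from 1 to seats_per_table, each equally likely."""
    return random.randint(1, seats_per_table)


# Draw a speaker for each of five discussion questions.
picks = [pick_speaker() for _ in range(5)]
print(picks)
```

Because every draw is independent, the same seat can come up twice in a row, which is itself a useful talking point about randomness.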

5 min: Students take notes on how to make a scatterplot (p.154-155). Make sure to point out here that statisticians only pay attention to the section of the scatterplot that is relevant to them, so the numbers on the axes could start anywhere.
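One way to make the "axes could start anywhere" point concrete is to compute axis limits from the data rather than from zero. The sketch below is only an illustration: the hand spans are made-up values, and `axis_limits` and its 10% padding are assumptions, not a rule from the textbook.

```python
def axis_limits(values, padding=0.1):
    """Return (low, high) axis limits that hug the data with a small margin."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        span = 1.0  # all values equal: fall back to a one-unit-wide window
    margin = padding * span
    return low - margin, high + margin


# Made-up hand spans in centimeters (illustration only).
hand_spans_cm = [17.5, 19.0, 20.5, 22.0, 23.5]

x_low, x_high = axis_limits(hand_spans_cm)
print(x_low, x_high)
```

Here the horizontal axis would start a little below the smallest hand span instead of at 0, so the points fill the plot.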

5 min: Students take notes on how to describe a scatterplot (p.156-157): form, direction, outliers, and strength.

5 min: Quick Application: Students describe the scatterplots we already made and the scatterplots on p.156-158 (independently or as a group, while the teacher walks around the room).

5 min: Discuss how they were described (have someone from each group describe how they thought about it to the whole class, see if you can include different ideas or thought processes from when you were walking around).

10 min: p.159, technology corner. Teach students how to make a scatterplot on their calculators.

Ask:

- Why would we want to do this in a calculator?
- Are there limitations to what we can do in the calculator?
- Can you think of any other technologies that would be better to learn on for those limitations?

5 min: Take basic correlation notes on p.160-161. It is important to note here that the correlation can only describe the direction (negative/positive) and strength (none, weak, moderately weak, moderately strong, strong) of the linear relationship between the variables. The number can’t say “72% strength” or “72% of the data is…” or anything like that, unfortunately. This is tough for students at first, but they get used to it as they practice.
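For teachers who want to check correlations away from the calculator, here is a minimal sketch that computes r directly from its definition. The data values are made up for illustration, and the function name `correlation` is an assumption, not something from the textbook.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers.

    This matches the average product of z-scores (dividing by n - 1),
    because the n - 1 factors cancel in the ratio below.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [2, 4, 6, 8, 10])    # points on a rising line
r_down = correlation(xs, [10, 8, 6, 4, 2])  # points on a falling line
# A perfect U-shaped pattern: strongly related, but not linearly.
r_curve = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(r_up, r_down, r_curve)
```

The U-shaped example is worth showing students: the points follow a clear pattern, yet r comes out at zero, because r only measures the linear part of a relationship.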

10 min: Play *Guess the Correlation* as a whole class (activity instructions are on p.161). After each guess, have students explain why they guessed that number, and discuss which features of the graph help in estimating the correlation.

6 min: Exit Ticket (Check Your Understanding on p.162; add the instruction: label the explanatory and response variables; when grading, look for the description words they learned earlier). Also have students write down which variables from the air pollution data set they would be interested in exploring.

HW: Choose from Section 3.1 Exercises p.171-175

## Assessment

Exit Ticket (Check Your Understanding on p.162; add the instruction: label the explanatory and response variables; when grading, look for the description words they learned earlier)

Assesses Objectives:

- Students will be able to distinguish between explanatory and response variables for quantitative data.
- Students will be able to describe the direction, form, and strength of a relationship displayed in a scatterplot and identify unusual features.
- Students will be able to interpret the correlation.

Formative assessment (basically all of the activities)

- Specifically, the “make a scatterplot using these differing data sets from South Coast CA” activity assesses the objective “Students will be able to make a scatterplot to display the relationship between two quantitative variables.”
- Suggested questions:
  - How did you decide which variable was which?
  - How could knowing the strength and direction be useful to a statistician?
  - What forms would concern you and why?
  - What did you consider when guessing the correlation coefficient?
  - Were there any correlations that you were surprised about? Why do you think that occurred?

## Critical Vocabulary

Response Variable: measures an outcome of a study

Explanatory Variable: may help predict or explain changes in a response variable

Scatterplot: shows the relationship between two quantitative variables measured on the same individuals.

Positive association: when the above-average values of one variable tend to accompany above-average values of the other variable and when below-average values also tend to occur together

Negative association: when above-average values of one variable tend to accompany below-average values of another variable

No association: if knowing the value of one variable does not help us predict the value of another variable

Direction: a scatterplot can show a positive association, a negative association, or no association

Form: a scatterplot can show a linear form (follows a straight line) or a nonlinear form

Strength: a scatterplot can show a weak, moderate, or strong association (the association is strong if the points don’t deviate much from the form identified)

Unusual features: outliers that fall outside the overall pattern and distinct clusters of points

Correlation (r): measures the direction and strength of a linear association between two quantitative variables

## Author Information

**Margaret Borden:**

- Knightdale High School of Collaborative Design, Wake County Public School System, Raleigh
- 9-12 Math
- Second year
- mcleak@ncsu.edu; mborden@wcpss.net

**Richard Smith:**

- SAMSI and UNC Chapel Hill Department of Statistics and Operations Research
- Statistical Research
- Mark L. Reed III Distinguished Professor, Director of SAMSI, http://www.unc.edu/~rls/