Using Sample Data to Decide if Two Population Means Are Different
Video solutions to help Grade 7 students learn how to use data from random samples to draw informal inferences about the difference in population means.
New York State Common Core Math Grade 7, Module 5, Lesson 23
Lesson 23 Student Outcomes
• Students use data from random samples to draw informal inferences about the difference in population means.
Lesson 23 Summary
To determine if the mean value of some numerical variable differs for two populations, take random samples from
each population. It is very important that the samples be random samples. If the number of MADs that separate
the two sample means is 2 or more, then it is reasonable to think that the populations have different means.
Otherwise, the difference in the sample means may just be due to sampling variability, and there may not be a meaningful difference in the population means.
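The rule in the summary can be sketched in Python. This is only an illustration: the function names (`mad`, `mads_apart`, `means_differ`) and the two small samples are made up for this sketch and are not data from the lesson. It follows the lesson's convention of dividing by the larger of the two MADs.

```python
from statistics import mean

def mad(values):
    """Mean absolute deviation: the average distance of the values from their mean."""
    m = mean(values)
    return mean(abs(x - m) for x in values)

def mads_apart(sample_a, sample_b):
    """Number of MADs separating the two sample means, using the larger MAD.

    Assumes at least one sample has some variability (a MAD greater than 0).
    """
    larger_mad = max(mad(sample_a), mad(sample_b))
    return abs(mean(sample_a) - mean(sample_b)) / larger_mad

def means_differ(sample_a, sample_b, threshold=2):
    """True if the sample means are separated by at least `threshold` MADs."""
    return mads_apart(sample_a, sample_b) >= threshold

# Made-up illustrative samples (NOT data from the lesson)
sample_a = [2, 4, 4, 6]      # mean 4, MAD 1
sample_b = [8, 10, 10, 12]   # mean 10, MAD 1

print(mads_apart(sample_a, sample_b))    # separation in MADs
print(means_differ(sample_a, sample_b))  # is the separation 2 MADs or more?
```

For these made-up samples the means are 6 apart and the larger MAD is 1, so the means are separated by 6 MADs, which meets the "2 or more" criterion.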
Lesson 23 Classwork
In the previous lesson, you described how far apart the means of two data sets are in terms of the MAD (mean absolute
deviation), a measure of variability. In this lesson, you will extend that idea to informally determine when two sample
means computed from random samples are far enough apart from each other to imply that the population means also
differ in a “meaningful” way. Recall that a “meaningful” difference between two means is a difference that is greater
than would have been expected just due to sampling variability.
Example 1: Texting
With texting becoming so popular, Linda wanted to determine if middle school students memorize real words more or
less easily than fake words. For example, real words are “food,” “car,” “study,” “swim;” whereas fake words are “stk,”
“fonw,” “cqur,” “ttnsp.” She randomly selected 28 students from all middle school students in her district and gave half
of them a list of 20 real words and the other half a list of 20 fake words.
1. How do you think Linda might have randomly selected 28 students from all middle school students in her district?
2. Why do you think Linda selected the students for her study randomly? Explain.
3. She gave the selected students one minute to memorize their list after which they were to turn the list over and
after two minutes write down all the words that they could remember. Afterwards, they calculated the number of
correct “words” that they were able to write down. Do you think a penalty should be given for an incorrect “word”
written down? Explain your reasoning.
Suppose the data (number of correct words recalled) she collected were:
4. On the same scale, draw dot plots for the two data sets.
5. From looking at the dot plots, write a few sentences comparing the distribution of the number of correctly recalled
real words with the distribution of number of correctly recalled fake words. In particular, comment on which type
of word, if either, students recall better. Explain.
6. Linda made the following calculations for the two data sets:
In the previous lesson, you calculated the number of MADs that separated two sample means. You used the larger
MAD to make this calculation if the two MADs were not the same. How many MADs separate the mean number of
real words recalled and the mean number of fake words recalled for the students in the study?
7. In the last lesson, our work suggested that if the number of MADs that separate the two sample means is 2 or more,
then it is reasonable to conclude that not only do the means differ in the samples, but that the means differ in the
populations as well. If the number of MADs is less than 2, then you can conclude that the difference in the sample
means might just be sampling variability and that there may not be a meaningful difference in the population
means. Using these criteria, what can Linda conclude about the difference in population means based on the
sample data that she collected? Be sure to express your conclusion in the context of this problem.
Ken, an eighth grade student, was interested in doing a statistics study involving sixth grade and eleventh grade students
in his school district. He conducted a survey on four numerical variables and two categorical variables (grade level and
gender). His Excel population database for the 265 sixth graders and 175 eleventh graders in his district has the
8. Ken decides to base his study on a random sample of 20 sixth graders and a random sample of 20 eleventh graders.
The sixth graders have IDs 1 – 265 and the eleventh graders are numbered 266 – 440. Advise him on how to
randomly sample 20 sixth graders and 20 eleventh graders from his data file.
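One possible way to draw the two samples with software, shown as a sketch: Python's `random.sample` picks the requested number of distinct values from a range, so no ID is chosen twice. The variable names and the seed value are assumptions made for this illustration. A random-number table or a calculator's random-integer function would work equally well, as long as repeated IDs are skipped.

```python
import random

# Seed chosen arbitrarily so the same IDs come out on every run;
# leave it out to get a fresh random sample each time.
rng = random.Random(23)

# Sixth graders have IDs 1-265; eleventh graders have IDs 266-440.
# range() excludes its upper end, so the stops are 266 and 441.
sixth_ids = sorted(rng.sample(range(1, 266), 20))
eleventh_ids = sorted(rng.sample(range(266, 441), 20))

print("Sixth grade IDs:   ", sixth_ids)
print("Eleventh grade IDs:", eleventh_ids)
```

Sampling each grade from its own ID range keeps the two samples separate and guarantees 20 students from each grade.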
Suppose that from a random-number
9. For each set, find the homework hours data from the population database that corresponds to these randomly
selected ID numbers.
10. On the same scale, draw dot plots for the two sample data sets.
11. From looking at the dot plots, list some observations comparing the number of hours per week that sixth graders
spend on doing homework and the number of hours per week that eleventh graders spend on doing homework.
12. Calculate the mean and MAD for each of the data sets. How many MADs separate the two sample means? (Use the
larger MAD to make this calculation if the sample MADs are not the same.)
13. Ken recalled Linda suggesting that if the number of MADs is greater than or equal to 2, then it would be reasonable
to think that the population of all sixth grade students in his district and the population of all eleventh grade
students in his district have different means. What should Ken conclude based on his homework study?