# Using Sample Data to Decide if Two Population Means Are Different

Examples, videos, and solutions to help Grade 7 students learn how to use data from random samples to draw informal inferences about the difference in population means.

### Lesson 23 Student Outcomes

• Students use data from random samples to draw informal inferences about the difference in population means.

### Lesson 23 Summary

To determine if the mean value of some numerical variable differs for two populations, take random samples from each population. It is very important that the samples be random samples. If the number of MADs that separate the two sample means is 2 or more, then it is reasonable to think that the populations have different means. Otherwise, the population means are considered to be the same.

### Lesson 23 Classwork

In the previous lesson, you described how far apart the means of two data sets are in terms of the MAD (mean absolute deviation), a measure of variability. In this lesson, you will extend that idea to informally determine when two sample means computed from random samples are far enough apart from each other as to imply that the population means also differ in a “meaningful” way. Recall that a “meaningful” difference between two means is a difference that is greater than would have been expected just due to sampling variability.
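As a reminder of what the MAD measures, here is a minimal Python sketch that computes the mean and the MAD of one data set. The function names `mean` and `mad` and the four data values are made up for illustration; they do not come from the lesson.

```python
def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def mad(values):
    """Return the mean absolute deviation: the average distance
    of each value from the mean of the data set."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)


# Hypothetical data set (not taken from the lesson).
sample = [2, 4, 6, 8]

print(mean(sample))  # 5.0
print(mad(sample))   # 2.0
```

Here the distances from the mean are 3, 1, 1, and 3, so the MAD is 8 ÷ 4 = 2.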

Example 1: Texting
With texting becoming so popular, Linda wanted to determine if middle school students memorize real words more or less easily than fake words. For example, real words are “food,” “car,” “study,” “swim;” whereas fake words are “stk,” “fonw,” “cqur,” “ttnsp.” She randomly selected 28 students from all middle school students in her district and gave half of them a list of 20 real words and the other half a list of 20 fake words.

Exercises 1–3

1. How do you think Linda might have randomly selected 28 students from all middle school students in her district?
2. Why do you think Linda selected the students for her study randomly? Explain.
3. She gave the selected students one minute to memorize their list, after which they were to turn the list over and, after two minutes, write down all the words that they could remember. Afterward, they calculated the number of correct “words” that they were able to write down. Do you think a penalty should be given for an incorrect “word” written down? Explain your reasoning.

Exercises 4–7
Suppose the data (number of correct words recalled) she collected were:
4. On the same scale, draw dot plots for the two data sets.
5. From looking at the dot plots, write a few sentences comparing the distribution of the number of correctly recalled real words with the distribution of the number of correctly recalled fake words. In particular, comment on which type of word, if either, students recall better. Explain.
6. Linda made the following calculations for the two data sets:
In the previous lesson, you calculated the number of MADs that separated two sample means. You used the larger MAD to make this calculation if the two MADs were not the same. How many MADs separate the mean number of real words recalled and the mean number of fake words recalled for the students in the study?
7. In the last lesson, our work suggested that if the number of MADs that separate the two sample means is 2 or more, then it is reasonable to conclude that not only do the means differ in the samples, but that the means differ in the populations as well. If the number of MADs is less than 2, then you can conclude that the difference in the sample means might just be sampling variability and that there may not be a meaningful difference in the population means. Using these criteria, what can Linda conclude about the difference in population means based on the sample data that she collected? Be sure to express your conclusion in the context of this problem.
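The rule described in Exercises 6 and 7 can be sketched in Python as follows. This is only an illustration: the function names `mads_apart` and `means_differ` are made up, and the means and MADs passed in are hypothetical numbers, not Linda's results.

```python
def mads_apart(mean_a, mad_a, mean_b, mad_b):
    """Return how many MADs separate two sample means.

    Follows the lesson's convention of dividing by the larger
    of the two MADs when they are not the same.
    """
    larger_mad = max(mad_a, mad_b)
    if larger_mad == 0:
        raise ValueError("Both MADs are 0, so the ratio is undefined.")
    return abs(mean_a - mean_b) / larger_mad


def means_differ(mean_a, mad_a, mean_b, mad_b, threshold=2):
    """Apply the informal rule: 2 or more MADs apart suggests
    that the population means differ."""
    return mads_apart(mean_a, mad_a, mean_b, mad_b) >= threshold


# Hypothetical summary values (not the lesson's data).
print(mads_apart(10.0, 2.0, 5.0, 2.5))    # 2.0
print(means_differ(10.0, 2.0, 5.0, 2.5))  # True
print(means_differ(10.0, 2.0, 7.0, 2.5))  # False
```

In the last call the means are 3 apart and the larger MAD is 2.5, so they are only 1.2 MADs apart, which is less than 2.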

Example 2
Ken, an eighth grade student, was interested in doing a statistics study involving sixth grade and eleventh grade students in his school district. He conducted a survey on four numerical variables and two categorical variables (grade level and gender). His Excel population database for the 265 sixth graders and 175 eleventh graders in his district has the following description:

Exercise 8
8. Ken decides to base his study on a random sample of 20 sixth graders and a random sample of 20 eleventh graders. The sixth graders have IDs 1 – 265 and the eleventh graders are numbered 266 – 440. Advise him on how to randomly sample 20 sixth graders and 20 eleventh graders from his data file.
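One possible approach, sketched here in Python rather than Excel, is to let a random-number generator pick 20 different ID numbers from each range. The helper name `sample_ids` and the `seed` parameter are assumptions made for this illustration; the lesson does not say which tool Ken used.

```python
import random


def sample_ids(first_id, last_id, k, seed=None):
    """Pick k different ID numbers at random from first_id..last_id
    (both ends included) and return them in increasing order."""
    rng = random.Random(seed)  # seed is optional; fix it to repeat a draw
    return sorted(rng.sample(range(first_id, last_id + 1), k))


# ID ranges from the exercise: sixth graders 1-265, eleventh graders 266-440.
sixth_grade_ids = sample_ids(1, 265, 20)
eleventh_grade_ids = sample_ids(266, 440, 20)

print(sixth_grade_ids)
print(eleventh_grade_ids)
```

Because `random.sample` draws without replacement, no student can be chosen twice, and every ID in a range has the same chance of being selected.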

Exercises 9–13
Suppose that from a random-number
9. For each set, find the homework hours data from the population database that corresponds to these randomly selected ID numbers.
10. On the same scale, draw dot plots for the two sample data sets.
11. From looking at the dot plots, list some observations comparing the number of hours per week that sixth graders spend on doing homework and the number of hours per week that eleventh graders spend on doing homework.
12. Calculate the mean and MAD for each of the data sets. How many MADs separate the two sample means? (Use the larger MAD to make this calculation if the sample MADs are not the same.)
13. Ken recalled Linda suggesting that if the number of MADs is greater than or equal to 2, then it would be reasonable to think that the population of all sixth grade students in his district and the population of all eleventh grade students in his district have different means. What should Ken conclude based on his homework study?
