Lots of things end up being normally distributed. Are Boston Marathon results one of them?

0:00
Let's try and find out if our data is normally distributed by seeing how many

0:04
finishers finished within one, two, and three standard deviations of the mean.

0:09
But first, we'll need to know how many finishers there were.

0:13
Let's add a row at the very top by right clicking on row 1 and

0:16
choosing insert 1 above.

0:19
Then let's add a label for number of finishers and make sure it's bold.

0:26
Then in cell B1, let's type =COUNT,

0:30
paste in our range of overall finish times, and hit Enter.

0:35
And there we go, 26,410 total finishers.

0:41
Getting back to our standard deviations,

0:45
let's add three labels below our standard deviation label,

0:52
and call them % in 1, % in 2, and % in 3.

0:57
And let's leave them unbolded so

0:59
they look like they belong with standard deviation, because they do.

1:04
Now, for % in 1, we need to find out how many runners finished within 1

1:09
standard deviation of the mean.

1:12
To accomplish this, we're going to use the COUNTIFS function, which lets us give some

1:16
criteria and then only returns the count of values that match our criteria.

1:21
We're going to count only runners that finished within 1 standard deviation.

1:26
And then divide that by the total number of runners to get a percentage.
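
As a rough cross-check outside the spreadsheet, the same idea (count the values inside the band, then divide by the total) can be sketched in Python. The function name `share_within` and the sample times are made up for illustration and are not the Boston Marathon data. The sketch uses the population standard deviation, which may differ from whichever standard deviation function the spreadsheet uses.

```python
from statistics import mean, pstdev

def share_within(times, k=1):
    """Fraction of values strictly within k standard deviations of the mean."""
    mu = mean(times)
    sigma = pstdev(times)  # population std dev (an assumption; the sheet may use the sample version)
    lower, upper = mu - k * sigma, mu + k * sigma
    inside = sum(1 for t in times if lower < t < upper)  # the "count" step
    return inside / len(times)                           # the "divide by total" step

# Hypothetical finish times in minutes, for illustration only
sample_times = [180, 195, 200, 210, 215, 220, 225, 240, 260, 300]

for k in (1, 2, 3):
    print(f"% in {k}: {share_within(sample_times, k):.0%}")
```

Passing `k=2` or `k=3` widens the band in the same way that multiplying the standard deviation by 2 or 3 does in the formula.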

1:30
Over in cell B11, let's type =COUNTIFS and hit Enter to select it.

1:39
Then let's paste in the range of finishing times and add a comma.

1:43
The next parameter is the conditional statement.

1:46
And it's entered as a string.

1:49
So let's add two quotation marks and in the middle, let's add a greater than sign.

1:56
To find out if a runner is within 1 standard deviation of the mean, we need to

2:00
check that their finishing time is greater than the mean minus 1 standard deviation.

2:07
Fortunately, this data already exists in a cell.

2:10
So instead of typing the data in, we should reference the cell directly.

2:15
To do this, we need to combine our greater than sign

2:18
with our cell data by using an ampersand to concatenate the strings.

2:23
Let's add an ampersand after the last quotation mark.

2:27
Then let's select the average, type a minus sign and

2:31
then select the standard deviation.

2:33
We're now counting all runners whose time is greater than the mean minus 1 standard deviation.
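
To make the "criterion as a string" idea concrete, here is a small Python sketch that imitates it: a comparison sign is concatenated with a computed number, and a counter parses that string back into a test. This is a simplified stand-in, not how the spreadsheet implements COUNTIFS. The helper names `parse_criterion` and `count_ifs`, the single shared range, and the numbers are all assumptions for illustration.

```python
import operator

# Comparison prefixes a criterion string may start with.
# Two-character operators come first so ">=" is not read as ">".
_OPS = [(">=", operator.ge), ("<=", operator.le), ("<>", operator.ne),
        (">", operator.gt), ("<", operator.lt), ("=", operator.eq)]

def parse_criterion(criterion):
    """Turn a string such as '>191.4' into a function that tests one value."""
    for prefix, func in _OPS:
        if criterion.startswith(prefix):
            threshold = float(criterion[len(prefix):])
            return lambda value: func(value, threshold)
    raise ValueError(f"unsupported criterion: {criterion!r}")

def count_ifs(values, *criteria):
    """Count the values that satisfy every criterion (all applied to one range)."""
    tests = [parse_criterion(c) for c in criteria]
    return sum(1 for v in values if all(t(v) for t in tests))

# Hypothetical stand-ins for the average and standard deviation cells
average, std_dev = 224.5, 33.1

# Same pattern as joining ">" to a cell calculation with an ampersand
lower = ">" + str(average - std_dev)
upper = "<" + str(average + std_dev)

finish_times = [180, 195, 200, 210, 215, 220, 225, 240, 260, 300]
print(count_ifs(finish_times, lower))         # only the lower bound
print(count_ifs(finish_times, lower, upper))  # both bounds
```

With only the lower bound, every slow time still counts, which is why the second criterion is needed to close the band.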

2:40
So to finish up counting all the runners within 1 standard deviation, we just need

2:44
to add a criterion that they finished under 1 standard deviation above the mean,

2:49
as well.

2:51
To do this, let's just copy the range and criteria that we just entered,

2:56
add a comma, and then paste them back in.

3:00
Finally, we just need to change this greater than sign to a less than sign,

3:06
and change this minus to a plus.

3:11
And add a closing parenthesis.

3:14
For our last step, to turn this into a percentage we just need to divide it by

3:19
the total number of finishers.

3:24
Which gives us about 69.47%,

3:27
which is pretty close to the 68% of a normal distribution.

3:34
And to make it look like a percent, we can click up here and then choose percent.

3:39
From here, we can find our other standard deviation percentages pretty easily.

3:43
But first, let's use F4 to make all the references in this formula absolute.

3:50
This way, when we drag the cell down, it'll keep the same references.

4:03
Then let's drag the cell down twice.

4:08
And to get the % in 2 and 3, inside the formula for those cells,

4:13
we just need to multiply the standard deviation by 2 or 3 respectively.

4:19
And the standard deviation for me is this teal-colored B10.

4:24
So for % in 2, we'll multiply this by 2.

4:28
And over here we'll multiply it by 2.

4:32
And for % in 3 we'll do the same thing, except with 3.

4:41
All right, we've got 69.48,

4:45
94.91, and then 99.76%.

4:50
Remember, a normal distribution should be about 68% within 1 standard deviation,

4:58
95% within 2, and 99.7% within 3.

5:02
So it looks like the finishing times of runners in the Boston Marathon

5:06
are pretty close to normally distributed.
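
For reference, the 68%, 95%, and 99.7% figures come from the normal distribution itself: the share within k standard deviations is erf(k / sqrt(2)). The Python sketch below computes those theoretical shares and compares them with the three percentages read out in the video. The function name `normal_share_within` is made up for illustration.

```python
import math

def normal_share_within(k):
    """Theoretical share of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

# Percentages reported in the video for 1, 2, and 3 standard deviations
observed = {1: 0.6948, 2: 0.9491, 3: 0.9976}

for k, obs in observed.items():
    expected = normal_share_within(k)
    print(f"{k} SD: observed {obs:.2%}, normal {expected:.2%}, gap {obs - expected:+.2%}")
```

A small gap at each level supports "pretty close to normal", though matching these three percentages alone does not prove that a distribution is normal.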

5:09
Coming up in the next video,

5:10
we'll talk about the many different flavors of data visualization.