Standard deviation getting confusing

I was just starting to get into this course, and I enjoy working with data, but I'm getting really confused about this standard deviation thing. It's a shame, because I was really enjoying the course and understanding everything until now.

1 Answer

Hi Nick,

So for standard deviation, you want to find out how far away from the average each data point is. So first you have to find the average. Then you take each data point and subtract the average from it.

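When he squares the resulting number, he is making every value positive, because a negative number times itself is positive. This is similar to taking the absolute value, except that squaring also gives the bigger differences more weight. You do not want negative numbers here, because they would cancel out the positive ones when you average them.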

After that, you find the average of that new data set. Add up all the numbers you have just finished calculating, and divide by the number of values. That new number is the squared version of the standard deviation (it is called the variance), because you squared each difference in the previous step. So you take the square root to reverse that.
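Written as a formula, the steps described so far look roughly like this. The symbols are not from the video, just common notation: x_i is each value, N is the number of values, \mu is the average, and \sigma is the standard deviation. This is the version that divides by N, as in the description above.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2}
```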

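The standard deviation tells you how spread out the data is: a small value means the numbers sit close to the average, and a large value means they are spread far from it.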

Let's work through an example to calculate the standard deviation. Say you have some data with the numbers 2, 5, 5, 9, 10, 11, 11, 12, 25, 40

  1. We take the average of those numbers: (2+5+5+9+10+11+11+12+25+40) / 10 = 13
  2. Now we subtract the average from every data point. This tells us how each data point relates to the average: whether it sits below or above it, and how far away it is.
     2 - 13 = -11, 5 - 13 = -8, 5 - 13 = -8, 9 - 13 = -4, 10 - 13 = -3, 11 - 13 = -2, 11 - 13 = -2, 12 - 13 = -1, 25 - 13 = 12, 40 - 13 = 27
  3. We want to get rid of any negatives, so we square every result from #2:
     (-11)^2 = 121, (-8)^2 = 64, (-8)^2 = 64, (-4)^2 = 16, (-3)^2 = 9, (-2)^2 = 4, (-2)^2 = 4, (-1)^2 = 1, (12)^2 = 144, (27)^2 = 729
  4. Let's take the average of those numbers we just found in #3: (121+64+64+16+9+4+4+1+144+729) / 10 = 115.6
  5. Here, we have to remember that we squared the numbers in #3, so we have to reverse what we did by taking the square root of the number from #4: sqrt(115.6) = 10.75 (rounded to two decimal places)
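The five steps above can also be sketched in a few lines of Python. Python is not part of the course and the variable names are just for illustration. The script prints each intermediate result so it can be checked against the hand calculation.

```python
import math

# The example data from the walkthrough above
data = [2, 5, 5, 9, 10, 11, 11, 12, 25, 40]

# Step 1: the average (mean) of the data
mean = sum(data) / len(data)

# Step 2: how far each value sits from the mean (negative means below it)
deviations = [x - mean for x in data]

# Step 3: square each deviation so that none of them are negative
squared = [d ** 2 for d in deviations]

# Step 4: the average of the squared deviations (the variance)
variance = sum(squared) / len(squared)

# Step 5: take the square root to undo the squaring from step 3
std_dev = math.sqrt(variance)

print("mean:", mean)
print("deviations:", deviations)
print("squared:", squared)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```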

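Luckily, when we are using spreadsheets we don't need to calculate all of that by hand. We can just use a built-in function. One thing to watch: =STDEV() treats the data as a sample and divides by one less than the number of values, so its result comes out slightly larger than the steps above would give. =STDEVP() divides by the number of values, which matches the hand calculation.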
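As a rough check outside the spreadsheet, Python's standard statistics module has both versions of the calculation: pstdev divides by the number of values (like the hand calculation), and stdev divides by one less (the sample version). A minimal sketch, with Python used only for illustration:

```python
import statistics

# The example data from the walkthrough above
data = [2, 5, 5, 9, 10, 11, 11, 12, 25, 40]

# Population standard deviation: divides by the number of values (n)
population_sd = statistics.pstdev(data)

# Sample standard deviation: divides by one less than the number of values (n - 1)
sample_sd = statistics.stdev(data)

print("population:", round(population_sd, 2))
print("sample:", round(sample_sd, 2))
```

The sample version is always a little larger than the population version for the same data, and the gap shrinks as the data set grows.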

Here's a link working through standard deviation as a concept: (https://www.youtube.com/watch?v=MRqtXL2WX2M)