1
00:00:00,140 --> 00:00:02,740
We could teach a whole stage about correlation.

2
00:00:02,740 --> 00:00:06,100
For sure, lots of analysts out there are constantly debating

3
00:00:06,100 --> 00:00:09,640
how to think about correlation versus causation.

4
00:00:09,640 --> 00:00:11,260
For the sake of this course,

5
00:00:11,260 --> 00:00:16,520
just know that correlation measures the relationship between two or more things.

6
00:00:16,520 --> 00:00:20,349
It is a measurement of the interdependence of variable quantities.

7
00:00:21,470 --> 00:00:25,950
These numbers are expressed in a range between 1 and -1.

8
00:00:25,950 --> 00:00:30,060
So, if you calculate the correlation of two variables, and

9
00:00:30,060 --> 00:00:33,950
the correlation is 1, that is a perfect correlation.

10
00:00:35,150 --> 00:00:39,633
If it is around 0.5, that's a fairly low correlation.

11
00:00:39,633 --> 00:00:43,328
At 0 there's no correlation.

12
00:00:43,328 --> 00:00:47,510
-1 will be a perfect negative correlation.

13
00:00:49,050 --> 00:00:53,610
Let's calculate the correlation of height and weight in our dataset.

14
00:00:53,610 --> 00:00:57,630
So here we are back in our spreadsheet that has data on height and weight.

15
00:00:58,940 --> 00:01:05,217
You can use a CORREL function to calculate the correlation between height and weight.

16
00:01:05,217 --> 00:01:11,177
So, I'm going to hit =, type in COR, there we go,

17
00:01:11,177 --> 00:01:15,787
we got the CORREL function right there.

18
00:01:15,787 --> 00:01:21,870
Parentheticals, array 1, I'm gonna go over to B2.

19
00:01:21,870 --> 00:01:25,520
I'm gonna hold down Shift + Ctrl and press the down arrow.

20
00:01:27,110 --> 00:01:28,010
Then I'm gonna do comma.

21
00:01:30,030 --> 00:01:33,870
I press left arrow, now I'm in C2, I'm gonna hold down Shift + Ctrl and

22
00:01:33,870 --> 00:01:35,560
press down again.

23
00:01:35,560 --> 00:01:41,106
And then I'm gonna close the parentheticals, Return,

24
00:01:43,032 --> 00:01:48,850
And here we go, 0.85498989.

25
00:01:48,850 --> 00:01:52,439
Well, that's closer to 1 than 0.5, so

26
00:01:52,439 --> 00:01:56,513
the correlation in our dataset between height and

27
00:01:56,513 --> 00:02:01,076
weight is 0.85, which is a positive correlation.

28
00:02:01,076 --> 00:02:06,650
It's not super strong or perfect, but it's certainly not weak.

29
00:02:07,690 --> 00:02:09,090
That makes sense for height and weight.
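
If you'd like to see roughly what CORREL is doing behind the scenes, here is a minimal Python sketch. CORREL returns the Pearson correlation coefficient of the two ranges you give it, and the function below computes that same quantity by hand. The function name pearson_correlation and the sample heights and weights are made up for illustration; they are not the course dataset, so the result will not match the 0.85 from the spreadsheet.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length columns."""
    if len(xs) != len(ys):
        raise ValueError("both columns must have the same number of values")
    if len(xs) < 2:
        raise ValueError("need at least two pairs of values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Deviation of each value from its column mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # How much the two columns move together...
    co_movement = sum(a * b for a, b in zip(dx, dy))
    # ...scaled by how much each column varies on its own.
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

    if spread == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return co_movement / spread


# Hypothetical sample values, NOT the height and weight data from the video.
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 155, 160, 185]

r = pearson_correlation(heights, weights)
print(round(r, 4))
```

The result always lands between -1 and 1, which is the range described earlier. Two columns that rise together in a perfectly straight line give 1, and two that move in exactly opposite directions give -1.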