1 00:00:00,150 --> 00:00:02,760 In the last section, we talked about filtering rows, 2 00:00:02,760 --> 00:00:05,740 where we only show ties that match a certain condition. 3 00:00:05,740 --> 00:00:10,010 When grouping rows, it's about combining ties that match a condition. 4 00:00:10,010 --> 00:00:14,190 To help illustrate the subtle difference, let's look at some graphics. 5 00:00:14,190 --> 00:00:18,700 Filtering is a type of selection that refers to restricting the result set 6 00:00:18,700 --> 00:00:22,450 to contain only those elements that satisfy a specified condition. 7 00:00:24,040 --> 00:00:28,480 Grouping refers to the operation of putting data into groups so 8 00:00:28,480 --> 00:00:31,030 that the elements in each group share a common attribute. 9 00:00:31,030 --> 00:00:33,690 The following illustration shows 10 00:00:33,690 --> 00:00:36,710 the results of grouping a sequence of characters. 11 00:00:36,710 --> 00:00:39,010 The key for each group is the character. 12 00:00:39,010 --> 00:00:42,540 Now let's move on to writing functions for grouping. 13 00:00:42,540 --> 00:00:49,070 For example, we can look at how prices of Gucci ties compare to J.Crew prices. 14 00:00:49,070 --> 00:00:52,270 While we're at it, we can look at maximum prices for each brand. 15 00:00:52,270 --> 00:00:57,120 So we filter by the brand, and then we find the maximum and the averages. 16 00:00:58,580 --> 00:00:59,843 Let's create a new file again. 17 00:00:59,843 --> 00:01:05,905 [BLANK_AUDIO] 18 00:01:05,905 --> 00:01:07,014 And from- 19 00:01:07,014 --> 00:01:12,345 [BLANK_AUDIO] 20 00:01:12,345 --> 00:01:14,969 Let's create a variable. 21 00:01:14,969 --> 00:01:16,429 [BLANK_AUDIO] 22 00:01:16,429 --> 00:01:17,453 Gucci ties. 23 00:01:17,453 --> 00:01:27,453 [BLANK_AUDIO] 24 00:01:55,780 --> 00:02:00,678 Okay, now we have filtered data samples of each brand that we're interested 25 00:02:00,678 --> 00:02:04,654 in looking at, and now we can find the maximum values for each.
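The code typed on screen is not captured in the transcript, so here is a minimal sketch of the difference between filtering and grouping described above. The sample rows, the column layout, and the helper names `filter_rows` and `group_rows` are all assumptions made for illustration; they are not the course's data or functions.

```python
# Hypothetical sample data: the real course file and its columns are not
# shown in the transcript, so this layout is an assumption.
rows = [
    ["brand", "print", "price"],   # header row
    ["Gucci", "solid", "200"],
    ["J.Crew", "stripe", "60"],
    ["Gucci", "paisley", "250"],
    ["J.Crew", "solid", "50"],
]


def filter_rows(data, column, value):
    """Filtering: keep the header plus only the rows whose column equals value."""
    header, body = data[0], data[1:]
    index = header.index(column)
    return [header] + [row for row in body if row[index] == value]


def group_rows(data, column):
    """Grouping: put every row into a bucket keyed by its value in column."""
    header, body = data[0], data[1:]
    index = header.index(column)
    groups = {}
    for row in body:
        groups.setdefault(row[index], []).append(row)
    return groups


# Filtering produces one restricted sample per call.
gucci_ties = filter_rows(rows, "brand", "Gucci")
jcrew_ties = filter_rows(rows, "brand", "J.Crew")

# Grouping keeps every row and organizes them all by the shared attribute.
by_brand = group_rows(rows, "brand")

print(len(gucci_ties) - 1, "Gucci rows;", len(jcrew_ties) - 1, "J.Crew rows")
print(sorted(by_brand))
```

Filtering throws away the rows that do not match, while grouping keeps all of them under their key, which is the subtle difference the graphics illustrate.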
26 00:02:04,654 --> 00:02:14,654 [BLANK_AUDIO] 27 00:02:22,080 --> 00:02:26,923 Remember how we have this find_max_min from before? 28 00:02:26,923 --> 00:02:28,400 We can reuse that as well. 29 00:02:28,400 --> 00:02:35,527 [BLANK_AUDIO] 30 00:02:35,527 --> 00:02:41,159 Actually, we're going to leave that blank because the default is max. 31 00:02:41,159 --> 00:02:45,689 And let's print out this string: 32 00:02:45,689 --> 00:02:50,226 message equals maximum brand 33 00:02:50,226 --> 00:02:52,764 [BLANK_AUDIO] 34 00:02:52,764 --> 00:02:59,676 tie price is, .format, and then let's pass in the arguments. 35 00:02:59,676 --> 00:03:09,676 [BLANK_AUDIO] 36 00:03:12,347 --> 00:03:14,182 So we wanna pass in the brand, 37 00:03:14,182 --> 00:03:16,230 which would be- 38 00:03:16,230 --> 00:03:18,736 [BLANK_AUDIO] 39 00:03:18,736 --> 00:03:21,078 Actually, let's do it that way. 40 00:03:21,078 --> 00:03:23,078 Let's have a string, and 41 00:03:23,078 --> 00:03:28,930 then when we print, that's where we can pass in the values. 42 00:03:28,930 --> 00:03:33,810 So we have message.format, and we're passing a brand. 43 00:03:33,810 --> 00:03:39,832 Here we're interested in Gucci first, and 44 00:03:39,832 --> 00:03:42,594 also max_gucci. 45 00:03:42,594 --> 00:03:43,737 Let's see if that works. 46 00:03:43,737 --> 00:03:53,737 [BLANK_AUDIO] 47 00:03:55,929 --> 00:04:01,155 Okay, let's do this just to make sure. 48 00:04:01,155 --> 00:04:02,051 Hm. 49 00:04:02,051 --> 00:04:07,362 [BLANK_AUDIO] 50 00:04:07,362 --> 00:04:11,490 Oh, great. Okay, so this error happened because 51 00:04:11,490 --> 00:04:16,393 the price label is actually in the header, so let's remove that. 52 00:04:16,393 --> 00:04:22,192 [BLANK_AUDIO] 53 00:04:22,192 --> 00:04:27,525 And we can still use max min. 54 00:04:27,525 --> 00:04:29,189 And let's see. 55 00:04:29,189 --> 00:04:31,596 [BLANK_AUDIO] 56 00:04:31,596 --> 00:04:37,249 Oops, let's just use this.
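Since the editor contents are not part of the transcript, the following is only a sketch of what this step might look like. The function name `find_max_min`, the variable `max_gucci`, and the idea of a default of max come from the video; the signature, the `has_header` flag, the column index, and the sample rows are assumptions. The exact error seen on screen is not shown either; a `ValueError` from converting the header label to a number is one plausible form of it.

```python
def find_max_min(data, column, which="max", has_header=True):
    """Return the largest (default) or smallest numeric value in a column.

    The name comes from the video; this signature is an assumption.
    """
    body = data[1:] if has_header else data
    values = [float(row[column]) for row in body]
    return max(values) if which == "max" else min(values)


# Hypothetical filtered sample; the real course data is not in the transcript.
gucci_ties = [
    ["brand", "print", "price"],   # header row
    ["Gucci", "solid", "200"],
    ["Gucci", "paisley", "250"],
]

# Treating the header row as data fails, because the label "price"
# cannot be converted to a number.
try:
    find_max_min(gucci_ties, 2, has_header=False)
except ValueError as error:
    print("error:", error)

# Skipping the header fixes it; "which" is left out because max is the default.
max_gucci = find_max_min(gucci_ties, 2)

# Define the template once, then fill it in when printing.
message = "Maximum {} tie price is ${:.2f}"
print(message.format("Gucci", max_gucci))
```

Keeping the template in a variable and calling `.format` at print time is what lets the same string be reused for another brand.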
57 00:04:37,249 --> 00:04:41,700 [BLANK_AUDIO] 58 00:04:41,700 --> 00:04:42,447 Oh no. 59 00:04:42,447 --> 00:04:45,391 So I guess J.Crew does not have a space. 60 00:04:45,391 --> 00:04:46,841 So if we do that. 61 00:04:46,841 --> 00:04:48,531 [BLANK_AUDIO] 62 00:04:48,531 --> 00:04:56,110 Remove this piece, clear the console so we can more easily read it. All right, so 63 00:04:56,110 --> 00:05:01,761 the maximum Gucci tie price is \$545. Oh my God, that's 64 00:05:01,761 --> 00:05:06,655 pretty expensive. Let's try another brand. 65 00:05:06,655 --> 00:05:17,274 [BLANK_AUDIO] 66 00:05:17,274 --> 00:05:19,860 Let's see how J.Crew's ties compare. 67 00:05:19,860 --> 00:05:22,530 And it's a little bit less expensive. 68 00:05:24,010 --> 00:05:26,831 Okay. So now let's try averages, because if 69 00:05:26,831 --> 00:05:32,072 the maximum value varies so much, maybe the average also varies a lot. 70 00:05:32,072 --> 00:05:37,160 [BLANK_AUDIO] 71 00:05:37,160 --> 00:05:42,702 So we have average_gucci, which is equal to, 72 00:05:42,702 --> 00:05:47,575 find_average is the function name, find_average. 73 00:05:55,335 --> 00:06:00,238 Let's look quickly again at the functions. 74 00:06:00,238 --> 00:06:04,894 It was 75 00:06:06,770 --> 00:06:11,120 just the data sample and whether or not there are headers. 76 00:06:11,120 --> 00:06:15,701 Let's click that. So we want to say 77 00:06:15,701 --> 00:06:18,917 true, there are headers. 78 00:06:18,917 --> 00:06:21,327 I wanna print the message. 79 00:06:21,327 --> 00:06:28,913 [BLANK_AUDIO] 80 00:06:28,913 --> 00:06:35,504 And let's actually change this so that it's three arguments passed in. 81 00:06:35,504 --> 00:06:43,113 [BLANK_AUDIO] 82 00:06:43,113 --> 00:06:48,260 And then here, instead of maximum, we could say average. 83 00:06:48,260 --> 00:06:53,020 And we could reuse all of that. 84 00:06:53,020 --> 00:06:53,878 Cool, right?
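As a rough illustration of the step just described, here is one way the average and the three-argument message could fit together. The transcript only says that `find_average` takes the data sample and a flag for whether there are headers; the parameter names, the extra `price_index` argument, and the sample rows below are assumptions, and the numbers are made up rather than taken from the course data.

```python
def find_average(data_sample, has_header=False, price_index=2):
    """Average the price column of a data sample.

    price_index is an assumption added so the sketch is self-contained.
    """
    rows = data_sample[1:] if has_header else data_sample
    prices = [float(row[price_index]) for row in rows]
    return sum(prices) / len(prices)


# Hypothetical filtered samples, one per brand.
gucci_ties = [
    ["brand", "print", "price"],
    ["Gucci", "solid", "200"],
    ["Gucci", "paisley", "250"],
]
jcrew_ties = [
    ["brand", "print", "price"],
    ["J.Crew", "stripe", "60"],
    ["J.Crew", "solid", "50"],
]

# True: these samples do have a header row.
average_gucci = find_average(gucci_ties, True)
average_jcrew = find_average(jcrew_ties, True)

# Three placeholders: the statistic, the brand, and the value, so the same
# template serves "Maximum" and "Average" alike.
message = "{} {} tie price is ${:.2f}"
print(message.format("Average", "Gucci", average_gucci))
print(message.format("Average", "J.Crew", average_jcrew))
```

Moving the word "Maximum" out of the template and into an argument is the design choice that makes the message reusable.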
85 00:06:53,878 --> 00:06:56,609 [BLANK_AUDIO] 86 00:06:56,609 --> 00:06:59,041 Average price is \$165. 87 00:06:59,041 --> 00:07:02,411 Let's see what the J.Crew average price may be. 88 00:07:02,411 --> 00:07:12,594 [BLANK_AUDIO] 89 00:07:12,594 --> 00:07:14,640 There we go. 90 00:07:14,640 --> 00:07:19,640 And here, we wanna use jcrew_ties, and 91 00:07:19,640 --> 00:07:27,780 here, in the message, we wanna print out J.Crew and use the average J.Crew price. 92 00:07:29,680 --> 00:07:32,972 Okay. [BLANK_AUDIO] 93 00:07:32,972 --> 00:07:33,497 Oh, so 94 00:07:33,497 --> 00:07:39,193 the average price of a J.Crew tie is lower than the average price of a Gucci tie. 95 00:07:39,193 --> 00:07:43,370 You can continue doing this and also compare the prints 96 00:07:43,370 --> 00:07:45,945 and other information. 97 00:07:45,945 --> 00:07:48,145 So, let's try one more exercise. 98 00:07:48,145 --> 00:07:52,585 Let's try printing out the print. 99 00:07:52,585 --> 00:07:55,435 So, paisley, stripe, solid, or print. 100 00:07:55,435 --> 00:07:56,775 Let's see what that looks like. 101 00:07:56,775 --> 00:08:06,775 [BLANK_AUDIO] 102 00:08:50,047 --> 00:08:52,910 And lastly, we wanna look at the solid ties. 103 00:08:53,910 --> 00:08:58,095 So we have four different patterns to look at. 104 00:08:58,095 --> 00:09:08,095 [BLANK_AUDIO] 105 00:09:19,087 --> 00:09:19,621 Oops. 106 00:09:19,621 --> 00:09:21,327 I have a typo here. 107 00:09:21,327 --> 00:09:26,258 [BLANK_AUDIO] 108 00:09:26,258 --> 00:09:30,056 All right, so let's see what happens when we print that out. 109 00:09:30,056 --> 00:09:35,826 [BLANK_AUDIO] 110 00:09:35,826 --> 00:09:40,315 So we didn't finish the parenthesis; it needs to match. 111 00:09:42,485 --> 00:09:43,735 Okay, so there we go. 112 00:09:43,735 --> 00:09:44,449 It worked. 113 00:09:44,449 --> 00:09:49,740 Now let's actually print out all four of them.
114 00:09:49,740 --> 00:09:59,740 [BLANK_AUDIO] 115 00:10:03,832 --> 00:10:08,653 Instead of the message from before, we wanna use tab spacing. 116 00:10:08,653 --> 00:10:18,653 [BLANK_AUDIO] 117 00:10:47,776 --> 00:10:52,972 Whoops, we don't want a piece 118 00:10:52,972 --> 00:10:58,384 of that. We want to copy this, and 119 00:10:58,384 --> 00:11:02,724 the same order of prints. 120 00:11:02,724 --> 00:11:12,724 [BLANK_AUDIO] 121 00:11:18,649 --> 00:11:21,614 Okay, now let's print this out and see what it looks like. 122 00:11:21,614 --> 00:11:31,614 [BLANK_AUDIO] 123 00:11:35,994 --> 00:11:42,799 All right, we only want two arguments. 124 00:11:42,799 --> 00:11:45,615 We only want two arguments here, so we actually want to remove this. 125 00:11:45,615 --> 00:11:49,756 [BLANK_AUDIO] 126 00:11:49,756 --> 00:11:51,548 And we're gonna print it again. 127 00:11:51,548 --> 00:11:52,687 There we go. So 128 00:11:52,687 --> 00:11:56,519 we have the tab spacing making the alignment look nicer. 129 00:11:56,519 --> 00:12:01,235 What we can see from this is that print ties are not cheaper 130 00:12:01,235 --> 00:12:02,600 than solid ties.
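To make the tab-spaced printout concrete, here is a small sketch that averages prices per pattern and prints one tab-separated line for each of the four patterns named in the video. It groups the rows in one pass rather than filtering four times, which is a swapped-in approach and not necessarily what was typed on screen. The helper name `average_price_by`, the column names, and the sample rows are assumptions, and the numbers are invented, so the output does not reproduce the course's results.

```python
# Hypothetical sample data; prices are made up for illustration.
rows = [
    ["brand", "print", "price"],   # header row
    ["Gucci", "solid", "200"],
    ["J.Crew", "stripe", "60"],
    ["Gucci", "paisley", "250"],
    ["J.Crew", "solid", "50"],
    ["Gucci", "print", "180"],
    ["J.Crew", "print", "90"],
]


def average_price_by(data, key_column, price_column):
    """Group prices by the value in key_column and average each group."""
    header, body = data[0], data[1:]
    key_index = header.index(key_column)
    price_index = header.index(price_column)
    prices = {}
    for row in body:
        prices.setdefault(row[key_index], []).append(float(row[price_index]))
    return {key: sum(values) / len(values) for key, values in prices.items()}


averages = average_price_by(rows, "print", "price")

# Two placeholders only: the pattern label, a tab, then the average.
line = "{}\t${:.2f}"
for pattern in ["paisley", "stripe", "solid", "print"]:
    print(line.format(pattern, averages[pattern]))
```

The tab character puts each value at the next tab stop, which lines the numbers up as long as the labels are of similar length.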