1 00:00:00,740 --> 00:00:04,820 Sometimes, it makes sense to view a row in the context of a subset of results from 2 00:00:04,820 --> 00:00:06,800 the query instead of all of the results. 3 00:00:08,070 --> 00:00:10,840 For instance, with our chess match dataset, 4 00:00:10,840 --> 00:00:14,020 you might want to know how one player's performance compares to other players 5 00:00:14,020 --> 00:00:17,430 on the same team rather than to the entire field. 6 00:00:17,430 --> 00:00:21,060 A partition clause divides the result set into categories 7 00:00:21,060 --> 00:00:24,770 according to conditions that you provide within the parenthesis in the over clause. 8 00:00:25,900 --> 00:00:30,350 These groupings are like panes within the window of data that you are querying over. 9 00:00:30,350 --> 00:00:33,230 Window functions on each row have access to the rows pane. 10 00:00:34,450 --> 00:00:37,670 Let's use partitions to find out which team had the highest winning rate 11 00:00:37,670 --> 00:00:39,100 in the tournament. 12 00:00:39,100 --> 00:00:42,790 We already have a field for the players winning percentage, which is calculated 13 00:00:42,790 --> 00:00:46,390 as the sum of checkmate divided by the count of match_id. 14 00:00:46,390 --> 00:00:50,668 For the window function version of this field, we'll start with the numerator, 15 00:00:50,668 --> 00:00:51,724 the number of wins. 16 00:00:51,724 --> 00:00:54,072 We'll use sum of sum of checkmate, 17 00:00:54,072 --> 00:00:58,619 just like we did to get the total number of matches in the tournament. 18 00:00:58,619 --> 00:01:01,030 We'll add over to make it a window function. 19 00:01:02,520 --> 00:01:06,867 Inside the parenthesis, will add partition by to start the partition clause. 20 00:01:08,861 --> 00:01:13,061 Here, we'll put the conditions that the function should use to separate the window 21 00:01:13,061 --> 00:01:14,600 into panes. 22 00:01:14,600 --> 00:01:18,120 We want the numerator to be the total matches won by the player's team. 23 00:01:18,120 --> 00:01:20,530 So we'll partition by player_team. 24 00:01:20,530 --> 00:01:24,795 We'll do another window function for the denominator, using sum, 25 00:01:24,795 --> 00:01:26,422 of count, of match_id. 26 00:01:28,260 --> 00:01:30,697 Again, over partition by player_team. 27 00:01:41,777 --> 00:01:45,435 We can order by this field to make it easier to see which teams win rate is 28 00:01:45,435 --> 00:01:46,470 highest. 29 00:01:46,470 --> 00:01:49,365 We do this by putting an order by clause in the overall query. 30 00:01:51,949 --> 00:01:55,584 Instead of typing the entire window function again to tell the order by 31 00:01:55,584 --> 00:01:57,679 clause which field to use for ordering, 32 00:01:57,679 --> 00:02:01,690 we can use the number of the field in the overall query as a shortcut. 33 00:02:01,690 --> 00:02:05,660 In our overall query, the window function we just created is the 12th field. 34 00:02:05,660 --> 00:02:08,020 I'm also going add desc 35 00:02:08,020 --> 00:02:12,170 to specify that it should order by that field in a descending fashion. 36 00:02:12,170 --> 00:02:15,190 That way, the team with the highest win rate will be at the top and 37 00:02:15,190 --> 00:02:16,930 it will be easier for us to see. 38 00:02:16,930 --> 00:02:17,520 Let's run that. 39 00:02:20,766 --> 00:02:24,490 Now each row shows the win rate of that player's team. 40 00:02:24,490 --> 00:02:28,720 We can see that the blue team had the most successful tournament overall. 41 00:02:28,720 --> 00:02:31,950 The partitioning conditions follow the same rules as the expressions 42 00:02:31,950 --> 00:02:33,880 that we passed to the window function. 43 00:02:33,880 --> 00:02:36,880 They must return one value per row of the result set. 44 00:02:36,880 --> 00:02:41,160 Here, we can partition by player_team because it's in the grouping statement. 45 00:02:41,160 --> 00:02:43,800 But we couldn't partition by checkmate, for example. 46 00:02:43,800 --> 00:02:47,760 We could partition by sum of checkmate, so that all players with the same number of 47 00:02:47,760 --> 00:02:50,070 wins would be grouped together, in one partition. 48 00:02:51,090 --> 00:02:54,310 This would be helpful if we were trying to figure out a tiebreaker between players, 49 00:02:54,310 --> 00:02:56,730 who all won the same number of matches. 50 00:02:56,730 --> 00:02:59,260 For example, you could break the tie by 51 00:02:59,260 --> 00:03:02,510 identifying the player that had the lowest ranking coming into the tournament, 52 00:03:02,510 --> 00:03:05,670 because that player performed better relative to their skill level. 53 00:03:05,670 --> 00:03:09,420 You could say they overcame the greatest odds and should win in the case of a tie. 54 00:03:09,420 --> 00:03:10,920 Let's look at the minimum ranking for 55 00:03:10,920 --> 00:03:13,640 players who won the same number of matches. 56 00:03:13,640 --> 00:03:18,899 We'll do that by creating another window function, with min of player_ranking. 57 00:03:21,021 --> 00:03:25,323 We can pass player_ranking directly to the min function because it's in the group 58 00:03:25,323 --> 00:03:25,945 by clause. 59 00:03:25,945 --> 00:03:29,635 We'll add over to make it a window function. 60 00:03:29,635 --> 00:03:32,324 And in this case, in our partition clause, 61 00:03:32,324 --> 00:03:36,420 we're going to partition by the sum of checkmate. 62 00:03:36,420 --> 00:03:40,008 And that's going to group together all of the players that have the same number of 63 00:03:40,008 --> 00:03:47,194 wins We'll order the results by the number of matches won so 64 00:03:47,194 --> 00:03:51,210 that we have all the players who tied for first at the top. 65 00:03:51,210 --> 00:03:57,200 To do that, we'll replace the 12 from last time by sum of checkmate. 66 00:03:57,200 --> 00:04:00,010 We also have sum of checkmate in our original query. 67 00:04:00,010 --> 00:04:04,070 So if we wanted, we could go up here and find that it's the 5th field and 68 00:04:04,070 --> 00:04:05,670 use 5 instead. 69 00:04:05,670 --> 00:04:10,820 If we look at players who won all of their matches, here it's the top four rows, 70 00:04:10,820 --> 00:04:14,130 we find that Robert had the lowest ranking. 71 00:04:14,130 --> 00:04:18,290 And if we look at the results of our window function, we see that for 72 00:04:18,290 --> 00:04:21,610 all the players who won all of their matches, the tiebreaker number, 73 00:04:21,610 --> 00:04:27,010 which is the minimum ranking for all those players, matches Robert's ranking of 2209. 74 00:04:27,010 --> 00:04:30,820 If you want to partition by multiple items, separate them with commas. 75 00:04:30,820 --> 00:04:33,901 Let's find the tiebreaker within each team. 76 00:04:33,901 --> 00:04:36,525 We'll do that by duplicating the tiebreaker field and 77 00:04:36,525 --> 00:04:38,590 calling it tiebreaker by team. 78 00:04:38,590 --> 00:04:42,922 In the partition clause, we'll add a partition by player_team, 79 00:04:42,922 --> 00:04:46,041 right after the partition by the number of wins. 80 00:04:48,401 --> 00:04:52,540 Let's take another look at the players who won all of their matches. 81 00:04:52,540 --> 00:04:54,940 Again, these top four rows. 82 00:04:54,940 --> 00:04:59,270 Two of them, Emily and Kate, were both on the yellow team. 83 00:04:59,270 --> 00:05:00,730 If we look at their rankings, 84 00:05:00,730 --> 00:05:04,330 we can see that Emily went into the tournament with the lower player ranking. 85 00:05:04,330 --> 00:05:09,520 And if we look over at the results of our window function, for the team tiebreaker. 86 00:05:09,520 --> 00:05:12,025 If we look at the results for those two rows, for 87 00:05:12,025 --> 00:05:14,955 the two players who won all of their matches on the yellow team. 88 00:05:14,955 --> 00:05:17,785 The tiebreaker matches Emily's ranking. 89 00:05:17,785 --> 00:05:20,925 Partitions are similar to the group by clause that we use with normal 90 00:05:20,925 --> 00:05:22,445 aggregating functions. 91 00:05:22,445 --> 00:05:24,935 When we group rows to use a normal function, 92 00:05:24,935 --> 00:05:28,445 the rows that we're calculating over are all collapsed into one record. 93 00:05:28,445 --> 00:05:30,285 That happens in the original report, 94 00:05:30,285 --> 00:05:32,960 where all of the matches are collapsed by player. 95 00:05:32,960 --> 00:05:36,500 Window functions, on the other hand, are different and more powerful. 96 00:05:36,500 --> 00:05:39,220 They leave the underlying records from the query intact. 97 00:05:39,220 --> 00:05:42,150 This means that we can use different partitions to compare different groups of 98 00:05:42,150 --> 00:05:43,940 data in the same report. 99 00:05:43,940 --> 00:05:47,905 In our example, we created window functions with three different partitions 100 00:05:47,905 --> 00:05:50,235 and included them all in the same query. 101 00:05:50,235 --> 00:05:51,675 Without window functions, 102 00:05:51,675 --> 00:05:55,295 we need multiple queries, each with its own group by clause. 103 00:05:55,295 --> 00:05:58,765 In other words, window functions let us write one report instead of three.