Heads up! To view this whole video, sign in with your Courses Plus account or enroll in your free 7-day trial. Sign In Enroll
Preview
Start a free Courses trial
to watch this video
The OVER keyword lets you transform an aggregate function into a window function. This lets an aggregate function see, and access, all of the results of a query.
Definitions
OVER — keyword used to create window functions. Follow with empty parentheses to access the entire result set.
Code Example
SELECT PLAYER_NAME,
PLAYER_RANKING,
PLAYER_TEAM,
COUNT(MATCH_ID) as MATCHES_PLAYED,
SUM(CHECKMATE) as MATCHES_WON,
SUM(CHECKMATE)/COUNT(MATCH_ID) as WIN_RATE,
AVG(PLAYER_RANKING) OVER () as AVG_RANKING,
PLAYER_RANKING - AVG(PLAYER_RANKING) OVER () as DIFF_FROM_AVERAGE,
SUM(SUM(CHECKMATE)) OVER () as TOTAL_MATCHES,
SUM(MAX(CHECKMATE)) OVER () as PLAYERS_WITH_WINS,
MAX(SUM(CHECKMATE)) OVER () as HIGHEST_WINS
FROM treehouse.chess_data_matches
GROUP BY PLAYER_NAME, PLAYER_RANKING,PLAYER_TEAM
When we use aggregate functions like sum,
count, or
0:00
average in a query,
they calculate a result using
0:04
the records that are associated with
their row by the GROUP BY statement.
0:07
They can't access the values of any
records outside of their grouping.
0:11
So the values that they return are placed
in the context of the overall results
0:15
with the over keyword
we can open a Window so
0:19
that these aggregate functions can see and
access all the results of the query.
0:22
This transforms a regular aggregate
function into a Window function.
0:27
Using just the over keyword,
0:31
we can add information about the entire
tournament to our existing report.
0:33
To answer our first question,
0:37
how does each player's ranking
compare to the overall field.
0:39
We calculate the average ranking of
all the players in the tournament.
0:42
To do this,
we'll add a new field to the report.
0:45
Let's start out by adding
the AVG(PLAYER_RANKING) AS AVG_RANKING.
0:49
This gives us a different number for
1:03
each player because the report
is grouped by player.
1:05
The average function returns
the player's own ranking on each row.
1:08
That's not what we want,
we want the average across all the rows.
1:12
Let's have the over keyword to give
the function access to all the rows
1:15
then we'll add empty parentheses.
1:20
The parentheses create a space.
1:22
To give the window function
more instructions for
1:24
how to aggregate which we'll cover later.
1:26
For now, we'll leave those options blank.
1:29
So that we're querying
over the whole result set.
1:31
Now, when we run the report, it's easy
to see whether each player is above or
1:34
below the average ranking
which is 1,738.7.
1:39
We can see this directly by subtracting
that average from the player's ranking.
1:42
Let's make a new field to do that.
1:47
We'll start out with
the player's own ranking.
1:51
Player ranking.
1:54
Then we'll subtract the window
function that we just created.
1:57
Average of player ranking
over the entire result set.
2:01
This tells us how far above or
below average each player's ranking is.
2:14
It's important to know that unlike regular
aggregating functions, window functions do
2:21
not have access to all of the records in
the tables that you're selecting from.
2:25
Instead, they have access to
the results of the grouping
2:29
that you've specified in
the queries group by clause.
2:32
This means that if your
query has a group by clause.
2:35
You can only use the grouping fields
directly while any other fields
2:39
have to be aggregated inside the window
function aggregation for example, for
2:42
our second question how many total
matches were played in the tournament.
2:47
We want to add up the check me
field over the entire result set
2:51
If we try to do that directly using some
of Checkmate over empty parentheses,
2:54
it will not run because Checkmate
is not a grouping field.
2:59
It's helpful to think of
the window function as querying
3:03
over the results of the existing query.
3:06
In other words the window only looks
on to the rows returned by the query
3:08
when the function looks at each row of
the result set it has to get one value for
3:12
that row.
3:16
To use and its calculation.
3:17
This example doesn't work because
the query results don't include
3:19
the check mate field.
3:22
Each row represents
a collection of records with
3:23
multiple values for check mate.
3:26
If we want to add up all of those values,
we have to first sum them up for
3:28
the row using sum of Checkmate.
3:32
And then we can pass that
aggregation to the window function.
3:34
We can do this by adding a sum call
3:37
inside of the sum that belongs
to the window function.
3:41
Overall, we get sum of sum of
checkmate over empty parentheses.
3:44
By summing up the sums of
all the players matches one,
3:49
we find that there were 90 total
matches in the tournament.
3:52
You can make some match inner and
outer aggregations.
3:55
For example If we pass (MAX(CHECKMATE)
to the window function instead of sum,
3:59
we'll get the number PLAYERS_WITH_WINS
4:03
If we reverse that and
take the MAX(SUM(CHECKMATE),
4:19
we'll get the highest number
of wins by any player.
4:22
Combining different aggregations and
window functions multiplies the power of
4:28
the original aggregation, so we can answer
more complex questions about the dataset.
4:31
You need to sign up for Treehouse in order to download course files.
Sign up