1 00:00:00,650 --> 00:00:04,370 We've counted rows but what about doing something more useful like 2 00:00:04,370 --> 00:00:07,650 totaling up all the values in a column? 3 00:00:07,650 --> 00:00:10,920 We're still working with the e-commerce site database. 4 00:00:10,920 --> 00:00:15,470 The CEO wants to reward the most loyal customer with a gift card. 5 00:00:15,470 --> 00:00:19,120 Let's find out which customer has spent the most money on our site. 6 00:00:21,360 --> 00:00:24,000 We're going to use the SUM function. 7 00:00:24,000 --> 00:00:26,860 The SUM function sums up any numeric column. 8 00:00:27,890 --> 00:00:32,020 It can be used by conjunction with the GROUP BY keywords to calculate the sum 9 00:00:32,020 --> 00:00:34,030 total of a column. 10 00:00:34,030 --> 00:00:36,580 When you use the SUM with a GROUP BY keywords, 11 00:00:36,580 --> 00:00:41,270 you can generate total values of differently grouped rows. 12 00:00:41,270 --> 00:00:44,370 For example you could find out how many items 13 00:00:44,370 --> 00:00:47,550 you have in stock within each category in your store. 14 00:00:48,710 --> 00:00:51,060 Let's take a look at the orders table. 15 00:00:52,090 --> 00:00:57,670 The main columns we'll be looking at are the user_id and the cost columns. 16 00:00:57,670 --> 00:01:02,160 The cost column is the price of the item at the time of sale. 17 00:01:02,160 --> 00:01:07,793 Let's use sum to find the total of all values in the cost column. 18 00:01:11,426 --> 00:01:16,690 Now this is the total revenue that the online store has generated. 19 00:01:16,690 --> 00:01:20,600 This string of numbers after the decimal point is there because 20 00:01:20,600 --> 00:01:25,330 of how numbers with fractions are handled by computers. 21 00:01:25,330 --> 00:01:29,510 Ignore this for now, but we'll talk about how to clean this up later on. 22 00:01:29,510 --> 00:01:34,650 The online store has generated over nine thousand dollars in revenue. 23 00:01:34,650 --> 00:01:38,690 How do you think we can get the total per customer? 24 00:01:38,690 --> 00:01:41,724 We can group by the user ID. 25 00:01:43,569 --> 00:01:48,995 This will make sure that the aggregate function of sum is one on the results for 26 00:01:48,995 --> 00:01:50,820 each user. 27 00:01:50,820 --> 00:01:53,010 So let's select the user ID. 28 00:01:53,010 --> 00:01:57,400 This is so that we can identify each user. 29 00:01:57,400 --> 00:02:00,400 We could create an alias too to make it more human readable. 30 00:02:03,080 --> 00:02:08,180 When we run this now, we get everybody's total expenditures, but 31 00:02:08,180 --> 00:02:11,090 remember, the boss wants to find the best customer. 32 00:02:12,250 --> 00:02:19,300 To get the person who spent the most, we need to ORDER BY total_spent DESC. 33 00:02:21,220 --> 00:02:25,558 Finally, we want to limit it by 1. 34 00:02:26,719 --> 00:02:29,277 Let's see who this user is. 35 00:02:41,160 --> 00:02:45,070 We can give the details to the CEO to send a personal message. 36 00:02:47,110 --> 00:02:50,420 Upon receiving the information of the most loyal customer, 37 00:02:50,420 --> 00:02:52,750 the CEO wants to be more generous. 38 00:02:52,750 --> 00:02:57,490 The CEO wants to give a gift out to anyone that spent more than $250. 39 00:02:57,490 --> 00:02:59,840 Let's modify our query now. 40 00:03:01,860 --> 00:03:04,360 Okay, this should be simple, right? 41 00:03:04,360 --> 00:03:08,952 After where we have the table name, orders, we can use WHERE 42 00:03:08,952 --> 00:03:14,760 total_spend > 250. 43 00:03:14,760 --> 00:03:21,290 We can remove the limit too now. 44 00:03:23,650 --> 00:03:26,265 When we run this, what result do we get? 45 00:03:26,265 --> 00:03:29,110 Uh-oh, error. 46 00:03:29,110 --> 00:03:30,750 Misuse of aggregate sum. 47 00:03:31,920 --> 00:03:35,750 This is because where conditions filtered the result set or 48 00:03:35,750 --> 00:03:38,360 rows, before grouping things together. 49 00:03:39,590 --> 00:03:44,590 It's pointless to group by a user ID or any column if you're going to 50 00:03:44,590 --> 00:03:49,970 filter out a certain group of users or values based on a condition. 51 00:03:49,970 --> 00:03:53,730 Remember, the sum is working on the aggregated values. 52 00:03:54,940 --> 00:03:59,730 In this case, total_spend is calculated after the grouping. 53 00:04:00,840 --> 00:04:05,360 So, SQL doesn't know what to do with it when you put it in a where clause. 54 00:04:06,410 --> 00:04:07,830 So the question is, 55 00:04:07,830 --> 00:04:13,660 is there a way to apply conditions to values that have been aggregated? 56 00:04:13,660 --> 00:04:15,780 And the answer is yes. 57 00:04:15,780 --> 00:04:17,200 Using the having key word 58 00:04:18,660 --> 00:04:22,700 You use the having keyword after the group by keywords. 59 00:04:22,700 --> 00:04:26,150 Since the checking happens to the aggregate values. 60 00:04:26,150 --> 00:04:28,740 The conditions that you use with the having keyword 61 00:04:28,740 --> 00:04:32,840 are exactly the same as what you'd use with the where keyword. 62 00:04:32,840 --> 00:04:37,750 When we run this now, we now get all the user IDs 63 00:04:37,750 --> 00:04:42,080 of the people who spent over $250. 64 00:04:42,080 --> 00:04:47,400 To recap, we used the SUM aggregate function to total up a numeric column. 65 00:04:47,400 --> 00:04:49,950 In our case, it was the cost column. 66 00:04:49,950 --> 00:04:54,690 We saw that it could be used with GROUP BY to get totals of groups of rows. 67 00:04:54,690 --> 00:05:01,510 And finally, we used the HAVING keyword to use conditions on aggregate functions. 68 00:05:01,510 --> 00:05:05,480 It had to be used after the GROUP BY keywords. 69 00:05:05,480 --> 00:05:08,270 It's also used before ORDER BY keywords. 70 00:05:10,260 --> 00:05:12,800 You're getting to be a SQL powerhouse. 71 00:05:12,800 --> 00:05:14,840 To make sure you've taken it all in, 72 00:05:14,840 --> 00:05:18,600 I've created some more challenging assessments for you to take next. 73 00:05:18,600 --> 00:05:22,820 In the teacher's notes, I've included cheat sheets to past SQL courses for 74 00:05:22,820 --> 00:05:24,620 you to use as reference. 75 00:05:24,620 --> 00:05:25,880 It's not cheating. 76 00:05:25,880 --> 00:05:28,710 In fact, many developers use cheat sheets and 77 00:05:28,710 --> 00:05:31,060 documentation to perform their jobs daily.