1 00:00:01,080 --> 00:00:04,713 It is a common pattern in programming to take one data type and 2 00:00:04,713 --> 00:00:06,860 convert it into another data type. 3 00:00:07,920 --> 00:00:13,180 Sets fit into this pattern when it comes to pre-processing data. 4 00:00:13,180 --> 00:00:17,663 For example, let's take a list of random integers 5 00:00:17,663 --> 00:00:22,051 where some of the values are repeated in the list. 6 00:00:28,596 --> 00:00:31,910 So we've got some threes, some ones, some twos, and 7 00:00:31,910 --> 00:00:34,301 they're in no particular order here. 8 00:00:37,410 --> 00:00:42,083 Now, imagine trying to remove the duplicates from this list 9 00:00:42,083 --> 00:00:43,740 without using sets. 10 00:00:44,810 --> 00:00:46,910 We can work through this together: 11 00:00:46,910 --> 00:00:50,700 we can pseudocode the steps required to remove the duplicates. 12 00:01:04,303 --> 00:01:09,727 Pause me, and I'll let you try it first, and then I'll show you how I did it. 13 00:01:13,080 --> 00:01:14,540 How did it go? 14 00:01:14,540 --> 00:01:16,400 Was it challenging? 15 00:01:16,400 --> 00:01:17,410 Here's what I came up with. 16 00:01:19,290 --> 00:01:23,071 My strategy is to iterate through the list of integers and 17 00:01:23,071 --> 00:01:26,237 then collect the unique values into a new list. 18 00:01:26,237 --> 00:01:28,650 First, I sort the original list. 19 00:01:28,650 --> 00:01:33,250 It's often easier to process data if you sort it before doing more calculations. 20 00:01:34,820 --> 00:01:37,380 Then I loop through the sorted list. 21 00:01:39,000 --> 00:01:43,371 Since the list is sorted, every time the loop moves to the next element, 22 00:01:43,371 --> 00:01:45,890 I can compare it to the previous element. 23 00:01:47,490 --> 00:01:52,080 On the first step of the loop, I just put that number into a new 24 00:01:52,080 --> 00:01:56,620 list because there's nothing to compare it to when it's just the first one.
25 00:01:57,940 --> 00:02:03,810 For every number after that, I compare it to the last number in the new list. 26 00:02:03,810 --> 00:02:08,109 If the values are equal, then I can pass; I don't need to do anything. 27 00:02:08,109 --> 00:02:12,746 If the values are not equal, then I put that number in the new list, and 28 00:02:12,746 --> 00:02:17,760 the result is a list with no duplicate numbers. 29 00:02:17,760 --> 00:02:20,630 You can check my teacher's notes to see my implementation code. 30 00:02:22,130 --> 00:02:24,510 That was quite the mental exercise. 31 00:02:24,510 --> 00:02:29,027 Fortunately, it's not as complicated with sets. 32 00:02:29,027 --> 00:02:32,256 Let's scroll back up to see our list of numbers again. 33 00:02:33,976 --> 00:02:39,185 This time, let's just pass that list into the set constructor function. 34 00:02:40,454 --> 00:02:42,461 And we'll print this out as well. 35 00:02:52,700 --> 00:02:57,997 So now I've printed out a set of the numbers. We can see one, 36 00:02:57,997 --> 00:03:02,170 two, and three; all of the duplicates are just gone. 37 00:03:03,550 --> 00:03:08,169 I can use the sorted function to convert it back into a list. 38 00:03:08,169 --> 00:03:13,096 So here, let's say 39 00:03:13,096 --> 00:03:18,140 numbers equals a set of numbers. 40 00:03:18,140 --> 00:03:22,201 First we convert our list into a set, 41 00:03:22,201 --> 00:03:26,393 then, with numbers equals sorted numbers, 42 00:03:26,393 --> 00:03:30,329 we convert our set back into a list. 43 00:03:30,329 --> 00:03:33,642 Now watch what happens when I print out numbers. 44 00:03:37,860 --> 00:03:41,250 I have a list of one, two, and three. 45 00:03:42,340 --> 00:03:43,393 It's in order. 46 00:03:43,393 --> 00:03:47,431 And it's all of the unique numbers; there are no duplicates here.
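The two approaches described so far can be sketched in Python roughly as follows. This is a minimal illustration, not the implementation from the teacher's notes: the sample list and the helper name `remove_duplicates_sorted` are assumptions made for this example, since the transcript does not show the actual values or code.

```python
# Hypothetical sample data: the exact list from the video is not in the transcript.
numbers = [3, 1, 2, 3, 1, 2, 2, 3, 1]


def remove_duplicates_sorted(values):
    """Remove duplicates without sets: sort first, then compare neighbours."""
    unique = []
    for value in sorted(values):
        # The first value has nothing to compare against, so it always goes in.
        # After that, a value is kept only if it differs from the last one kept.
        if not unique or value != unique[-1]:
            unique.append(value)
    return unique


manual_result = remove_duplicates_sorted(numbers)

# The set-based version: convert the list to a set, then back to a sorted list.
as_set = set(numbers)        # duplicates are dropped here
set_result = sorted(as_set)  # sorted() always returns a list

print(manual_result)  # [1, 2, 3]
print(set_result)     # [1, 2, 3]
```

The video reuses the name `numbers` for the set and then for the sorted list; separate names are used here only so each step stays visible.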
47 00:03:49,175 --> 00:03:54,820 These two functions, sorted and set, can also be chained together 48 00:03:54,820 --> 00:04:00,680 very easily to clean any list of duplicated data in one line of code. 49 00:04:00,680 --> 00:04:02,145 I'll show you what that looks like. 50 00:04:04,880 --> 00:04:09,008 So we can say unique numbers 51 00:04:09,008 --> 00:04:13,675 equals sorted set of numbers, 52 00:04:13,675 --> 00:04:18,708 and then we can print our unique numbers. 53 00:04:18,708 --> 00:04:24,492 And in the console, I can run my script again to get 1, 2, and 54 00:04:24,492 --> 00:04:29,948 3, my list of unique numbers, in a very Pythonic manner. 55 00:04:33,170 --> 00:04:39,570 So, we've just learned how to use the set constructor function to make an empty set 56 00:04:39,570 --> 00:04:41,911 and to make a set from an iterable, and 57 00:04:41,911 --> 00:04:46,828 we explored in more depth how using sets can simplify data processing 58 00:04:46,828 --> 00:04:51,250 when duplicates need to be removed from a collection. 59 00:04:51,250 --> 00:04:56,780 Now, we're ready to learn about set-specific methods and operations. 60 00:04:56,780 --> 00:05:00,410 In the next video, we will learn the basics of adding and 61 00:05:00,410 --> 00:05:02,430 removing elements from a set.
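The chained one-liner and the two uses of the set constructor mentioned in the recap might look like this. It is a small sketch under assumptions: the sample list is hypothetical, and only the name `unique_numbers` comes from the narration.

```python
# Hypothetical sample data, not the list used in the video.
numbers = [3, 1, 2, 3, 1, 2, 2, 3, 1]

# set() removes the duplicates; sorted() turns the result into an ordered list.
unique_numbers = sorted(set(numbers))
print(unique_numbers)  # [1, 2, 3]

# Calling set() with no arguments makes an empty set.
# Note that {} creates an empty dict, not an empty set.
empty = set()
print(len(empty))  # 0
```

This works for any list whose elements are hashable and can be compared with each other, such as integers or strings.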