1 00:00:00,000 --> 00:00:02,542 Our end goal is to sort a list of names. 2 00:00:02,542 --> 00:00:05,920 But comparing numbers is a little easier than comparing strings. 3 00:00:05,920 --> 00:00:09,238 So we're going to start by sorting a list of numbers. 4 00:00:09,238 --> 00:00:13,972 I'll show you how to modify our examples to sort strings at the end of the course. 5 00:00:13,972 --> 00:00:17,664 To help make clear the importance of choosing a good sorting algorithm, 6 00:00:17,664 --> 00:00:19,548 we're going to start with a bad one. 7 00:00:19,548 --> 00:00:21,587 It's called bogo_sort. 8 00:00:21,587 --> 00:00:26,437 Basically, bogo_sort just randomizes the order of the list repeatedly 9 00:00:26,437 --> 00:00:27,852 until it's sorted. 10 00:00:27,852 --> 00:00:31,769 Here's a Python code file where we're going to implement bogo_sort. 11 00:00:31,769 --> 00:00:34,319 It's not important to understand this code here at the top. 12 00:00:34,319 --> 00:00:37,990 Although, we'll have info on it in the teacher's notes, if you really want it. 13 00:00:37,990 --> 00:00:41,710 All you need to know is that it takes the name of a file that we pass on the command 14 00:00:41,710 --> 00:00:44,700 line, loads it, and returns a Python list. 15 00:00:44,700 --> 00:00:47,620 Which is just like an array in other languages, 16 00:00:47,620 --> 00:00:49,750 containing all the numbers that it read from the file. 17 00:00:50,920 --> 00:00:54,880 Let me have the program print out the list of numbers it loads so you can see it. 18 00:00:54,880 --> 00:00:59,621 We'll call the print method, and we'll pass it the list of numbers. 19 00:00:59,621 --> 00:01:05,146 Save that, let's run it real quick with, python bogo_sort.py. 20 00:01:07,127 --> 00:01:07,655 Whoops. 21 00:01:07,655 --> 00:01:12,521 And we need to provide it the name of the file here on the command line that we're 22 00:01:12,521 --> 00:01:13,344 gonna load. 23 00:01:13,344 --> 00:01:16,011 So it's in the numbers directory. 24 00:01:16,011 --> 00:01:22,011 A slash separates the directory name from the file name, 5.txt. 25 00:01:22,011 --> 00:01:25,427 And there's our list of numbers that was loaded from the file. 26 00:01:25,427 --> 00:01:29,198 Okay, let me delete that print statement and then we'll move on. 27 00:01:29,198 --> 00:01:33,658 bogo_sort just randomly rearranges the list of values over and over. 28 00:01:33,658 --> 00:01:38,340 So the first thing we're going to need is a function to detect when the list is 29 00:01:38,340 --> 00:01:38,932 sorted. 30 00:01:38,932 --> 00:01:43,487 We'll write an is_sorted function that takes a list of values as a parameter. 31 00:01:46,133 --> 00:01:50,210 It will return true if the list passed in as sorted or false if it isn't. 32 00:01:51,670 --> 00:01:55,020 We'll loop through the numeric index of each value in the list, 33 00:01:55,020 --> 00:01:57,590 from 0 to 1 less than the length of the list. 34 00:01:57,590 --> 00:02:00,710 Like many languages, Python list indexes begin at 0, 35 00:02:00,710 --> 00:02:06,350 so a list with a length of 5 has indexes going from 0 through 4. 36 00:02:06,350 --> 00:02:07,630 If the list is sorted, 37 00:02:07,630 --> 00:02:11,250 then every value in it will be less than the one that comes after it. 38 00:02:11,250 --> 00:02:16,360 So we test to see whether the current item is greater than the one that follows it. 39 00:02:16,360 --> 00:02:21,190 If it is, it means the whole list is not sorted so we can return false. 40 00:02:21,190 --> 00:02:22,270 If we get down here, 41 00:02:22,270 --> 00:02:26,038 it means the loop completed without finding any unsorted values. 42 00:02:26,038 --> 00:02:29,050 Python uses whitespace to mark code blocks, so 43 00:02:29,050 --> 00:02:32,330 unindenting the code like this marks the end of the loop. 44 00:02:32,330 --> 00:02:35,060 Since all the values are sorted, we can return True. 45 00:02:36,540 --> 00:02:40,211 Now we need to write the function that will actually do the so-called sorting. 46 00:02:40,211 --> 00:02:44,695 The bogo_sort function will also take a list of values it's working with as 47 00:02:44,695 --> 00:02:45,546 a parameter. 48 00:02:45,546 --> 00:02:49,627 We'll call our is_sorted function to test whether the list is sorted. 49 00:02:49,627 --> 00:02:53,016 We'll keep looping until is_sorted returns true. 50 00:02:53,016 --> 00:02:57,368 Python has a ready-made function that randomizes the order of elements in 51 00:02:57,368 --> 00:02:57,997 the list. 52 00:02:57,997 --> 00:03:01,600 Since the list isn't sorted, we'll call that function here. 53 00:03:01,600 --> 00:03:04,750 And since this is inside the loop, it'll be randomized over and 54 00:03:04,750 --> 00:03:07,400 over until our is_sorted function returns True. 55 00:03:08,420 --> 00:03:12,885 If the loop exits, it means is_sorted returned True and the list is sorted. 56 00:03:12,885 --> 00:03:15,990 So we can now return the sorted list. 57 00:03:15,990 --> 00:03:18,837 Finally, we need to call our bogo_sort function, 58 00:03:18,837 --> 00:03:22,990 pass it the list we loaded from the file and print the sorted list it returns. 59 00:03:24,420 --> 00:03:26,610 Okay, let's save this and try running it. 60 00:03:26,610 --> 00:03:31,401 Do so with, python, the name of the script, bogus_sort.py, and 61 00:03:31,401 --> 00:03:36,785 the name of the file we're going to run it on, numbers directory, 5.txt. 62 00:03:39,238 --> 00:03:42,490 It looks like it's sorting our list successfully. 63 00:03:42,490 --> 00:03:44,320 But how efficient is this? 64 00:03:44,320 --> 00:03:48,360 Let's add some code to track the number of times it attempts to sort the list. 65 00:03:48,360 --> 00:03:50,420 Up here at the top of the bogo_sort function, 66 00:03:50,420 --> 00:03:53,530 we'll add a variable to track the number of attempts it's made. 67 00:03:53,530 --> 00:03:56,400 We'll name it attempts and we'll set its initial value to 0, 68 00:03:56,400 --> 00:03:58,310 since we haven't made any attempts yet. 69 00:04:00,040 --> 00:04:03,420 With each pass through the loop, we'll print current number of attempts. 70 00:04:05,030 --> 00:04:08,773 And then here at the loop, after attempting to shuffle the values, 71 00:04:08,773 --> 00:04:11,149 we'll add one to the account of attempts. 72 00:04:14,129 --> 00:04:18,717 Let's save this and let's try running it again a couple times. 73 00:04:18,717 --> 00:04:19,677 In the console, 74 00:04:19,677 --> 00:04:24,030 I can just press the up arrow to bring up the previous commands and re-run. 75 00:04:24,030 --> 00:04:29,168 So it looks like this first run to sort this five element list took 363 attempts. 76 00:04:29,168 --> 00:04:30,070 Let's try it again. 77 00:04:31,560 --> 00:04:34,210 This time it only took 91 attempts. 78 00:04:34,210 --> 00:04:37,335 We're simply randomizing the list with each attempt so 79 00:04:37,335 --> 00:04:42,220 each run of the program takes a random number of attempts. 80 00:04:42,220 --> 00:04:46,878 Now let's try the same program with a larger number of items, python 81 00:04:46,878 --> 00:04:51,550 bogo_sort numbers. 82 00:04:51,550 --> 00:04:54,892 I have a list of 8 items set up here in this other file. 83 00:04:57,339 --> 00:05:01,703 This time it takes 11,000 attempts. 84 00:05:01,703 --> 00:05:03,797 Only 487 this time. 85 00:05:06,252 --> 00:05:07,918 And this time 13,000. 86 00:05:07,918 --> 00:05:11,550 You can see that the trend is increasing steadily. 87 00:05:11,550 --> 00:05:14,710 The problem with bogo_sort is that it doesn't make any progress 88 00:05:14,710 --> 00:05:16,610 toward a solution with each pass. 89 00:05:16,610 --> 00:05:20,162 It could generate a list where just one value was out of order, but 90 00:05:20,162 --> 00:05:23,972 then on the next attempt it could generate a list where all the elements 91 00:05:23,972 --> 00:05:25,276 are out of order again. 92 00:05:25,276 --> 00:05:28,281 Stumbling on a solution is literally a matter of luck. 93 00:05:28,281 --> 00:05:32,389 And for lists with more than a few items, it might never happen. 94 00:05:32,389 --> 00:05:35,220 Up next, we'll look at selection_sort. 95 00:05:35,220 --> 00:05:38,985 It's a sorting algorithm that's still slow, but it's better than bogo_sort.