1 00:00:00,490 --> 00:00:04,400 The next step in the merge sort algorithm is the divide step, or 2 00:00:04,400 --> 00:00:07,450 rather an implementation of the split function. 3 00:00:07,450 --> 00:00:10,560 Down here we'll call this split like before and 4 00:00:10,560 --> 00:00:15,450 this is going to take a linked_list. 5 00:00:15,450 --> 00:00:20,575 Documenting things is good and we've been doing it so far, so lets add a doc strong. 6 00:00:20,575 --> 00:00:28,422 Divide the unsorted list at midpoint into sublists. 7 00:00:28,422 --> 00:00:32,846 Now, of course, when I say sublists here, I mean sub-linked lists, but 8 00:00:32,846 --> 00:00:34,472 that's a long word to say. 9 00:00:34,472 --> 00:00:38,390 Now, here's where things start to deviate from the previous version. 10 00:00:38,390 --> 00:00:42,560 With the list type, we could rely on the fact that finding the midpoint 11 00:00:42,560 --> 00:00:47,110 using an index and list splicing to split it into two lists would work, 12 00:00:47,110 --> 00:00:49,860 even if an empty list was passed in. 13 00:00:49,860 --> 00:00:53,721 Since we have no automatic behavior like that, we need to account for 14 00:00:53,721 --> 00:00:55,463 this when using a linked list. 15 00:00:55,463 --> 00:01:01,182 So our first condition, is if the linked list is none, or if it's empty, 16 00:01:01,182 --> 00:01:06,715 that is, if head is equal to none, we'll say if linked list == None, 17 00:01:06,715 --> 00:01:13,026 or you can write is there, doesn't matter, or linked_list.head == None. 18 00:01:13,026 --> 00:01:15,648 Well, linked list can be none, for example, 19 00:01:15,648 --> 00:01:18,952 if we call split on a linked list containing a single node. 20 00:01:18,952 --> 00:01:22,912 A split on such a list would mean left would contain the single node 21 00:01:22,912 --> 00:01:25,250 while right would be none. 22 00:01:25,250 --> 00:01:29,838 In either case, we're going to assign the entire list to the left half and 23 00:01:29,838 --> 00:01:31,479 assign none to the right. 24 00:01:31,479 --> 00:01:37,256 So I'll say left_half = linked_list and 25 00:01:37,256 --> 00:01:40,860 then right_half = none. 26 00:01:40,860 --> 00:01:45,570 You could also assign the single element list or none to left and then create 27 00:01:45,570 --> 00:01:50,990 a new empty linked list assigned to the right half, but that's unnecessary work. 28 00:01:50,990 --> 00:01:57,510 So now that we've done this, we can return left_half and right_half. 29 00:01:57,510 --> 00:01:59,150 So that's our first condition. 30 00:01:59,150 --> 00:02:03,880 Let's add an else clause to account for non-empty linked lists. 31 00:02:03,880 --> 00:02:06,500 First, we'll calculate the size of the list. 32 00:02:06,500 --> 00:02:09,320 This is easy because we've done the work already, and 33 00:02:09,320 --> 00:02:12,100 we can just call the size method that we've defined. 34 00:02:12,100 --> 00:02:17,129 We'll say size = linked_list.size. 35 00:02:17,129 --> 00:02:19,643 Using the size we can determine the midpoint. 36 00:02:19,643 --> 00:02:25,209 So mid = size, and we'll use that floor division operator to divide it by 2. 37 00:02:25,209 --> 00:02:29,915 Once we have the midpoint, we need to get the node at that midpoint. 38 00:02:29,915 --> 00:02:32,902 Now, make sure you hit Cmd+S to save here. 39 00:02:32,902 --> 00:02:37,221 And we're going to navigate back to linked_list.py. 40 00:02:37,221 --> 00:02:40,179 In here, we're going to add a convenience 41 00:02:40,179 --> 00:02:44,957 method at the very bottom right before the repr function right here. 42 00:02:44,957 --> 00:02:49,671 And this convenience method is going to return a node at a given index. 43 00:02:49,671 --> 00:02:54,028 So we'll call this node_at_index, and 44 00:02:54,028 --> 00:02:57,663 it's going to take an index value. 45 00:02:57,663 --> 00:03:02,132 This way, instead of having to traverse a list inside of our split function, 46 00:03:02,132 --> 00:03:06,464 we can simply call node at index and pass it the midpoint index we calculated 47 00:03:06,464 --> 00:03:09,873 to give us the node right there so we can perform the split. 48 00:03:09,873 --> 00:03:10,498 Okay, so 49 00:03:10,498 --> 00:03:15,209 this method accepts as an argument the index we want to get the node for. 50 00:03:15,209 --> 00:03:18,536 If this index is 0 then we'll return the head of the list. 51 00:03:18,536 --> 00:03:24,878 So if index == 0 return self.head. 52 00:03:24,878 --> 00:03:29,642 The rest of the implementation involves traversing the linked list, 53 00:03:29,642 --> 00:03:33,043 and counting up to the index as we visit each node. 54 00:03:33,043 --> 00:03:37,840 The rest of the implementation involves traversing the link list and 55 00:03:37,840 --> 00:03:41,073 counting up to the index as we visit each node. 56 00:03:41,073 --> 00:03:44,419 So I'll add an else clause here, and we'll start at the head. 57 00:03:44,419 --> 00:03:47,893 So we'll say current = self.head. 58 00:03:47,893 --> 00:03:52,196 Let's also declare a variable called position to indicate 59 00:03:52,196 --> 00:03:54,690 where we are in the list. 60 00:03:54,690 --> 00:03:57,810 We can use a while loop to walk down the list. 61 00:03:57,810 --> 00:04:02,770 Our condition here is as long as the position is less than the index value. 62 00:04:02,770 --> 00:04:09,790 So I'll say while position is less than index. 63 00:04:09,790 --> 00:04:13,560 Inside the loop, we'll assign the next node to current and 64 00:04:13,560 --> 00:04:15,523 increment the value of position by 1. 65 00:04:15,523 --> 00:04:24,726 So current = current.next_node position += 1. 66 00:04:24,726 --> 00:04:28,490 Once the position value equals the index value, 67 00:04:28,490 --> 00:04:32,180 current refers to the node we're looking for, and we can return it. 68 00:04:32,180 --> 00:04:34,150 We'll say return current. 69 00:04:36,460 --> 00:04:38,730 Let's get rid of all this empty space. 70 00:04:38,730 --> 00:04:39,572 There we go. 71 00:04:39,572 --> 00:04:46,035 Now, back in linked_list merge_sort.py, we can use this method to get at the node 72 00:04:46,035 --> 00:04:52,168 after we've calculated the midpoint to get the node at the midpoint of the list. 73 00:04:52,168 --> 00:04:58,426 So we'll say mid_node = linked_list.node_at_index, 74 00:04:58,426 --> 00:05:03,861 and here I'm gonna do something slightly confusing. 75 00:05:03,861 --> 00:05:05,964 I'm gonna do mid-1. 76 00:05:05,964 --> 00:05:13,314 Remember, we're subtracting 1 here because we used size to calculate the midpoint. 77 00:05:13,314 --> 00:05:15,379 And like the lend function, 78 00:05:15,379 --> 00:05:20,472 size will always return a value greater than the maximum index value. 79 00:05:20,472 --> 00:05:23,155 So think of a linked list with two nodes. 80 00:05:23,155 --> 00:05:25,242 Size would return 2. 81 00:05:25,242 --> 00:05:28,509 The midpoint, though, and the way we're calculating the index, 82 00:05:28,509 --> 00:05:32,131 we always start at 0, which means size is going to be 1 greater than that. 83 00:05:32,131 --> 00:05:35,836 So we're going to deduct 1 from it to get the value we want. 84 00:05:35,836 --> 00:05:38,237 But we're using the floor division operator, so 85 00:05:38,237 --> 00:05:40,877 it's going to round that down even more, no big deal. 86 00:05:40,877 --> 00:05:44,309 With the node at the midpoint, now that we have this midnode, 87 00:05:44,309 --> 00:05:46,097 we can actually split the list. 88 00:05:46,097 --> 00:05:50,685 So first, we're going to assign the entire linked list to 89 00:05:50,685 --> 00:05:55,755 a variable named left_half, so left_half = linked_list. 90 00:05:55,755 --> 00:05:58,812 This seems counterintuitive, but will make sense in a second. 91 00:05:58,812 --> 00:06:04,857 For the right_half we're going to assign a new instance of linked_list, 92 00:06:04,857 --> 00:06:08,200 so right_half = linked_list. 93 00:06:08,200 --> 00:06:10,410 This newly created list is empty, but 94 00:06:10,410 --> 00:06:15,110 we can fix that by assigning the node that comes after the midpoint. 95 00:06:15,110 --> 00:06:19,300 So after the midpoint of the original link list, we can assign the node that comes 96 00:06:19,300 --> 00:06:24,880 after that midpoint node as the head of this newly created right linked list. 97 00:06:24,880 --> 00:06:32,980 So here we'll say right_half.head = mid_node.next_node. 98 00:06:32,980 --> 00:06:37,240 Once we do that, we can assign none to the next node property 99 00:06:37,240 --> 00:06:40,860 on mid_node to effectively sever that connection, and 100 00:06:40,860 --> 00:06:46,480 make what was the mid node now the tail node of the left linked list. 101 00:06:46,480 --> 00:06:50,850 So I'll say mid_node.next_node = None. 102 00:06:51,940 --> 00:06:56,380 If that's confusing, here's a quick visualization of what just happened. 103 00:06:56,380 --> 00:07:00,410 We start off with a single linked list and find the midpoint. 104 00:07:00,410 --> 00:07:04,287 The node that comes after the node at midpoint is assigned to the head of 105 00:07:04,287 --> 00:07:08,625 a newly created linked list, and the connection between the midpoint node and 106 00:07:08,625 --> 00:07:10,780 the one after is removed. 107 00:07:10,780 --> 00:07:15,760 We now have two distinct linked lists, split at the midpoint. 108 00:07:15,760 --> 00:07:19,472 And with that, we can return the two sub-lists. 109 00:07:19,472 --> 00:07:22,720 So we'll return left_half and right_half. 110 00:07:22,720 --> 00:07:25,260 In the next video, let's tackle our merge function.