Now that we have an implementation of merge sort on two data structures, how do they compare? Is one "better"?

0:00
First things first.

0:01
Let's test out our code.

0:03
Now, we'll keep it simple, because writing a robust verification

0:08
function would actually take up this entire video.

0:12
Instead, I'll leave that to you to try as homework.

0:16
At the very end let's create a new linked list.

0:20
Let's add a few nodes to this, so l.add.

0:23
I'm gonna copy paste this so

0:26
it makes it easier for me to not have to retype a bunch.

0:30
So I'll add 10, and then say 2, 44, 15, and something like 200.

0:37
Then we'll go ahead and print l so that we can inspect this list.

0:42
Next let's declare a variable here so we'll call this sorted_linked_list.

0:49
And to this we're going to assign the result of calling merge sort on l and

0:54
then we'll print this.

0:56
So sorted_linked_list.

0:58
Since we've taken care of all the logic, we know that this gets added in as nodes.

1:05
And then, let's see what this looks like.

1:08
Hit save and then bring up the console.

1:11
We're going to type out python linked_list_merge_sort.py and then hit Enter.

1:19
We see that linked list we first created.

1:21
Remember that what we add first eventually becomes the tail.

1:27
10 is the tail; 200 is the last one we added.

1:30
So 200 is the head because I'm calling add.

1:33
It simply adds each one to the head of the list.

1:36
So here we have all the values we added: 10, 2, 44, 15, and 200.

1:40
And then the sorted_linked_list sorts it out.

1:43
So it's 2, 10, 15, 44, and 200.

1:46
Look at that, a sorted linked list.
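
Put together, the test we just ran can be sketched end to end. This is a minimal stand-in for the course files: the Node and LinkedList classes here are simplified assumptions, not the course's actual linked_list.py.

```python
class Node:
    """A single node holding one value and a pointer to the next node."""
    def __init__(self, data):
        self.data = data
        self.next_node = None


class LinkedList:
    """Minimal singly linked list; add() prepends, so the last value added is the head."""
    def __init__(self):
        self.head = None

    def add(self, data):
        node = Node(data)
        node.next_node = self.head
        self.head = node

    def to_list(self):
        values, current = [], self.head
        while current:
            values.append(current.data)
            current = current.next_node
        return values


def split(linked_list):
    """Split a list of two or more nodes in half at index size // 2."""
    size = len(linked_list.to_list())
    mid = size // 2
    node = linked_list.head
    for _ in range(mid - 1):        # walk to the node at index mid - 1
        node = node.next_node
    left, right = LinkedList(), LinkedList()
    left.head = linked_list.head
    right.head = node.next_node
    node.next_node = None           # sever the link between the two halves
    return left, right


def merge(left, right):
    """Merge two sorted lists into one, using a fake head that is discarded."""
    merged = LinkedList()
    merged.head = Node(None)        # fake head
    current = merged.head
    l, r = left.head, right.head
    while l or r:
        if l is None:
            current.next_node, r = r, r.next_node
        elif r is None:
            current.next_node, l = l, l.next_node
        elif l.data <= r.data:
            current.next_node, l = l, l.next_node
        else:
            current.next_node, r = r, r.next_node
        current = current.next_node
    merged.head = merged.head.next_node  # discard the fake head
    return merged


def merge_sort(linked_list):
    """Recursively split, then merge; size 0 or 1 is the stopping condition."""
    if linked_list.head is None or linked_list.head.next_node is None:
        return linked_list
    left, right = split(linked_list)
    return merge(merge_sort(left), merge_sort(right))


l = LinkedList()
for value in [10, 2, 44, 15, 200]:
    l.add(value)
print(l.to_list())                   # [200, 15, 44, 2, 10] -- head first
sorted_linked_list = merge_sort(l)
print(sorted_linked_list.to_list())  # [2, 10, 15, 44, 200]
```

Because add prepends, the first print shows the values head first, with the last value added at the front; the second print shows the fully sorted result.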

1:49
Let's visualize this from the top.

1:52
We have a linked list containing 5 nodes, with integers 10, 2,

1:57
44, 15, and 200 as data, respectively.

2:01
Our merge sort function calls split on this list.

2:04
The split function calls size on the list and

2:07
gets back 5, which makes our midpoint 2.

2:09
Using this midpoint, we can split the list using the node at index method.

2:15
Remember that when doing this, we deduct one from the value of mid.

2:18
So we're going to split here using an index value of 1.

2:22
Effectively this is the same, since we're starting with an index value of 0,

2:27
an index value of 1 means we split after node 2.

2:29
We assign the entire list to left half, then create a new list and

2:33
assign that to right half.

2:35
We can assign node 3 at index value 2 as the head of the right list,

2:40
and remove the references between node 2 and node 3.

2:44
So far so good, right?
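
The split we just traced can be sketched on bare nodes. The helper names here (build, node_at_index, split_at_mid) are illustrative stand-ins, not the course's exact methods:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next_node = None


def build(values):
    """Chain nodes so that values[0] is the head."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next_node = head
        head = node
    return head


def node_at_index(head, index):
    """Walk the chain, counting up the index until we reach the node."""
    current = head
    for _ in range(index):
        current = current.next_node
    return current


def split_at_mid(head, size):
    """Split after the node at index mid - 1; return the heads of both halves."""
    mid = size // 2
    left_tail = node_at_index(head, mid - 1)
    right_head = left_tail.next_node
    left_tail.next_node = None      # remove the reference between the halves
    return head, right_head


def values_of(head):
    out = []
    while head:
        out.append(head.data)
        head = head.next_node
    return out


left, right = split_at_mid(build([10, 2, 44, 15, 200]), 5)
print(values_of(left))   # [10, 2]
print(values_of(right))  # [44, 15, 200]
```

With five nodes, mid is 2, so we split after the node at index 1 (node 2), exactly as described.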

2:46
Now we're back in the merge_sort function after having called split, and

2:50
we have two linked lists.

2:52
Let's focus on just the left half, because if you go back and

2:55
look at our code, we're going to call merge_sort on the left linked list again.

3:01
This means the next thing we'll do is run through that split process.

3:04
Since this is a linked list containing two nodes, this means that split

3:08
is going to return a new left and right list, each with one node.

3:12
Again, we're back in the merge sort function,

3:14
which means that we call merge sort on this left list again.

3:19
Since this is a single-node linked list, on calling merge_sort on it,

3:23
we immediately return before we split since we had that stopping condition.

3:28
So we go to the next line of code which is calling

3:31
merge sort on the right list as well.

3:33
But again, we'll get back immediately because we hit that stopping condition.

3:36
Now that we have a left and right that we get back from calling merge sort,

3:41
we can call merge on them.

3:42
Inside the merge function, we start by creating a new linked list and

3:47
attaching a fake head.

3:48
Then we evaluate whether the left or the right head is none.

3:52
Since neither condition is true,

3:54
we go to the final step where we evaluate the data in each node.

3:58
In this case, the data in the right node is less than the left node.

4:02
So we assign the right node as the next node in the merged linked list and

4:06
move the right head pointer forward.

4:08
In the merged linked list, we move our current pointer forward

4:12
to this new node we've added and that completes one iteration of the loop.

4:17
On the next iteration,

4:19
right head now points to none since that was a single node list.

4:23
And we can assign the rest of the left linked list which

4:26
is effectively the single node over to the merged linked list.

4:30
Here we discard the fake head, move the next node up to be the correct head and

4:36
return the newly merged, sorted linked list.

4:39
Remember that at this point, because right head and

4:42
left head both pointed to None, our while loop terminated.
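
The merge walk-through, fake head included, can be sketched on bare node chains. This is a simplified assumption of the course's merge, which operates on full LinkedList objects:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next_node = None


def merge(left, right):
    """Merge two sorted node chains into one, using a fake head."""
    fake_head = Node(None)          # placeholder; discarded before returning
    current = fake_head
    while left or right:
        if left is None:            # left exhausted: take from the right
            current.next_node = right
            right = right.next_node
        elif right is None:         # right exhausted: take from the left
            current.next_node = left
            left = left.next_node
        elif left.data <= right.data:
            current.next_node = left
            left = left.next_node
        else:
            current.next_node = right
            right = right.next_node
        current = current.next_node
    return fake_head.next_node      # discard the fake head


def chain(values):
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next_node = head
        head = node
    return head


def values(head):
    out = []
    while head:
        out.append(head.data)
        head = head.next_node
    return out


# Merging the two single-node lists from the walk-through:
print(values(merge(chain([10]), chain([2]))))  # [2, 10]
```

On the first iteration, 2 is less than 10, so the right node is attached and the right head moves to None; on the second, the exhausted-right branch attaches the remaining left node, and then the while loop terminates.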

4:45
So in this way, we recursively split and

4:47
repeatedly merge sublists until we're back with one sorted linked list.

4:52
The merge sort algorithm is a powerful sorting algorithm.

4:56
But ultimately, it doesn't really do anything complicated.

4:59
It just breaks the problem down until it's really simple to solve.

5:04
Remember the technique here which we've talked about before is called

5:07
divide and conquer.

5:08
So I like to think of merge sort in this way.

5:11
There's a teacher at the front of the room and

5:13
she has a bunch of books that she needs to sort into alphabetical order.

5:17
Instead of doing all the work herself, she splits that pile into two and

5:20
hands it to two students at the front.

5:23
Each of those students split it into two more, and

5:26
handed it to the four students seated behind them.

5:28
As each student does this, eventually a bunch of students each have two books

5:33
to compare.

5:34
And they can sort it very easily and hand it back to the student who gave it to them

5:38
in front of them, who repeats the process in reverse.

5:41
So ultimately, it's really simple work.

5:43
It's just efficiently delegated.

5:46
Now back to our implementation here.

5:49
Let's talk about run time.

5:50
So far other than the node swapping we had to do,

5:53
it seems like most of our implementation is the same, right?

5:57
In fact it is, including the problems that we run into in the list version as well.

6:01
So in the first implementation of merge sort,

6:04
we thought we had an algorithm that ran in big O of n log n, but it turns out we didn't.

6:10
Why?

6:11
The Python list slicing operation, if you remember,

6:14
actually takes some amount of time, amounting to big O of k.

6:18
A true implementation of merge_sort runs in quasilinear, or

6:22
log-linear, time: n times log n.

6:24
So we almost got there but we didn't.

6:28
Now in our implementation of merge_sort on a linked list,

6:31
we introduced the same problem.

6:33
So if we go back up to the split function, this is where it happens.

6:39
Now, swapping node references, that's a constant time operation.

6:43
No big deal.

6:44
Comparing values, also constant time.

6:47
The bottleneck here, like list slicing, is in splitting a linked list at the midpoint.

6:54
If we go back to our implementation, you can see here that we use the node at

6:59
index method, which finds the node we want by traversing the list.

7:03
This means that every split operation incurs a big O

7:07
of k cost, where k here is the midpoint of the list.

7:12
Effectively n over 2, because we have to walk down the list,

7:16
counting up the index until we get to that node.
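
A quick way to see that cost is to count the links the index walk follows. Nothing here is from the course code; it's a standalone demonstration:

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next_node = next_node


def build(n):
    """Build a chain of n nodes."""
    head = None
    for i in range(n):
        head = Node(i, head)
    return head


def node_at_index(head, index):
    """Return the node at index, counting every link we follow to get there."""
    steps, current = 0, head
    while steps < index:
        current = current.next_node
        steps += 1
    return current, steps


# Splitting a 1,000-node list at its midpoint walks mid - 1 = 499 links,
# so every split pays an O(k) traversal cost, with k roughly n / 2.
_, steps = node_at_index(build(1000), 1000 // 2 - 1)
print(steps)  # 499
```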

7:19
Given that, overall, the splits take logarithmic time, our split function,

7:25
just like the one we wrote earlier, incurs a cost of big O of k log n.

7:31
So here, we'll say: takes O(k log n).

7:36
Now the merge function, also like the one we wrote earlier, takes linear time.

7:39
So that one is good.

7:42
That one runs in the expected amount of time.

7:44
So here, we'll say it runs in linear time.

7:48
And that would bring our overall run time, so

7:51
up at the merge_sort function, we can say this runs in O(kn log n).

7:56
It's okay though, this is a good start.

8:03
And one day when we talk about constant factors and look at ways

8:06
we can reduce the cost of these operations using different strategies,

8:10
we can come back and reevaluate our code to improve our implementation.

8:15
For now, as long as you understand how merge sort works conceptually,

8:19
what the run time and space complexities look like, and

8:22
where the bottlenecks are in your code, that's plenty.

8:26
If you're interested in learning more about how we would solve this problem,

8:30
check out the teacher's notes attached to this video.

8:32
In the next video, let's wrap this course up.