Now that we have an implementation of merge sort on two data structures, how do they compare? Is one "better"?

0:00
First things first.

0:01
Let's test out our code.

0:03
Now, we'll keep it simple, because writing a robust verification

0:08
function would actually take up this entire video.

0:12
Instead, I'll leave that to you to try as homework.

0:16
At the very end let's create a new linked list.

0:20
Let's add a few nodes to this, so l.add.

0:23
I'm gonna copy paste this so

0:26
it makes it easier for me to not have to retype a bunch.

0:30
So I'll add 10, and then say 2, 44, 15, and something like 200.

0:37
Then we'll go ahead and print l so that we can inspect this list.

0:42
Next let's declare a variable here so we'll call this sorted_linked_list.

0:49
And to this we're going to assign the result of calling merge sort on l and

0:54
then we'll print this.

0:56
So sorted_linked_list.

0:58
Since we've taken care of all the logic, we know that this gets added in as nodes.

1:05
And then, let's see what this looks like.

1:08
Hit save and then bring up the console.

1:11
We're going to type out python linked_list_merge_sort.py and then hit Enter.

1:19
We see that linked list we first created.

1:21
Remember that what we add first eventually becomes the tail.

1:27
10 is the tail; 200 is the last one we added.

1:30
So 200 is the head because I'm calling add.

1:33
It simply adds each one to the head of the list.

1:36
So here we have all the values we added: 10, 2, 44, 15, and 200.

1:40
And then the sorted_linked_list sorts it out.

1:43
So it's 2, 10, 15, 44, and 200.

1:46
Look at that, a sorted linked list.
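
Put together, the test we just ran can be sketched end to end. This is a minimal stand-in for the course files: the Node and LinkedList classes here are simplified assumptions, not the course's actual linked_list.py.

```python
class Node:
    """A single node holding one value and a pointer to the next node."""
    def __init__(self, data):
        self.data = data
        self.next_node = None


class LinkedList:
    """Minimal singly linked list; add() prepends, so the last value added is the head."""
    def __init__(self):
        self.head = None

    def add(self, data):
        node = Node(data)
        node.next_node = self.head
        self.head = node

    def to_list(self):
        values, current = [], self.head
        while current:
            values.append(current.data)
            current = current.next_node
        return values


def split(linked_list):
    """Split a list of two or more nodes in half at index size // 2."""
    size = len(linked_list.to_list())
    mid = size // 2
    node = linked_list.head
    for _ in range(mid - 1):        # walk to the node at index mid - 1
        node = node.next_node
    left, right = LinkedList(), LinkedList()
    left.head = linked_list.head
    right.head = node.next_node
    node.next_node = None           # sever the link between the two halves
    return left, right


def merge(left, right):
    """Merge two sorted lists into one, using a fake head that is discarded."""
    merged = LinkedList()
    merged.head = Node(None)        # fake head
    current = merged.head
    l, r = left.head, right.head
    while l or r:
        if l is None:
            current.next_node, r = r, r.next_node
        elif r is None:
            current.next_node, l = l, l.next_node
        elif l.data <= r.data:
            current.next_node, l = l, l.next_node
        else:
            current.next_node, r = r, r.next_node
        current = current.next_node
    merged.head = merged.head.next_node  # discard the fake head
    return merged


def merge_sort(linked_list):
    """Recursively split, then merge; size 0 or 1 is the stopping condition."""
    if linked_list.head is None or linked_list.head.next_node is None:
        return linked_list
    left, right = split(linked_list)
    return merge(merge_sort(left), merge_sort(right))


l = LinkedList()
for value in [10, 2, 44, 15, 200]:
    l.add(value)
print(l.to_list())                   # [200, 15, 44, 2, 10] -- head first
sorted_linked_list = merge_sort(l)
print(sorted_linked_list.to_list())  # [2, 10, 15, 44, 200]
```

Because add prepends, the first print shows the values head first, with the last value added at the front; the second print shows the fully sorted result.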

1:49
Let's visualize this from the top.

1:52
We have a linked list containing 5 nodes, with integers 10, 2,

1:57
44, 15, and 200 as data, respectively.

2:01
Our merge sort function calls split on this list.

2:04
The split function calls size on the list and

2:07
gets back 5, which makes our midpoint 2.

2:09
Using this midpoint, we can split the list using the node at index method.

2:15
Remember that when doing this, we deduct one from the value of mid.

2:18
So we're going to split here using an index value of 1.

2:22
Effectively this is the same, since we're starting with an index value of 0,

2:27
an index value of 1 means we split after node 2.

2:29
We assign the entire list to left half, then create a new list and

2:33
assign that to right half.

2:35
We can assign node 3 at index value 2 as the head of the right list,

2:40
and remove the references between node 2 and node 3.

2:44
So far so good, right?
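
The split we just traced can be sketched on bare nodes. The helper names here (build, node_at_index, split_at_mid) are illustrative stand-ins, not the course's exact methods:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next_node = None


def build(values):
    """Chain nodes so that values[0] is the head."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next_node = head
        head = node
    return head


def node_at_index(head, index):
    """Walk the chain, counting up the index until we reach the node."""
    current = head
    for _ in range(index):
        current = current.next_node
    return current


def split_at_mid(head, size):
    """Split after the node at index mid - 1; return the heads of both halves."""
    mid = size // 2
    left_tail = node_at_index(head, mid - 1)
    right_head = left_tail.next_node
    left_tail.next_node = None      # remove the reference between the halves
    return head, right_head


def values_of(head):
    out = []
    while head:
        out.append(head.data)
        head = head.next_node
    return out


left, right = split_at_mid(build([10, 2, 44, 15, 200]), 5)
print(values_of(left))   # [10, 2]
print(values_of(right))  # [44, 15, 200]
```

With five nodes, mid is 2, so we split after the node at index 1 (node 2), exactly as described.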

2:46
Now we're back in the merge_sort function after having called split, and

2:50
we have two linked lists.

2:52
Let's focus on just the left half, because if you go back and

2:55
look at our code, we're going to call merge_sort on the left linked list again.

3:01
This means the next thing we'll do is run through that split process.

3:04
Since this is a linked list containing two nodes, this means that split

3:08
is going to return a new left and right list, each with one node.

3:12
Again, we're back in the merge sort function,

3:14
which means that we call merge sort on this left list again.

3:19
Since this is a single-node linked list, on calling merge_sort on it,

3:23
we immediately return before we split since we had that stopping condition.

3:28
So we go to the next line of code which is calling

3:31
merge sort on the right list as well.

3:33
But again, we'll get back immediately because we hit that stopping condition.

3:36
Now that we have a left and right that we get back from calling merge sort,

3:41
we can call merge on them.

3:42
Inside the merge function, we start by creating a new linked list and

3:47
attaching a fake head.

3:48
Then we evaluate whether the left or the right head is none.

3:52
Since neither condition is true,

3:54
we go to the final step where we evaluate the data in each node.

3:58
In this case, the data in the right node is less than the left node.

4:02
So we assign the right node as the next node in the merged linked list and

4:06
move the right head pointer forward.

4:08
In the merged linked list, we move our current pointer forward

4:12
to this new node we've added and that completes one iteration of the loop.

4:17
On the next iteration,

4:19
right head now points to none since that was a single node list.

4:23
And we can assign the rest of the left linked list which

4:26
is effectively the single node over to the merged linked list.

4:30
Here we discard the fake head, move the next node up to be the correct head and

4:36
return the newly merged, sorted linked list.

4:39
Remember that at this point, because right head and

4:42
left head both pointed to None, our while loop terminated.
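
The merge walk-through, fake head included, can be sketched on bare node chains. This is a simplified assumption of the course's merge, which operates on full LinkedList objects:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next_node = None


def merge(left, right):
    """Merge two sorted node chains into one, using a fake head."""
    fake_head = Node(None)          # placeholder; discarded before returning
    current = fake_head
    while left or right:
        if left is None:            # left exhausted: take from the right
            current.next_node = right
            right = right.next_node
        elif right is None:         # right exhausted: take from the left
            current.next_node = left
            left = left.next_node
        elif left.data <= right.data:
            current.next_node = left
            left = left.next_node
        else:
            current.next_node = right
            right = right.next_node
        current = current.next_node
    return fake_head.next_node      # discard the fake head


def chain(values):
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next_node = head
        head = node
    return head


def values(head):
    out = []
    while head:
        out.append(head.data)
        head = head.next_node
    return out


# Merging the two single-node lists from the walk-through:
print(values(merge(chain([10]), chain([2]))))  # [2, 10]
```

On the first iteration, 2 is less than 10, so the right node is attached and the right head moves to None; on the second, the exhausted-right branch attaches the remaining left node, and then the while loop terminates.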

4:45
So in this way, we recursively split and

4:47
repeatedly merge sublists until we're back with one sorted linked list.

4:52
The merge sort algorithm is a powerful sorting algorithm.

4:56
But ultimately, it doesn't really do anything complicated.

4:59
It just breaks the problem down until it's really simple to solve.

5:04
Remember the technique here which we've talked about before is called

5:07
divide and conquer.

5:08
So I like to think of merge sort in this way.

5:11
There's a teacher at the front of the room and

5:13
she has a bunch of books that she needs to sort into alphabetical order.

5:17
Instead of doing all the work herself, she splits that pile into two and

5:20
hands it to two students at the front.

5:23
Each of those students split it into two more, and

5:26
handed it to the four students seated behind them.

5:28
As each student does this, eventually a bunch of students each have two books

5:33
to compare.

5:34
And they can sort it very easily and hand it back to the student who gave it to them

5:38
in front of them, who repeats the process in reverse.

5:41
So ultimately, it's really simple work.

5:43
It's just efficiently delegated.

5:46
Now back to our implementation here.

5:49
Let's talk about run time.

5:50
So far other than the node swapping we had to do,

5:53
it seems like most of our implementation is the same, right?

5:57
In fact it is, including the problems that we run into in the list version as well.

6:01
So in the first implementation of merge sort,

6:04
we thought we had an algorithm that ran in big O of n log n, but it turns out we didn't.

6:10
Why?

6:11
The Python list slicing operation, if you remember,

6:14
actually takes some amount of time, amounting to big O of k.

6:18
A true implementation of merge_sort runs in quasilinear, or

6:22
log-linear, time: n times log n.

6:24
So we almost got there but we didn't.

6:28
Now in our implementation of merge_sort on a linked list,

6:31
we introduced the same problem.

6:33
So if we go back up to the split function, this is where it happens.

6:39
Now, swapping node references, that's a constant time operation.

6:43
No big deal.

6:44
Comparing values, also constant time.

6:47
The bottleneck here, like list slicing, is in splitting a linked list at the midpoint.

6:54
If we go back to our implementation, you can see here that we use the node at

6:59
index method, which finds the node we want by traversing the list.

7:03
This means that every split operation incurs a big O

7:07
of k cost, where k here is the midpoint of the list.

7:12
Effectively n over 2, because we have to walk down the list,

7:16
counting up the index until we get to that node.
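
A quick way to see that cost is to count the links the index walk follows. Nothing here is from the course code; it's a standalone demonstration:

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next_node = next_node


def build(n):
    """Build a chain of n nodes."""
    head = None
    for i in range(n):
        head = Node(i, head)
    return head


def node_at_index(head, index):
    """Return the node at index, counting every link we follow to get there."""
    steps, current = 0, head
    while steps < index:
        current = current.next_node
        steps += 1
    return current, steps


# Splitting a 1,000-node list at its midpoint walks mid - 1 = 499 links,
# so every split pays an O(k) traversal cost, with k roughly n / 2.
_, steps = node_at_index(build(1000), 1000 // 2 - 1)
print(steps)  # 499
```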

7:19
Given that, overall, the splits take logarithmic time, our split function,

7:25
just like the one we wrote earlier, incurs a cost of big O of k log n.

7:31
So here, we'll say: takes O(k log n).

7:36
Now the merge function, also like the one we wrote earlier, takes linear time.

7:39
So that one is good.

7:42
That one runs in the expected amount of time.

7:44
So here, we'll say it runs in linear time.

7:48
And that would bring our overall run time, so

7:51
up at the merge_sort function, we can say this runs in O(kn log n).

7:56
It's okay though, this is a good start.

8:03
And one day when we talk about constant factors and look at ways

8:06
we can reduce the cost of these operations using different strategies,

8:10
we can come back and reevaluate our code to improve our implementation.

8:15
For now, as long as you understand how merge sort works conceptually,

8:19
what the run time and space complexities look like, and

8:22
where the bottlenecks are in your code, that's plenty.

8:26
If you're interested in learning more about how we would solve this problem,

8:30
check out the teacher's notes attached to this video.

8:32
In the next video, let's wrap this course up.