Glossary
- Time Complexity - A measure of the running time of an algorithm
- Space Complexity - A measure of the storage taken up by an algorithm
Now that we know how to generally define an algorithm, let's talk about what it means to have a good algorithm. An important thing to keep in mind is that there's no single way to measure whether an algorithm is the right solution, because it is all about context.

Earlier we touched on two concepts, correctness and efficiency. Let's define correctness more clearly, because before we can evaluate an algorithm on efficiency, we need to ensure its correctness. Before we define our algorithms, we start by defining our problem. In the definition of that problem, we have a clearly defined input satisfying any preconditions, and a clearly defined output. An algorithm is deemed correct if, on every run of the algorithm against all possible values in the input data, we always get the output we expect. Part of correctness means that for any possible input, the algorithm should always terminate, or end. If these two are not true, then our algorithm isn't correct.

If you were to pick up an algorithms textbook and look up correctness, you will run into a bunch of mathematical theory. This is because traditionally, algorithm correctness is proved by mathematical induction, which is a form of reasoning used in mathematics to verify that a statement is correct. This approach involves writing what is called a specification and a correctness proof. We won't be going into that in this course. Proof through induction is an important part of designing algorithms, but we're confident that you can understand algorithms, both in terms of how and when to use them, without getting into the math. So if you pick up a textbook and feel daunted, don't worry, I do, too. But we can still figure things out without it.

All right, so once we have a correct algorithm, we can start to talk about how efficient an algorithm is.
Remember that efficiency ultimately matters because it helps us solve problems faster and deliver a better end-user experience in a variety of fields. For example, algorithms are used in the sequencing of DNA, and more efficient sequencing algorithms allow us to research and understand diseases better and faster.

But let's not get ahead of ourselves. We'll start simple, by evaluating John's linear search algorithm in terms of its efficiency. First, what do we mean by efficiency? There are two measures of efficiency when it comes to algorithms: time and space. Sounds really cool and very sci-fi, right?

Efficiency measured by time, something you'll hear called time complexity, is a measure of how long it takes the algorithm to run. Time complexity can be understood generally, outside the context of code and computers, because how long it takes to complete a job is a universal measure of efficiency: the less time you take, the more efficient you are.

The second measure of efficiency is called space complexity, and this is pretty computer-specific. It deals with the amount of memory taken up on the computer. Good algorithms need to balance between these two measures to be useful. For example, you can have a blazingly fast algorithm, but it might not matter if the algorithm consumes more memory than you have available.

Both of these concepts, time and space complexity, are measured using the same metric, but it is a very technical-sounding metric, so let's build up to it slowly and start simple. A few videos ago, I played a game with Britney and John where they tried to guess the number I was thinking of. Effectively, they were searching for a value. So how do we figure out how efficient each algorithm is, and which algorithm was more suited to our purposes?
If we consider the number of tries they took to guess, or search for, the value as an indicator of the time they take to run through the exercise, this is a good indicator of how long the algorithm runs for a given set of values. This measurement is called the running time of an algorithm, and we'll use it to define time complexity.

In the game, we played four rounds. Let's recap those here, focusing on John's performance. In round 1, we had 10 values, the target was 3, and John took 3 turns. In round 2, we had 10 values, the target was 10, and John took 10 turns. In round 3, we had 100 values, the target was 5, and John took 5 tries. And finally, in round 4, when the target was 100, given 100 values, John took 100 tries.

On paper, it's hard to gauge anything about this performance. When it comes to anything with numbers, though, I like to put it up on a graph and compare visually. On the vertical, or y, axis, let's measure the number of tries it took John to guess the answer, or the running time of the algorithm. On the horizontal, or x, axis, what do we put? For each turn, we have a number of values as well as a target value. We could plot the target value on the horizontal axis, but that leaves some context and meaning behind. It's far more impressive that John took 5 tries when the range went up to 100 than when he took 3 tries for a maximum of 10 values. We could plot the maximum range of values, but then we're leaving out the other half of the picture.

There are data points, however, that satisfy both requirements. If we only plot the values where the target, the number John was looking for, was the same as the maximum range of values, we have a data point that includes both the size of the dataset as well as his effort. There's an additional benefit to this approach as well.
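The guessing game above can be sketched in code. This is a minimal, hypothetical version of John's strategy (guessing 1, 2, 3, and so on, in order), where the number of guesses stands in for running time; the function name and setup are illustrative, not from the course itself.

```python
def linear_search_tries(values, target):
    """Count the guesses John's strategy needs: check each value
    in order until the target turns up."""
    for tries, value in enumerate(values, start=1):
        if value == target:
            return tries
    return None  # target not in the range

# The four rounds from the game:
print(linear_search_tries(range(1, 11), 3))     # round 1 -> 3 tries
print(linear_search_tries(range(1, 11), 10))    # round 2 -> 10 tries
print(linear_search_tries(range(1, 101), 5))    # round 3 -> 5 tries
print(linear_search_tries(range(1, 101), 100))  # round 4 -> 100 tries
```

Notice that the number of tries equals the position of the target, which is why a target at the end of the range is so expensive for this strategy.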
There are three ways we can measure how well John does, or, in general, how well any algorithm does. First, we can check how well John does in the best case, or good scenarios from the perspective of his strategy. In the range of 100 values, the answer being a low number, like 3 at the start of the range, is a good scenario; he can guess it fairly quickly. One is his best-case scenario.

Or we could check how well he does on average. We could run this game a bunch of times and average out the running time. This would give us a much better picture of John's performance over time, but our estimates would be too high if the value he was searching for was at the start of the range, or far too low if it was at the end of the range. Let's imagine a scenario where Facebook naively implements linear search when finding friends. They looked at the latest US census, saw that 50% of names start with the letters A through J, which is the first 40% of the alphabet, and thought, okay, on average, linear search serves us well. But what about the rest of those whose names start with a letter after J in the alphabet? Searching for my name would take longer than the average, and much longer for someone whose name starts with the letter Z. So while measuring the running time of an algorithm on average might seem like a good strategy, it won't necessarily provide an accurate picture.

By picking the maximum in the range, we're measuring how our algorithm does in the worst-case scenario. Analyzing the worst-case scenario is quite useful because it indicates that the algorithm will never perform worse than we expect; there's no room for surprises. Back to our graph: we're going to plot the number of tries, a proxy for the running time of the algorithm, against the number of values in the range, which we'll shorten to n.
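The three measurements just described, best case, average case, and worst case, can be checked with a small sketch. This is an assumed setup, not from the course: for every possible target in a range of n values, count the tries linear search needs, then take the minimum, mean, and maximum.

```python
def search_tries(n, target):
    # Linear search over the range 1..n, counting comparisons
    # until the target is found.
    for tries in range(1, n + 1):
        if tries == target:
            return tries

n = 100
all_tries = [search_tries(n, t) for t in range(1, n + 1)]

best = min(all_tries)         # 1: the target is the first value checked
worst = max(all_tries)        # n: the target is the last value checked
average = sum(all_tries) / n  # (n + 1) / 2: halfway, which over- or
                              # under-estimates targets near either end
print(best, average, worst)   # 1 50.5 100
```

The average of 50.5 illustrates the Facebook point above: it looks reasonable overall, yet it badly misrepresents a search for a target near position 100.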
n here also represents John's worst-case scenario. When n is 10, he takes 10 turns. When n is 100, he takes 100 turns. But these two values alone are insufficient to really get any sort of visual understanding. Moreover, it's not realistic. John may take a long time to work through 100 numbers, but a computer can do that in no time. To evaluate the performance of linear search in the context of a computer, we should probably throw some harder and larger ranges of values at it.

The nice thing is, by evaluating a worst-case scenario, we don't actually have to do that work. We know what the result will be: for a given value of n, using linear search, it will take n tries to find the value in the worst-case scenario. So let's add a few values in here to build out this graph.

Okay, so we have a good picture of what this is starting to look like. As the values get really large, the running time of the algorithm gets large as well. Well, we sort of already knew that. Before we dig into this running time any deeper, let's switch tracks and evaluate Britney's work. By having something to compare against, it should become easier to build a mental model around time complexity.
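That claim, that linear search takes exactly n tries in the worst case, can be spot-checked for a few sizes of n. The sizes below are arbitrary choices for illustration, picked just to fill out a graph like the one described.

```python
def worst_case_tries(n):
    # Worst case for linear search: the target is the last of n values,
    # so every value gets checked before we find it.
    tries = 0
    for value in range(1, n + 1):
        tries += 1
        if value == n:
            return tries

for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}: {worst_case_tries(n)} tries")  # always equals n
```

Plotting these points gives a straight line: worst-case running time grows in lockstep with n, which is exactly the linear growth the graph in the video shows.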