Routines in Action 11:23 with Craig Dennis
There are a ton of routines already written for you, which saves you the work. Let's take a look at how to find them. Also, let's dive into reduction a bit.
My Notes for Universal Functions
## Universal Functions

* [ufuncs](https://docs.scipy.org/doc/numpy/reference/ufuncs.html) are commonly needed vectorized functions
* Vectorized functions allow you to operate element by element without using a loop
* The standard math and comparison operations have all been overloaded so that they can make use of vectorization
* Values can be [broadcasted](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html), or stretched to be applied to the ufuncs.
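To make the notes above concrete, here is a minimal sketch of vectorized operators and broadcasting. The array names and values are hypothetical, chosen only for illustration; they are not the course data.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
prices = np.array([1.0, 2.0, 3.0])

# Overloaded operators apply a ufunc element by element, no loop needed.
doubled = prices * 2             # the scalar 2 is broadcast across the array
same = np.multiply(prices, 2)    # the ufunc behind the * operator

# Comparison operators are vectorized too and return a boolean array.
mask = prices >= 2.0

# Broadcasting by rows: a 1-D array is stretched across each row of a 2-D array.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
shifted = grid + np.array([10, 20, 30])
```

The operator form and the explicit `np.multiply` call give the same result, so the operators are usually the more readable choice.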
So here's my notes on ufuncs, or universal functions. 0:00 They are commonly needed vectorized functions, again, 0:04 which allow you to operate element by element instead of using a loop, and 0:07 standard math and comparison operators like plus, minus, multiply, and 0:12 greater than, greater than or equal to, they've all been overloaded so 0:16 that they can make use of vectorization, and values can be broadcasted or 0:20 stretched to be applied to the vector. 0:24 So, remember that the scalar two got stretched all the way across, or 0:27 we did it by rows. 0:30 Awesome, so we saw some super powerful ufuncs, and let's go take a look at 0:31 some higher level routines that make use of them for common tasks. 0:36 Now, all this talk of trigonometry is making me want to go back and 0:41 take a look at one of those first multi-dimensional arrays that we created, 0:44 that students_gpas. 0:49 That was way up here at the top, wasn't it? 0:51 Let's get back up here. 0:53 We've done a lot in this course. 0:56 All right, so, here we go. 0:58 Here's our students_gpas. 1:00 Let's go ahead, let's take a look again one more time at what that is. 1:01 So we'll say students_gpas. 1:06 Right, so the zeroth row of this is me, 1:13 and then we had Vlada, and then we had Quesy. 1:16 Awesome. 1:21 One thing that we can do is we can find out the average or mean of this data. 1:23 So the way that you do that is just call a function on it. 1:28 Say students_gpas.mean(). 1:32 Whoops. 1:35 [LAUGH] That returned all of our scores averaged together, 1:37 which, at 3.805, is not bad for our cohort average, 1:41 however, I was hoping to get the mean of each row of these students. 1:44 Now, the great news is that there is an axis argument that we can pass and 1:49 it will do what we want. 1:54 The parameter though has been known to trip people up, so 1:57 let's focus a bit on the issue. 2:00 So, we have a two-dimensional array.
2:01 Our first dimension is students, and 2:05 our second dimension is GPA by year in school. 2:08 So we want to have the mean of the second dimension. 2:12 We want this dimension, this is what we want, the gpas is what we want. 2:19 So that would be axis one. 2:23 Remember that they are zero-based, so axis zero is the other way, so 2:26 axis one is this. 2:29 So let's go ahead and do that. 2:30 Let's say, the students_gpas.mean(axis=1). 2:32 And since we've got three results here, and we only have three students, 2:43 we know that it did the right thing. 2:46 It went across and did the average there, so there is 3.69. 2:47 Let's go ahead and say 3.7, and then there's 3.75 and 2:50 3.97, and by the way, that 3.7 didn't really mean anything. 2:55 Now a common mistake is that people think that they want to work with each row, so 3:00 they choose axis zero, but really what happens with axis zero, let's go ahead and 3:06 do that, we'll say (axis=0), is it ends up going this way, right? 3:13 So it's averaging along the axis this way, cuz it's reducing the values. 3:18 It's summing all these values up, but we want to go this way, so 3:24 when you think about the axis, remember it's what direction you're moving in. 3:26 Totally common hiccup. 3:30 Just remember to imagine the function happening across the dimension. 3:32 Now, you might want this sometimes though, right? 3:36 This (axis=0) will give you the average of all students by year. 3:40 If that's what you want, great, and then if you want to you can do (axis=1), and 3:45 it gives you the average of all years by student. 3:50 This type of function is known as a reduction operation. 3:55 The function reduces a set of values down to one. 3:58 The concept is that there is a function that takes two values, 4:02 a running total of all operations so far and the next value in the array-like object. 4:04 It performs the operation and 4:11 returns the total to be used in the next iteration, recursively.
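The axis discussion above can be sketched in a few lines. The `gpas` array here is a hypothetical stand-in for `students_gpas`; its values are invented for illustration, so the results will not match the 3.805 or 3.69 seen in the video.

```python
import numpy as np

# Hypothetical GPAs: one row per student, one column per year in school.
gpas = np.array([
    [4.0, 3.0, 3.5, 4.0],
    [3.0, 3.5, 4.0, 3.5],
    [4.0, 4.0, 3.5, 4.0],
])

# No axis: every value is averaged together into a single number.
overall = gpas.mean()

# axis=1 moves across the columns, giving one mean per student (3 results).
per_student = gpas.mean(axis=1)

# axis=0 moves down the rows, giving one mean per year (4 results).
per_year = gpas.mean(axis=0)
```

A quick sanity check is the result shape: the axis you pass is the one that disappears, so `axis=1` on a 3 by 4 array leaves three values.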
4:12 It might sound complicated, but it's actually what you would do in your head if 4:16 I asked you to add up all the values in this list. 4:20 It's probably easier to just see it in action, so let's do it. 4:22 All functions that are ufuncs have the ability to do this built in. 4:26 Here, let's go back down to the hundred days of code study minutes list. 4:31 Where is this at? 4:36 Let's go down to where we have the very last one of them. 4:37 Here we go, study_minutes list, here we go. 4:44 All right, so I'm going to add one below this. 4:50 Remember that our study_minutes array is a two-dimensional array. 4:55 The first dimension represents rounds or attempts, and 4:58 the second dimension is the minutes per day, and there are 100 days, so 5:02 let's simplify things first by using a single dimension. 5:06 I'm gonna grab the first round here, so we'll say study_minutes. 5:09 Now, if I asked you to total these minutes up, I bet you'd just start adding, and 5:15 remembering like this. 5:20 You'd say okay, so 150 + 60, that's 210, and 5:21 now I go 210 + 80, that's 290, and then I take the total of 290 and 5:26 I add 60 to get 350, and so on, and so on, and so on. 5:32 That is reducing in a nutshell. 5:36 If we continue all the way through the array, we'll have a total. 5:39 Now, I said that all ufuncs have the ability to do this reduction, and 5:42 the way they provide this functionality is by exposing some functions 5:46 off of the ufunc itself. 5:50 That sentence was pretty funky. 5:52 What we were doing was adding all the values up. 5:54 In that case, the ufunc that we would like to use is add. 5:57 So, let's do it. 6:01 So we'll say np.add.reduce, and 6:03 then we'll pass in our array. 6:08 And there it is, 440, and it did just like we were doing. 6:14 If you want to actually see each step, there is a function for 6:19 that available too on each ufunc. 6:22 So np.add.accumulate, and this will show you each step through.
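Here is a small sketch of the running-total idea next to `np.add.reduce` and `np.add.accumulate`. The `minutes` array holds only the first four values read aloud in the walkthrough (an assumption for brevity; the real row has 100 entries), so the total here is 350 rather than 440.

```python
import numpy as np

# Only the first four values read aloud above; the real row is longer.
minutes = np.array([150, 60, 80, 60])

# Reduction by hand: carry a running total into each next step.
total = 0
for value in minutes:
    total = total + value

# The same reduction, exposed as a method on the add ufunc.
reduced = np.add.reduce(minutes)

# accumulate keeps every intermediate total instead of only the last one.
steps = np.add.accumulate(minutes)  # 150, 210, 290, 350
```

The last element of `steps` always equals the value that `reduce` returns.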
6:24 So if we do, again, if we do study_minutes, 6:27 we'll see that we have 150, 210, 290, 350, and 6:36 then actually you'll see all of the zero adds that we had to do, and 6:39 yikes, you can see the waste of time that we made this do by adding all the zeros. 6:45 We could have filtered them out. 6:49 More in the teacher's notes. 6:50 Now, we want to get the sum of all these values together and 6:51 there is of course a routine that's super common, and it is called sum. 6:55 So if we just make this, well let's make a new one, 7:00 we'll save that there for us, np.sum(study_minutes). 7:04 We'll see that we get 440, which is exactly what we got when we did 7:10 the reduce, and the reduction works on multiple dimensions as well. 7:14 So we can just say np.sum(study_minutes), and 7:19 it will get, wow, 10,000 hours, must be a pro. 7:23 I think that's what Macklemore said, or Malcolm Gladwell, 7:28 I can't remember which one said that. 7:31 Reduction functions will almost always define an axis parameter. 7:33 So in this case we want to see the sum of all minutes by round. 7:38 So, that's axis=1, and 7:42 there we know that we did it right, because there are three results returned back, 7:48 and 440 is what we got for the first one. 7:51 Awesome. 7:54 Pretty handy, right? 7:55 And as you can imagine that mean function that we were just using is probably 7:57 using this sum function under the covers since, to calculate the mean, what 8:02 you do is you add all of the values and then divide by the total number of values. 8:07 But, what's nice is that you don't need to remember that formula. 8:11 Even though it is simple, 8:16 it's been abstracted away from you by simply calling the mean function. 8:18 You'll find that there are lots of formulas abstracted away from you in 8:22 the library. 8:26 In fact, let's pop over real quick to another popular page in the documentation. 8:27 I'm just gonna Google statistics numpy. 8:32 Here we go.
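As a sketch of the `sum` routine and its axis parameter, the block below uses a hypothetical 3 by 5 log in place of the real 3 by 100 `study_minutes` array. The values are invented; the first row is chosen to total 440 only so it echoes the walkthrough.

```python
import numpy as np

# Hypothetical log: 3 rounds (rows) by 5 days (columns), not the course data.
study_minutes = np.array([
    [150, 60, 80, 60, 90],
    [30, 0, 45, 0, 15],
    [60, 60, 0, 30, 0],
])

# Summing one round, then the whole array, then each round separately.
first_round_total = np.sum(study_minutes[0])
grand_total = np.sum(study_minutes)
per_round = np.sum(study_minutes, axis=1)

# The mean written out by hand: add the values, divide by how many there are.
manual_mean = np.sum(study_minutes, axis=1) / study_minutes.shape[1]
library_mean = study_minutes.mean(axis=1)
```

Whether `mean` calls `sum` internally is an implementation detail; the point of the comparison is only that both routes give the same numbers.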
8:40 There are tons of functions available for you here. 8:42 Since it's statistics, a bunch of these are reduction-based. 8:49 They reduce all the values down to one, and here's one that you'll see everywhere, 8:52 std, and while it's actually known to spread itself around, it's short for 8:57 standard deviation. 9:01 Here, let's pop in, so 9:03 it computes the standard deviation along the specified axis. 9:04 The measure of the spread of a distribution, which is great. 9:08 And if we scroll down here in the notes, 9:12 we can see that this is what has been calculated. 9:15 This is the formula. 9:19 Now, I kinda remember doing that in math class, 9:21 but the point is here, you don't need to know how to calculate it. 9:23 You want to know why to use it, 9:28 as we discussed when we first introduced the grade point averages or GPAs. 9:30 People struggle with math concepts when they are first introduced to them, and 9:34 I'm under the belief it's the memorization of the formula that most people 9:38 struggle with. 9:43 Typically, that's what you're tested on, 9:44 not the actual way to use the function in the real world. 9:46 Current learning science says that if you don't use it, you will lose it. 9:49 So if these equations feel a bit rusty and you haven't used them recently, 9:53 don't fret, your brain is just working correctly. 9:57 Most of the time, you hardly get a chance to see why in your math class. 10:00 You just focused on the how. 10:04 So with that said, let's pop up a couple levels here in our bookmarks to Routines. 10:06 This page here is a really great overview of how powerful this library is, 10:14 and a great look at some common abstractions. 10:20 So if we look down here, here are some Discrete Fourier Transforms. 10:23 Here's some financial functions, linear algebra, 10:30 input and output, logic functions, 10:35 polynomials, statistics, there's a lot in here.
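Returning to `std` from the statistics page, here is a minimal sketch comparing the routine with the calculation written out by hand. The sample values are hypothetical, and the hand-written line assumes the population form of the formula, which is what `np.std` computes when no extra arguments are passed.

```python
import numpy as np

# Hypothetical sample, chosen so the arithmetic is easy to follow.
values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# The routine from the statistics page.
spread = np.std(values)

# The same quantity spelled out: the square root of the mean squared
# distance from the mean.
by_hand = np.sqrt(np.mean((values - values.mean()) ** 2))

# Like other reductions, std accepts an axis argument.
table = np.array([[1, 2, 3],
                  [2, 4, 6]])
per_row = np.std(table, axis=1)
```

In practice the one-line `np.std` call is all you need; the expanded version is only there to show what has been abstracted away.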
10:44 Remember, you don't need to know all of these, 10:47 just be aware that what you are trying to do most likely already exists. 10:52 As you can see, 10:52 there are tons of directions that you can head with this library. 10:56 So stand on the shoulders of giants who built things out for you. 11:01 We talked way back when about how all sorts of different libraries accept and 11:06 return NumPy arrays. 11:08 Let's take a quick break and 11:09 take a look at one common use case, plotting values on a graph. 11:14 Well, that is, right after we jot down some notes. 11:17 Why don't you talk a bit about some common routines that you saw, and 11:20 talk a bit about reduction.