Indexing
15:54 with Craig Dennis
The ndarray data structure is quite Pythonic, so indexing works as expected. NumPy introduces some handy shortcuts and powerful indexing options. Fancy.
A different solution
There are actually a couple of ways to add a new axis to your array other than wrapping it in brackets (making it a list of lists), as in the video.
Another popular solution is to use the np.newaxis constant. The code would look something like this:
# Slice all the rows and add a new axis
fake_log[:, np.newaxis]
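To see what the new axis does to the shape, here is a minimal sketch. The `fake_log` below is a hypothetical stand-in built with `np.arange`, not the random data from the video. Note that where the new axis goes matters: placing it last gives a column, while placing it first gives a single row, which is the shape that wrapping the array in brackets produces.

```python
import numpy as np

# Hypothetical stand-in for the lesson's fake_log: a 1-D array of 100 values.
fake_log = np.arange(100, dtype=np.uint16)

# Slice everything, then add a new trailing axis: 100 rows of one value each.
as_column = fake_log[:, np.newaxis]

# Put the new axis first instead: one row of 100 values,
# the same result as np.array([fake_log]).
as_row = fake_log[np.newaxis, :]

print(fake_log.shape)   # (100,)
print(as_column.shape)  # (100, 1)
print(as_row.shape)     # (1, 100)
```

If the goal is to append the log as a new row to a `(2, 100)` array, the `(1, 100)` form is the one whose dimensions line up.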
My Notes for Multidimensional Arrays
## Multidimensional Arrays
* The data structure is actually called `ndarray`, representing any **n**umber of **d**imensions
* Arrays can have multiple dimensions, you declare them on creation
* Dimensions help define what each element in the array represents. A two dimensional array is just an array of arrays
* **Rank** defines how many dimensions an array contains
* **Shape** defines the length of each of the array's dimensions
* Each dimension is also referred to as an **axis**, and they are zero-indexed. Multiples are called **axes**.
* A 2d array is AKA **matrix**.
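The terms in these notes map onto array attributes. A small sketch, using a hypothetical 3 x 4 array named `grid` that is not part of the lesson's study log:

```python
import numpy as np

# A hypothetical array with 3 rows and 4 columns, used only for illustration.
grid = np.zeros((3, 4), dtype=np.uint16)

print(grid.ndim)      # 2 -> rank: the number of dimensions (axes)
print(grid.shape)     # (3, 4) -> length of axis 0, then length of axis 1
print(grid.shape[0])  # 3 -> axis 0 (rows)
print(grid.shape[1])  # 4 -> axis 1 (columns)
```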
[MUSIC] Well, it happened to me. I committed to code an hour every day for the 100 day code challenge. And sure enough, I didn't make it. I missed two days in a row thanks to a Netflix binge. We've all been there. And sure enough though, technically I broke my commitment. I was pretty bummed about it. I was really enjoying the challenge. Now, the good news is, you can start your challenge over without losing your previous wins. What you do is you just start a new round.

[SOUND] Now, if you'd started the challenge, you might have seen people using that syntax in their tweet, r1d6, which stands for round one, day six. But that brings up an interesting point. How would we represent that in our study log array? Currently each of our columns is used to represent a day. Remember, it's a 100 element long array filled with zeros. So how could we also track rounds? You know what? You're right, we just need a new dimension, don't we? Good thinking. Come help me out with this so I can get back to studying.

Before we get to upgrading our study log, I'm going to review my reflections of multidimensional arrays. I feel like I forgot a bunch of the stuff while I was binging that Netflix series. Good thing I wrote it all down for my own recall. So the data structure is actually called an ndarray, representing any number of dimensions. That's where the nd comes from. Right, and arrays can have multiple dimensions. And you declare them when you create them, cool. Dimensions help define what each element in the array represents. So a two-dimensional array is just an array of arrays, like multidimensional lists. Rank defines how many dimensions an array contains. And shape defines the length of each of the array's dimensions. Remember that was like three, four that we saw.
And each dimension is also referred to as an axis. And they are zero indexed. Multiples are called axes even though it looks like Paul Bunyan's axes. And a two-dimensional array is also known as a matrix. Awesome. I am refreshed and ready to go.

Now there's a couple of ways to add a new dimension. Let's explore one of them. Let's get back down to where our study log is. Here we go. Here's our study_minutes array. Go ahead and, I think, get rid of this where we print it out one more time and run that one more time. There we go.

So let's make a brand new array which consists of our array and a new row of zeroes. So we could do that pretty easily. We'll just, we'll reassign, we'll relabel. We'll remove the study_minutes label from the other one and put it on this new one. We'll say np.array. And remember arrays accept iterables. And our study_minutes is iterable. So we'll say study_minutes here. So that's an array of an array. So we'll do an np.zeros(100). And we'll do np.uint16. Because that's what the original data type was there. And let's go ahead and run it. It seems like it worked.

Let's go ahead, let's check the shape of that. So, we'll say study_minutes.shape. Awesome. (2, 100). So, our first axis, 2, will represent the round. And our second axis will represent the day. So this 100 here, right? So this is axis zero and axis one. We have two dimensions, and our array is now of rank two.

So I did in fact complete an hour on that first day of round two. So that's R2D1. Do you remember how to access that element? Why don't you go ahead and do that? Why don't you set round 2 day 1 to 60? So pause me and give it a go. When you're ready, unpause me. Ready? Now remember it's zero-based. I hope you didn't fall for that. I hate trick questions.
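The array creation walked through above can be sketched roughly like this. The starting `study_minutes` (100 zeros of type `uint16`) is assumed from the description earlier in the lesson.

```python
import numpy as np

# Assumed starting point from earlier in the course: 100 days of zeros.
study_minutes = np.zeros(100, dtype=np.uint16)

# Rebind the name to a new 2-D array: the old log plus a fresh row of zeros.
study_minutes = np.array([study_minutes, np.zeros(100, np.uint16)])

print(study_minutes.shape)  # (2, 100): axis 0 is the round, axis 1 is the day
print(study_minutes.ndim)   # 2
print(study_minutes.dtype)  # uint16
```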
So let's study_minutes. And we want to get the second round, so one. And then we want the first day, so that's zero. And it was 60 that we're gonna set that to. And then let's go ahead and let's peep it. Let's see what we've got going on here. Awesome, so this is the first round, right? The first round goes all the way there and here's the second round. There's the 60 that we got in there, great. We did it, I'm going to come back up. I'm going to get rid of this displaying of it. And we will run that one more time to get our space back.

So, you did it, great. But this syntax, all these brackets, that's kind of clunky, isn't it? Well, NumPy introduces a fun shortcut of a comma. So if we wanted to get the value from here we could just say study_minutes. And we could say [1,0], which is the second row and the first column. Let's see if that works and it does. It feels better, right, and it's kinda cute.

Now what's happening here, it's a little subtle. It might look like magic, especially if you haven't seen this before. But if you separate values with a comma, like we did here just in plain old Python, check it out, a tuple is assumed. So really this is just a tuple in here. But we don't want to add all of those, these parens. We don't need them. So, it looks nicer, like this. It's easier to read. So, in this case, each of these values represents an index to each of the axes.

But, there's another way that these indices can work. And I want to bring this up now because it can be a bit confusing if it catches you off guard. There's a bunch of stuff that happens within these hard brackets and we haven't even got to slicing yet. It's going to sound like I'm making this up. But I want to talk to you about a special type of indexing called fancy indexing.
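Here is a short sketch of the two spellings side by side. It assumes a fresh `(2, 100)` array of zeros rather than the exact notebook state from the video.

```python
import numpy as np

# Assumed shape from the lesson: 2 rounds by 100 days.
study_minutes = np.zeros((2, 100), dtype=np.uint16)

# Chained brackets: take row 1 (round two), then element 0 (day one).
study_minutes[1][0] = 60

# Comma shortcut: one index per axis.
print(study_minutes[1, 0])      # 60

# Plain Python packs comma-separated values into a tuple,
# so indexing with an explicit tuple is the same lookup.
position = (1, 0)
print(study_minutes[position])  # 60
```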
In order to show this off, I'd like to start with an array that is a single dimension. And since our example arrays are now both two-dimensional, I thought I'd show off a handy tool for creating some random data to explore. Let's do this. Let's make a new study log. But instead of just zeros, let's make it contain fake but reasonable data. Now this random trick is a great way to create data to play with.

So here we go. So you can actually create a random number generator and seed it in a way that will always produce the same results. That way we can have the same random data, you and me. The seed value. So let's do this. So it's np.random and we're gonna create a thing called RandomState. And we'll pass it a seed value. And the seed value I'll use, let's use 42. You know, the answer to life, the universe, and everything.

And now, let's make a new fake log. We'll make a round that was just totally fake. And we'll use random integers. So we'll say fake_log = rand.randint, cuz we want an integer. And we'll make the first parameter, it's the low value, like how low should it ever go. And we want it between 30 and then a high of 180. That's three hours. That's tons of studying. And then we'll set the size to be 100. Say size=100. And we'll set the data type or dtype to the same thing that we've been using, uint16, the unsigned integer. Let's go ahead and do that, and then let's peep it. We'll do fake_log. Here we go.

[LAUGH] This is nice. Aren't you glad we didn't have to type out all of these values? Aren't you glad we didn't have to study all those minutes? Our random values here have some that went under the hour. So remember, you need at least 60 minutes to call the day a success.
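The seeded generator described above might look like the following sketch. The name `rand` comes from the transcript; the second generator, `again`, is added here only to demonstrate that the same seed reproduces the same data. One detail worth knowing: the high value passed to `randint` is exclusive, so the largest possible value here is 179.

```python
import numpy as np

# Seed the generator so everyone gets the same "random" data.
rand = np.random.RandomState(42)

# 100 fake study sessions, from 30 up to (but not including) 180 minutes.
fake_log = rand.randint(30, 180, size=100, dtype=np.uint16)

# A second generator with the same seed reproduces the same values.
again = np.random.RandomState(42).randint(30, 180, size=100, dtype=np.uint16)

print(np.array_equal(fake_log, again))             # True
print(fake_log.min() >= 30, fake_log.max() < 180)  # True True
print(fake_log.shape, fake_log.dtype)              # (100,) uint16
```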
So let's say that I wanted to grab a few of these, just to point them out, so I can encourage myself to reach the full hour, ones that are super close to the hour, but didn't quite reach the goal. I could use some motivation. I may want to build a list of these values. So one thing that I could do is just build a list using standard indexing. So let's see, we wanna have fake_log. And we have to have this value, this 44, so this is 0, 1, 2, 3. So we have 3. And then we want the next value, 4, 5, 6, 7, 8. 50 that was so close. So we'll do fake_log. And then that's 8, right? You should have a list of 44 and 50. There we go. Those are so close to making that hour.

But this indexing, it's pretty basic, right? There is nothing fancy about it. It's just pulling some values out. Well, lift your finger like you would if you were drinking tea as you type out this next one, cuz things are about to get fancy. What you can do is type fake_log. And in here we're going to use hard brackets, say [3, 8], and take a look at what we got back. There is our 44 and our 50. It's the same values that we just selected. Pretty fancy, right? So again, using a list as a parameter here is saying return me a new array with the values from index three and eight.

And check this out, it gets fancier. The shape of the resulting array can be defined by the shape of the index array. So this is just a two-element array, right? Let's build an index array in the shape that we want. So we'll make a 2x2 matrix. So let's do that. So we'll say the index, it can be named anything. We'll declare an array of lists of lists. So here we go, np.array. We'll say 3, 8 like we had before. So we had those two values. And let's just go ahead and grab the first two values for the second row.
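A compact sketch of basic versus fancy indexing. The array `log` below is a hypothetical ten-element stand-in: only the values at positions 0, 1, 3, and 8 are taken from the transcript, and the rest are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the start of fake_log.
# Positions 0, 1, 3 and 8 match the values mentioned in the video.
log = np.array([132, 122, 128, 44, 136, 129, 101, 95, 50, 132], dtype=np.uint16)

# Basic indexing: one element at a time, collected into a Python list.
by_hand = [log[3], log[8]]

# Fancy indexing: a list of positions returns a new array.
picked = log[[3, 8]]
print(picked)            # [44 50]

# The result takes on the shape of the index array.
index = np.array([[3, 8], [0, 1]])
shaped = log[index]
print(shaped)
# [[ 44  50]
#  [132 122]]
print(shaped.shape)      # (2, 2)
```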
So we will have zero and one. And then we can use this array for our fancy index. So we say fake_log[index]. And you'll see back we got our 44 and our 50. And we'll get 132 and 122, which I'm pretty sure are the first two, 0 and 1. So we got 44, 50 and 132, 122. And we pulled it out in a special order. And you'll notice that the shape of the array is the same as the index array that we wanted. Ooh la la, c'est chic.

Now that we already have our study group log in the right rank, we already have two dimensions, let's just go ahead and append this fake_log to the study group array. We'll just redefine the study log because remember these array sizes are always immutable. So we can't just add new elements to them. We need to create a new one. So there's a function right off the NumPy module called append. So we're going to say study_minutes because we're going to override that again, equals np.append. And what it does is it takes an array, our study_minutes array. And it takes what you want to append. So I guess we want to append our fake_log. And then you tell it which axis to append on, and we want that on axis zero.

I've left an intentional bug here for you to see. Let's go ahead, let's run this. Yikes. Look at this mess. All the input arrays must have the same number of dimensions. Now this is a common Stack Overflow question. So I figured I'd save you some time in the likelihood that you might run into a problem like this. So the solution here is just to make sure that the dimensions line up. So we actually need two dimensions, right? Our fake_log currently is just one dimension. So one hack is you could just make this a list of lists. There we go. And if we take a look at our array now, we'll see that it has the shape of three arrays, awesome.
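The append bug and its fix can be reproduced with a sketch like this one. Both arrays are stand-ins (zeros and an `np.arange` sequence), not the data from the video, and the `failed` flag is added only to make the error visible without stopping the script.

```python
import numpy as np

# Stand-ins for the lesson's arrays: a (2, 100) log and a 1-D row to add.
study_minutes = np.zeros((2, 100), dtype=np.uint16)
fake_log = np.arange(100, dtype=np.uint16)

# Appending a 1-D array to a 2-D array along axis 0 fails,
# because the number of dimensions does not match.
failed = False
try:
    np.append(study_minutes, fake_log, axis=0)
except ValueError as error:
    failed = True
    print("append failed:", error)

# Wrapping fake_log in a list gives it a second dimension: shape (1, 100).
# np.append returns a new array, so rebind the name to the result.
study_minutes = np.append(study_minutes, [fake_log], axis=0)
print(study_minutes.shape)  # (3, 100)
```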
So looking at this, I see that this one here, this is round 2, day 2, or R2D2, everyone's favorite malfunctioning droid. We absolutely have to set that. That is one of the most favorite days of these rounds. So let's set that one. So that would be study_minutes[1,1] = 360, right? Round two, day two equals 360.

Now note here we aren't using a fancy index. We are using a comma or tuple shortcut. It does get confusing so you really need to pay attention to what's going on between these outer brackets, right? You need to see if it's a tuple or it's a list. In fact, you can even do multidimensional fancy indexing by specifying fancy indexes for each dimension. Now, it’s a bit out of the scope for this course. So make sure to check the teacher’s notes for more on that.

And now, looking at our study log, this is about right. Let's go ahead, let's run this one more time. I'm going to go ahead, I am going to go in command mode. I'm going to press dd to delete that. And let's show our study minutes one more time. This looks just about right to me. This first round here, I let things slip. Right, I got to here. I did that dang Netflix binge of Stranger Things. And then I worked a bit too hard on the second round, right? I did 360 minutes on this day and I burned myself out. But I kept on trucking and the third time's a charm. Feels pretty realistic to me. This is probably what a lot of people's rounds look like. You have to just keep trying and eventually you'll get it. No matter how many times I learn that lesson, I need to keep reminding myself. Stick with it and you'll get it.
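The tuple-versus-list distinction is easy to miss, so here is a small sketch on a hypothetical 3 x 4 array named `grid` (not the study log). The last line illustrates the multidimensional fancy indexing mentioned above, as one possible reading of it: one index list per axis, paired up element by element.

```python
import numpy as np

# Hypothetical 3 x 4 array holding the values 0 through 11.
grid = np.arange(12, dtype=np.uint16).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Tuple shortcut: row 1, column 1, a single element.
single = grid[1, 1]
print(single)        # 5

# List: a fancy index on axis 0, so this returns row 1 twice.
rows = grid[[1, 1]]
print(rows.shape)    # (2, 4)

# One index list per axis: picks elements (0, 1) and (2, 3).
pairs = grid[[0, 2], [1, 3]]
print(pairs)         # [ 1 11]
```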
Now, I don't know about you, but some of these minutes on days where I almost made it, like where they were almost an hour but not quite like we saw, they would really help me keep going during the times where I was struggling to stick with it. Wouldn't it be cool if we could just somehow pick out the values just under 60 minutes to motivate ourselves, but not have to pick them out by hand, visually, like we did? Well you can, and let's do that right after this quick break.

Ooh, and also, I'm gonna write out some notes on our array creation, appending, and indexing skills. So let's come up here to where we started doing this. Let's go right here. We're gonna go ahead. And I'm gonna get into command mode, go one above. Change this over here to markdown. And I wanna talk about array creation. Then let's also talk about indexing over here. So in the creation let's talk about random state. And we'll talk about how to append rows. And then indexing wise let's talk about the shortcut tuple one, right? We'll talk about the tuple shortcut and I also wanna talk about fancy indexing.

Wow, that was a lot that we covered, good job. I'm gonna swing back after a quick break and flesh this out for better notes. And then I'll share them with you. Why don't you do the same, and I'll see you in a bit.