Introducing Arrays (11:37) with Craig Dennis
Allow me to introduce you to a powerful new data structure, the array.
Arguably, one of the most important things to get familiar with in NumPy is the array data structure that it introduces. Arrays are around in most programming languages, so you've probably heard the term floating around before. Now don't worry if you haven't, you'll be super familiar with the concept here shortly. Python actually has an implementation in its standard library, the array module.
Arrays are a way of organizing your data. An array is a data structure that allows you to store multiple items in a single variable. Which, yes, wise Pythonista, sounds an awful lot like a list, doesn't it? But there are some crucial differences between a list and an array: the array is much more restrictive than a list. Which, I realize, doesn't sound like a great selling point. But, in fact, it's actually these restrictions that make arrays the go-to solution for most Python projects involving data.
The list, as you know, can contain just about anything. And by anything, I mean the type of the elements in there doesn't matter, right? A single list can contain booleans, and strings, and any random old object. Arrays aren't like that. Every single item in the array must be of the same data type. That's the first major difference between the list and an array.
Now, the second one is that the array's length cannot be changed. Its size is immutable. You can't add items to the array or remove them from it. You can change the value of individual items, but the size cannot be changed, and it must be defined when the array is created.
When we work in these higher-level languages like Python, we don't really need to think about how things are working under the covers. That's because the language does a wonderful job of abstracting that away from us. Right? What do you say we take a peek behind the curtain real quick?
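To make those two restrictions concrete, here is a minimal sketch. The variable names and values are made up for illustration, and it assumes NumPy is installed and imported under its usual alias np.

```python
import numpy as np

# A list accepts mixed types and can grow in place.
anything = [True, "a string", 3.5]
anything.append(None)

# A NumPy array has a single data type shared by every element.
numbers = np.array([4.0, 3.5, 3.0])
numbers[0] = 2  # changing a value is allowed; the int is stored as the float 2.0

# The array object has no in-place append method.
has_append = hasattr(numbers, "append")

# np.append builds a brand new, longer array and leaves the original alone.
longer = np.append(numbers, 4.0)

print(numbers.dtype)              # float64
print(has_append)                 # False
print(numbers.size, longer.size)  # 3 4
```

Growing an array means paying for a new array and a copy, which is why the size is normally decided up front.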
When I create a new variable named age and set it to, say, 29 (I wish), what has really been abstracted away is that Python has to go to the memory on the computer and find some free space to store that number. Storing that number is actually another abstraction, isn't it? It's really just binary, right? It's a series of ones and zeros that gets put in there. And then later, when we go to access it again, our variable name references that memory location, so it can be used to go and read that value from memory. And because it knows the size required for its data type, it can grab the right set of 1s and 0s and have it represent the original value in our program, 29. I feel like I have aged a year just explaining that.
There are lower-level languages like C, where you actually have to do all the memory management. You have to go find the space yourself, put the value in there, and clean it up when you don't need it anymore. Now I'm certain that you could totally do that if you had to. But this work has been abstracted away from you, so you can focus on more important things.
One thing you probably don't think about often is how awesome these Python lists actually are. You create them, you append stuff to them, you remove stuff, you insert stuff. That stuff can be any size, so every time something is added, that same flow of finding a space with enough room to hold the data has to happen again. And there's even more to the list abstraction here. Each of those whatever-size objects can be iterated through, or accessed by index. It's pretty impressive when you stop to think about what we've been taking for granted. There's a lot of effort that is hidden behind that abstraction to make things seamless. That effort actually does have a cost. Namely, speed.
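As a rough illustration of the "it's really just binary" point from the age example, the sketch below writes 29 out as a single byte of 1s and 0s and reads it back. The one-byte width is an assumption chosen to keep the output short; a real Python int object takes up more memory than that.

```python
age = 29

# The value 29 written out as eight binary digits (one byte).
bits = format(age, "08b")
print(bits)  # 00011101

# Reading those same digits back as a base-2 number recovers the value.
restored = int(bits, 2)
print(restored)  # 29
```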
It's a lot of overhead, and it wouldn't be needed if only you knew how much space each element required, as well as the total number of elements, and that both would always remain the same. Hey, those restrictions sound pretty familiar, don't they? Those are the restrictions I was talking about that arrays have. When the array is defined or declared, the only type of element that it can store is defined, as well as how many elements will ever exist. Therefore, the space to store the whole array structure is known at creation time. What this does is allow each element to be stored next to the following one. The elements are contiguous, or one right after the other, with no space in between them. This makes finding exactly where an item lives in memory a very easy math equation: the start of the array plus the index times the size of one element. Much quicker than accessing an item in a list.
Let's go build an array.
Okay, I'm gonna give us a little more space. I'm gonna click View, and I'm gonna toggle the header. That feels better. Let's start by building a list that we can use for a quick refresher. So I want to store some floating point numbers, okay? How about we store my high school GPA scores? For those of you outside where I grew up, which is the United States, GPA stands for Grade Point Average. And we are, or at least were at the time, a little over-obsessed with them. Now, it's an average of all your grades in all subjects on a scale from one, being the lowest, to four, being the highest.
Okay, so let's declare a new list called gpas_as_list. And we'll use a hard bracket literal for our list. I'm doing this for high school. Now, this is the part of school before college. It's called secondary school elsewhere. Okay, so what did I earn? I don't remember really. I've forgotten it, cuz I didn't really use these things.
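Going back to that "very easy math equation" for contiguous elements, here is a small sketch of the idea. The values are hypothetical, and reading the element out of a raw byte copy is only a way to show the arithmetic, not how you would normally index an array.

```python
import numpy as np

# Hypothetical values; each float64 element takes up 8 bytes.
values = np.array([1.5, 2.5, 3.5, 4.5])

index = 2
# Byte offset of an element = index * size of one element.
offset = index * values.itemsize
print(offset)  # 16

# Copy the array's raw bytes and decode only the 8 bytes at that offset.
raw = values.tobytes()
element = np.frombuffer(raw[offset:offset + values.itemsize], dtype=values.dtype)[0]
print(element)  # 3.5

# NumPy exposes the same step size, in bytes per element, as strides.
print(values.strides)  # (8,)
```

Because every element has the same size and there are no gaps, no searching is needed: one multiplication lands on the right spot.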
So our brains tend to forget information that we don't use, by design. Well, I know I started off strong. So I'm going to give myself a, let's give me a 4.0. And then let's see. Math started to get a little weird. Remember, geometry and whatnot. Man, and chemistry. I must have dropped a bit. Let's do 3.2. Well shucks, I'll give myself some more precision there, so let's do 3.286. That's better. And then we had some trigonometry. It was still pretty rough, but I started building out that math foundation. So let's say 3.5. Let's not even talk about senior year's physics and calculus. Okay, fine, I'll add it, and I can because a list's size can be mutated, right? So I can just append to it. Let's say 4.0. I did have a hard time with calculus, but eventually it all started making sense, once I got that graphing calculator working. And I haven't thought about that sweet TI-82 since then.
I'm gonna make a little note here about this list, too. So we'll put a comment in here. We'll say, can have elements appended to it. So on top of appending, I can also insert into the list wherever I want. And, also, we can have multiple data types in this list, right? So let's do gpas_as_list.insert, because we can put them anywhere. And at the first position, we're gonna insert "Whatevs", cuz that's a string in there, so that's kinda strange. And I can pop it right out, right? So I can say, can have items removed. So we can say gpas_as_list.pop(1). And I'm gonna run this, and we'll see that we got back, on this last line here, from this pop, "Whatevs". Right. So we can stick whatever in there. So let's make sure we got everything looking like we want. Awesome.
Now let's create a NumPy array. A common way to create an array is to pass the array function an iterable, and we have one of those.
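Pulling the list steps from this part of the walkthrough together, the notebook cell probably looks something like the sketch below. The values are the ones mentioned out loud; the insert index of 1 is an assumption, chosen because pop(1) is what hands "Whatevs" back.

```python
# High school GPAs as a plain Python list.
gpas_as_list = [4.0, 3.286, 3.5]

# Can have elements appended to it
gpas_as_list.append(4.0)

# Can have multiple data types (index 1 is an assumption, see above)
gpas_as_list.insert(1, "Whatevs")

# Can have items removed
removed = gpas_as_list.pop(1)

print(removed)       # Whatevs
print(gpas_as_list)  # [4.0, 3.286, 3.5, 4.0]
```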
So let's go ahead. We'll make a brand new array, and we're going to call it gpas = np.array(gpas_as_list). So now that we have one, let's see what we can do with it. One of my favorite things about Jupyter Notebooks is that it lets you peep the documentation on an object. If I just say ?gpas. Nice, so this is an ndarray. Some documentation here, here's some parameters, here's dtype, the data type. And here's some attributes for it. So dtype: dtype describes the format of the elements in the array.
Awesome, so let's just go ahead and see what we have going on here. So if we say gpas.dtype: float64. The 64 there stands for the number of bits, remember, the 1s and 0s, that are required to store the data type. Speaking of which, looks like we also have itemsize, the memory use of each array element in bytes. So we can say gpas.itemsize. 8, 8 bytes, that makes sense, right? There are 8 bits in 1 byte, and our data type, float64, takes 64 bits. So 8 bytes is 64 bits. 8 times 8.
All right, so what else do we have in here? Size, that's a great one, so let's take a look. So we have gpas.size, 4. And NumPy does a pretty good job of being Pythonic, which means most objects just work like you think they should. Like, if I wanted to know the length, I should just be able to use the len function, right? So we could say len(gpas). And while not the best naming, we can also get the number of bytes that are required for the data. So if I say gpas.nbytes, we'll see 32. Which of course is just the length, 4, times the item size, which is 8. So 32, but it's still nice to have.
Okay, I just threw a ton of information at you. So let's go ahead and take a quick breather. But before we do that, let's do this. Let's do a little exercise of writing up a nice summary.
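For reference, the attributes explored in this stretch can be reproduced outside the notebook with a sketch like this one. It assumes NumPy is imported as np, which the np.array call implies, and it rebuilds the four-item list so the block runs on its own.

```python
import numpy as np

gpas_as_list = [4.0, 3.286, 3.5, 4.0]
gpas = np.array(gpas_as_list)

# In a Jupyter cell, ?gpas shows the documentation; that syntax is notebook-only.
print(gpas.dtype)     # float64
print(gpas.itemsize)  # 8 bytes per element, which is 64 bits
print(gpas.size)      # 4
print(len(gpas))      # 4
print(gpas.nbytes)    # 32, which is size * itemsize
```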
You know, try to capture our train of thought. We've found that reflecting on what you just learned will help you when you come back to it later. I'm gonna come up here, so I'm gonna get in command mode by pressing Escape. See how it's blue? I'm going to come up here, and I'm going to go right above this gpas list here. And I'm going to make a new cell. So I'm going to make a new cell above this, so A. I'm going to press the A key, and there's a new cell above it. I'm going to press Enter, and I'm going to switch this to Markdown. And I'm going to write down some notes here, so I'll say ## for a second-level heading. We'll say Differences between lists and NumPy Arrays.
And during the break, I'm gonna do some reflection. And I'll share with you what I jotted down in the next video. But I'd like for you to give that a shot, too. Treat it free-form, just jot down some thoughts. And like I said, I'll share mine with you in the next video. Sound good? Okay, so do that. And when we come back, we'll start our 100 Days of Code Study Log.