1 00:00:00,470 --> 00:00:02,360 You know how when you don't use a skill for 2 00:00:02,360 --> 00:00:06,075 a while you need to practice it a bit before you can get right back to doing it 3 00:00:06,075 --> 00:00:09,080 flawlessly, like no matter how good you ever were at it. 4 00:00:09,080 --> 00:00:11,699 It's where that saying use it or lose it comes from. 5 00:00:11,699 --> 00:00:13,932 Well for me, in addition to juggling, 6 00:00:13,932 --> 00:00:17,010 that skill I need to dust off is slicing an object. 7 00:00:17,010 --> 00:00:20,435 You've probably used slices before on lists or tuples, but 8 00:00:20,435 --> 00:00:24,478 if you haven't done it for a while, you could be a little [SOUND] klutzy. 9 00:00:24,478 --> 00:00:26,530 This is where practice comes in. 10 00:00:26,530 --> 00:00:29,710 And I want to encourage you to practice and test things out. 11 00:00:29,710 --> 00:00:33,210 Let's practice a bit of slicing with lists first just to warm up. 12 00:00:33,210 --> 00:00:35,910 And then let's practice with some multi-dimensional arrays. 13 00:00:35,910 --> 00:00:40,015 It's like starting with these practice juggling balls instead of, say, chainsaws. 14 00:00:42,000 --> 00:00:46,820 Jupyter notebooks are excellent for this sort of practice but actually real quick. 15 00:00:46,820 --> 00:00:48,920 Before we get into our slice practicing, 16 00:00:48,920 --> 00:00:53,400 let's make use of another great feature of notebooks, notes. 17 00:00:53,400 --> 00:00:56,510 So here are mine on boolean array indexing. 18 00:00:56,510 --> 00:01:01,759 So you can create a boolean array by using a comparison operator on an array. 19 00:01:01,759 --> 00:01:04,831 And you can use boolean arrays for fancy indexing like we saw, 20 00:01:04,831 --> 00:01:07,560 kinda like a where clause there from SQL. 21 00:01:07,560 --> 00:01:11,433 And boolean arrays can be combined using bitwise operators, that's the ampersand for and, 22 00:01:11,433 --> 00:01:13,197 and then the pipe sign there is or.
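Here's a quick sketch of those boolean indexing notes in code. The `scores` array and its values are made up for illustration; they aren't from the notebook in the video.

```python
import numpy as np

# Hypothetical sample data, not from the video.
scores = np.array([55, 72, 88, 91, 64])

# A comparison operator on an array produces a boolean array of the same shape.
passing = scores >= 70

# Indexing with the boolean array keeps only the positions that are True,
# kind of like a WHERE clause in SQL.
passed = scores[passing]

# Combine conditions with & (and) and | (or). Each comparison gets its own
# parentheses because the bitwise operators bind tighter than comparisons.
middle = scores[(scores >= 60) & (scores < 90)]
extremes = scores[(scores < 60) | (scores > 90)]

print(passing)
print(passed)
print(middle)
print(extremes)
```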
23 00:01:13,197 --> 00:01:15,904 And remember, don't use the and keyword, and 24 00:01:15,904 --> 00:01:18,555 also remember to use the order of operations. 25 00:01:18,555 --> 00:01:21,300 Otherwise, you get that really weird value error that we saw. 26 00:01:21,300 --> 00:01:26,094 And even though boolean indexing returns a brand new array, 27 00:01:26,094 --> 00:01:32,570 like a copy, you can update an existing array by using a boolean index. 28 00:01:32,570 --> 00:01:35,620 So let's go down to the bottom here, the very last row, and 29 00:01:35,620 --> 00:01:37,884 if you want to clean some of these up you can. 30 00:01:37,884 --> 00:01:42,190 I'm going to get rid of some of these, just DDD in command mode there. 31 00:01:42,190 --> 00:01:45,302 All right, so let's make a new list. 32 00:01:45,302 --> 00:01:48,478 My go-to list is this fruit. 33 00:01:48,478 --> 00:01:54,869 We'll go ahead, we'll get apple, banana, cherry, and durian. 34 00:01:58,249 --> 00:02:02,480 All right, now let's warm up those slicing skills. 35 00:02:02,480 --> 00:02:07,416 Now, for some reason, I can never remember if slices are inclusive or exclusive. 36 00:02:07,416 --> 00:02:10,930 So when I am not sure, one thing I like to do is just try. 37 00:02:10,930 --> 00:02:12,750 You're not going to break anything, right? 38 00:02:12,750 --> 00:02:15,770 All right, so let's get a slice of this list. 39 00:02:15,770 --> 00:02:21,610 I want to get a portion of this list, like just the second and third value. 40 00:02:21,610 --> 00:02:25,670 So let's see, I know that things are 0 based. 41 00:02:25,670 --> 00:02:30,190 So I know that I want to get the second one, I'm going to start at 1. 42 00:02:30,190 --> 00:02:33,230 And then I want a colon signifying up to. 43 00:02:33,230 --> 00:02:35,900 And now, is it inclusive or exclusive? 44 00:02:35,900 --> 00:02:38,240 I don't know, I'll try inclusive. 45 00:02:38,240 --> 00:02:40,200 So, let's see 1 to 2.
46 00:02:40,200 --> 00:02:43,691 No, it is exclusive, [SOUND] I missed that by much. 47 00:02:43,691 --> 00:02:50,440 So if we come back and we say 1 to 3, we'll see that we get banana and cherry. 48 00:02:50,440 --> 00:02:53,640 So it's like up to but not including. 49 00:02:53,640 --> 00:02:58,285 That's what you can read with this is like 1 up to but not including 3, phew. 50 00:02:58,285 --> 00:03:03,276 All right, and then again if you leave either side blank, so if we say fruit and 51 00:03:03,276 --> 00:03:04,220 we go up to 3. 52 00:03:04,220 --> 00:03:08,718 We'll get everything up to the third one. 53 00:03:08,718 --> 00:03:13,750 Which, again, not the 3rd one, the 4th one, this is 0 based. 54 00:03:13,750 --> 00:03:18,650 And then if you do it at the start, you can say, fruit from 3 onwards, 55 00:03:18,650 --> 00:03:21,530 so we'll start at the 3rd and go to the end. 56 00:03:21,530 --> 00:03:24,758 And there's durian, mm, the cheese of fruit. 57 00:03:24,758 --> 00:03:27,050 So just a plain colon then, 58 00:03:27,050 --> 00:03:32,260 if you just use a colon it will give you a copy of the array. 59 00:03:32,260 --> 00:03:33,230 It's basically a copy right? 60 00:03:33,230 --> 00:03:34,510 From start to end. 61 00:03:34,510 --> 00:03:38,835 In fact this is typically used as a way of copying standard Python lists. 62 00:03:38,835 --> 00:03:43,049 So if I store that in a variable, so we'll say copied = fruit, and 63 00:03:43,049 --> 00:03:45,628 then the colon, so give me everything. 64 00:03:48,694 --> 00:03:52,360 And then we go ahead and what if we modify that copy? 65 00:03:52,360 --> 00:03:59,780 If we say copied[3], that last, that durian, we're going to set that to cheese. 66 00:03:59,780 --> 00:04:06,610 If we set that to cheese, and then we take a look at, let's see, slicing. 67 00:04:07,860 --> 00:04:14,060 A list returns a copy, and I'm gonna spell slicing correctly. 68 00:04:15,180 --> 00:04:16,575 Slicing a list returns a copy. 
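The list slices tried so far can be collected into one cell, roughly like this. The variable names other than `fruit` and `copied` are just labels added for this sketch.

```python
fruit = ["apple", "banana", "cherry", "durian"]

# Start at index 1 and go up to, but not including, index 3.
second_and_third = fruit[1:3]

# Leave the start blank to begin at the front of the list.
first_three = fruit[:3]

# Leave the end blank to run through to the last item.
from_index_three = fruit[3:]

# Slicing a list returns a copy
copied = fruit[:]
copied[3] = "cheese"

print(second_and_third)
print(first_three)
print(from_index_three)
print(fruit, copied)
```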
69 00:04:16,575 --> 00:04:19,834 So if we look here, if we look at fruit and the copied. 70 00:04:19,834 --> 00:04:22,434 We're just gonna make a new tuple of those two so 71 00:04:22,434 --> 00:04:24,849 we can look at the values next to each other. 72 00:04:24,849 --> 00:04:29,886 We'll see that fruit was not changed when we changed the copy of that. 73 00:04:29,886 --> 00:04:34,190 They're actually two different places in memory. 74 00:04:34,190 --> 00:04:37,960 Okay, and there is a third part to this slice, the step. 75 00:04:37,960 --> 00:04:41,950 Now, by default, this is 1, it moves 1 element at a time. 76 00:04:41,950 --> 00:04:45,740 But I can add a second colon, representing a step. 77 00:04:47,600 --> 00:04:48,930 Let's say that I wanted to get every other one. 78 00:04:48,930 --> 00:04:57,190 So we'll say fruit[::2]. There we go, apple, cherry, and we skipped banana and durian. 79 00:04:57,190 --> 00:05:00,960 So instead of stepping through each element one by one, we went by two, and 80 00:05:00,960 --> 00:05:03,990 you can also use a negative to walk backwards, right? 81 00:05:03,990 --> 00:05:09,416 So if we say fruit[::-1], we'll see that it runs backwards. 82 00:05:10,867 --> 00:05:13,667 And there we go, I think I'm refreshed, 83 00:05:13,667 --> 00:05:17,388 I think I'm ready to start juggling those chainsaws. 84 00:05:17,388 --> 00:05:23,470 My de facto go-to test list is this one of fruit. 85 00:05:23,470 --> 00:05:27,631 When it comes to creating a NumPy array to explore, the most common way is to use 86 00:05:27,631 --> 00:05:29,071 a function named arange. 87 00:05:29,071 --> 00:05:32,408 Which is very similar to Python's range function, but 88 00:05:32,408 --> 00:05:34,870 instead it returns an ndarray. 89 00:05:34,870 --> 00:05:41,430 So if I say np.arange and I pass it in the ending value there. 90 00:05:41,430 --> 00:05:46,328 So it goes up to and not including 20.
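As a small sketch, the step part of a slice and the `np.arange` call from this stretch look like this. I'm assuming a step of 2 for the "every other one" example, and the result names are only labels for the sketch.

```python
import numpy as np

fruit = ["apple", "banana", "cherry", "durian"]

# A second colon adds a step; a step of 2 takes every other item.
every_other = fruit[::2]

# A negative step walks the list backwards.
backwards = fruit[::-1]

# arange works like range, but it returns an ndarray.
# The ending value is not included.
numbers = np.arange(20)

print(every_other)
print(backwards)
print(numbers)
```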
91 00:05:46,328 --> 00:05:50,030 You'll see that we get all the values up to and not including 20. 92 00:05:50,030 --> 00:05:54,436 So let's go ahead and create a new practice right here. 93 00:05:54,436 --> 00:06:00,495 And we will make it np.arange, and we'll go up to our magic number, 42. 94 00:06:02,124 --> 00:06:07,994 And you can actually change the shape property on an array. 95 00:06:07,994 --> 00:06:12,948 So we can say practice.shape, and we're going to assign it, 96 00:06:12,948 --> 00:06:15,790 let's go 7 rows, 6 columns. 97 00:06:15,790 --> 00:06:19,130 So let's do that, and then let's take a look at what practice looks like. 98 00:06:20,960 --> 00:06:26,670 Awesome, so let's go ahead and get this number 13 here, lucky 13. 99 00:06:26,670 --> 00:06:30,230 So this is a two-dimensional array, or matrix. 100 00:06:30,230 --> 00:06:32,870 And really, it's just an array of arrays. 101 00:06:32,870 --> 00:06:38,636 So we first need to get this row here, so this is 0, 1, 2. 102 00:06:38,636 --> 00:06:42,262 So we have practice[2]. 103 00:06:44,327 --> 00:06:46,175 Okay, let's make sure we got it. 104 00:06:46,175 --> 00:06:47,674 Yep, and that's just an array. 105 00:06:47,674 --> 00:06:54,061 So we need to get the 0, 1, we need to get the 1th there [LAUGH]. 106 00:06:54,061 --> 00:06:59,184 There we go, and there's 13 and that is entirely too many hard brackets, 107 00:06:59,184 --> 00:07:02,650 so let's express it with just a comma, so 2, 1. 108 00:07:02,650 --> 00:07:07,050 Awesome, so as you can expect, the ndarray is Pythonic, so it too allows for slicing. 109 00:07:07,050 --> 00:07:11,320 So if we wanted to start at the 3rd row here, and go to the 5th, 110 00:07:11,320 --> 00:07:12,861 we could just do this. 111 00:07:12,861 --> 00:07:17,030 We could say practice[2:5]. 112 00:07:17,030 --> 00:07:22,926 And there we go, we've got just those rows, awesome. 
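Pulled together, the practice array and the lookups so far go something like this. The names `row`, `chained`, `value`, and `rows` are just labels for this sketch.

```python
import numpy as np

practice = np.arange(42)
practice.shape = (7, 6)  # 7 rows, 6 columns

# Rows are 0 based, so index 2 is the third row.
row = practice[2]

# Index the row, then index into that row...
chained = practice[2][1]

# ...or express the same lookup with just a comma.
value = practice[2, 1]

# Slicing works on the rows: start at row 2, stop before row 5.
rows = practice[2:5]

print(practice)
print(row)
print(chained, value)
print(rows)
```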
113 00:07:22,926 --> 00:07:29,345 And if we wanted to just get this column here we could just put comma 3. 114 00:07:32,522 --> 00:07:37,580 Awesome, and we can also slice this column dimension. 115 00:07:37,580 --> 00:07:40,700 So let's get the 4th column until the end. 116 00:07:40,700 --> 00:07:43,980 So we'll say, 3 until the end, and there we go. 117 00:07:43,980 --> 00:07:47,750 Now we have 15, 16, 17 and then if we wanted to step 118 00:07:47,750 --> 00:07:51,165 every other column we could say 3::2. 119 00:07:52,230 --> 00:07:53,250 There we just have those two. 120 00:07:53,250 --> 00:07:57,510 Look at that, we stepped right over that column, right? 121 00:07:57,510 --> 00:07:59,597 So we skipped right over this column, we limited it. 122 00:07:59,597 --> 00:08:03,950 And then we sliced it and we got these three and then we skipped over that. 123 00:08:03,950 --> 00:08:09,960 So we got just this 15 and 17, 21 and 23, and 27 and 29, [LAUGH] pretty great, right? 124 00:08:09,960 --> 00:08:12,500 Now one thing that'll bite you 125 00:08:12,500 --> 00:08:17,410 if you don't know about it is that slices in NumPy don't return a copy. 126 00:08:17,410 --> 00:08:19,680 They return a view of our data. 127 00:08:19,680 --> 00:08:22,930 Now this is different than we saw in the standard Python list. 128 00:08:22,930 --> 00:08:23,687 So let's explore this real quick. 129 00:08:26,345 --> 00:08:30,660 I'm gonna write not_copied, and we're just gonna get everything. 130 00:08:30,660 --> 00:08:34,960 We're gonna take our practice array, and I'm gonna make a comment here for 131 00:08:34,960 --> 00:08:36,790 us later as we look over this. 132 00:08:36,790 --> 00:08:43,700 Any slicing of ndarray returns a view and not a copy. 133 00:08:45,680 --> 00:08:50,156 Okay, so I'm gonna go ahead, and I'm gonna set not_copied. 134 00:08:50,156 --> 00:08:56,629 I'll set 0, 0, the first one there, 90201, 135 00:08:56,629 --> 00:09:02,020 and we will return practice, not_copied.
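For reference, here is a sketch of the column slices from a moment ago, all applied to rows 2 up to 5 of the practice array. The result names are labels I've added, and the exact cells typed in the video may differ slightly.

```python
import numpy as np

practice = np.arange(42)
practice.shape = (7, 6)  # 7 rows, 6 columns

# Rows 2 up to (not including) 5, and only the column at index 3.
single_column = practice[2:5, 3]

# Same rows, from the 4th column (index 3) until the end.
to_the_end = practice[2:5, 3:]

# Same rows, from index 3 to the end, stepping over every other column.
every_other_column = practice[2:5, 3::2]

print(single_column)
print(to_the_end)
print(every_other_column)
```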
136 00:09:02,020 --> 00:09:04,704 Just so we can see them next to each other. 137 00:09:04,704 --> 00:09:08,771 And we'll see that both of them changed, and 138 00:09:08,771 --> 00:09:13,490 that is because this is a view and not a copy, right? 139 00:09:15,127 --> 00:09:17,660 It's exactly the same. 140 00:09:17,660 --> 00:09:20,198 It changed the original array practice, and 141 00:09:20,198 --> 00:09:24,015 that's because not_copied is actually what is known as a data view. 142 00:09:24,015 --> 00:09:25,114 And as you can tell, 143 00:09:25,114 --> 00:09:30,020 it's kind of hard to know just by looking at the representation of the array. 144 00:09:30,020 --> 00:09:32,550 But views are not brand new arrays. 145 00:09:32,550 --> 00:09:36,360 So one way that you can check to see if you have a view 146 00:09:36,360 --> 00:09:38,560 is to check the base property. 147 00:09:38,560 --> 00:09:43,580 So if we look at this we can say practice.base is None. 148 00:09:46,040 --> 00:09:51,470 And that's true, but if we look at not_copied.base 149 00:09:51,470 --> 00:09:54,830 is None, we'll see that that's false because it is a view. 150 00:09:54,830 --> 00:10:01,659 And then also, that base is set, if you say not_copied.base 151 00:10:01,659 --> 00:10:06,120 is practice, we will see that that's true. 152 00:10:06,120 --> 00:10:09,160 So you can always see which array a view came from. 153 00:10:09,160 --> 00:10:14,052 And there's also a property called flags on array and it works like a dictionary and 154 00:10:14,052 --> 00:10:16,360 one of the keys is called OWNDATA. 155 00:10:16,360 --> 00:10:18,080 So you can check that as well if you wanted to. 156 00:10:18,080 --> 00:10:21,822 You can say practice.flags['OWNDATA']? 157 00:10:23,150 --> 00:10:27,270 And that one should be true and not_copied does not own the data. 158 00:10:29,347 --> 00:10:31,050 It is a view.
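Here's the whole view check in one place, as a sketch. The names on the left of the checks are labels I've added. One detail that matters here: the shape is assigned in place, as in the video, so `practice` still owns its data; building it with `reshape` instead would make `practice` itself a view of the `arange` result.

```python
import numpy as np

practice = np.arange(42)
practice.shape = (7, 6)  # assigned in place, so practice owns its data

# Any slicing of an ndarray returns a view and not a copy
not_copied = practice[:]
not_copied[0, 0] = 90201

# The change shows up in the original array too.
original_changed = practice[0, 0] == 90201

# base is None for an array that owns its data, and set for a view.
practice_has_no_base = practice.base is None
view_base_is_practice = not_copied.base is practice

# The OWNDATA flag tells the same story.
practice_owns_data = practice.flags['OWNDATA']
view_owns_data = not_copied.flags['OWNDATA']

print(original_changed)
print(practice_has_no_base, view_base_is_practice)
print(practice_owns_data, view_owns_data)
```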
159 00:10:31,050 --> 00:10:34,830 So data views are important to understand and be aware of 160 00:10:34,830 --> 00:10:39,100 as you don't want to accidentally modify a structure that you didn't intend to. 161 00:10:39,100 --> 00:10:41,980 Data views are part of the trick of how quickly and 162 00:10:41,980 --> 00:10:44,860 seamlessly you can arrange data in NumPy. 163 00:10:44,860 --> 00:10:47,520 So you will see them used in the wild quite a bit. 164 00:10:47,520 --> 00:10:52,150 Now the one major and not initially intuitive place where data views get 165 00:10:52,150 --> 00:10:55,180 created is in slicing, like we just saw. 166 00:10:55,180 --> 00:11:00,380 So I'd like to make sure that you recall that when you slice an array of any shape, 167 00:11:00,380 --> 00:11:03,850 you are creating a view and not a new array, 168 00:11:03,850 --> 00:11:08,170 or copy, as happens with standard Python lists. 169 00:11:08,170 --> 00:11:10,920 A view references the same values in memory 170 00:11:10,920 --> 00:11:14,560 while a copy would be a brand new space. 171 00:11:14,560 --> 00:11:20,370 Using a slice or view is handy as you can pass only part of the array around for 172 00:11:20,370 --> 00:11:23,360 processing but not require the entire array. 173 00:11:23,360 --> 00:11:26,920 By not creating a new array we are not only saving memory but 174 00:11:26,920 --> 00:11:30,320 we're also allowing a reshaping of the array. 175 00:11:30,320 --> 00:11:33,560 We're allowing portions or slices of the array to be modified. 176 00:11:34,760 --> 00:11:37,570 Let's take a look at some more data view creating functions. 177 00:11:37,570 --> 00:11:40,590 And actually, I'm gonna take some notes on slicing and 178 00:11:40,590 --> 00:11:43,890 slicing multiple-dimensional arrays specifically. 179 00:11:43,890 --> 00:11:47,600 I'm also gonna make sure to document that little data view gotcha. 180 00:11:47,600 --> 00:11:49,780 Let's review that right after this quick break.
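As a rough sketch of the point about passing only part of an array around for processing: the `zero_out` helper below is hypothetical, made up for this example and not something from the video or from NumPy. Because the slice is a view, the function's changes land in the original array.

```python
import numpy as np


def zero_out(block):
    """Hypothetical helper: overwrite every element of the given array in place."""
    block[...] = 0


practice = np.arange(42)
practice.shape = (7, 6)  # 7 rows, 6 columns

# Hand over just a slice; no new array is created for the function to work on.
zero_out(practice[2:5, 3:])

# Only the sliced portion of the original array was modified.
print(practice)
```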