You can create an array of booleans and then use that to index into your array. Let's use this to filter our values.
My Notes for Indexing
## Creation
* You can create a random but bounded grouping of values using the `np.random` module.
* `RandomState` lets you seed your randomness in a way that is repeatable.
* You can append a row in a couple of ways:
  * You can use the `np.append` method. Make sure the new row is the same shape.
  * You can create/reassign a new array by including the existing array as part of the iterable in creation.
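As a quick refresher, the creation notes can be sketched roughly like this. The seed, the bounds, and the small 2 x 4 `study_minutes` array are assumptions made up for illustration; they are not the values used in the video.

```python
import numpy as np

# Seeding RandomState makes the "random" values repeatable.
rand = np.random.RandomState(42)              # 42 is an arbitrary, assumed seed
fake_log = rand.randint(30, 180, size=100)    # assumed bounds; the high end is exclusive

# A second generator with the same seed produces the same values.
same_log = np.random.RandomState(42).randint(30, 180, size=100)

# Hypothetical 2 x 4 log: two rounds, four days each.
study_minutes = np.zeros((2, 4), dtype=np.uint16)
new_row = np.array([60, 45, 0, 90], dtype=np.uint16)

# Option 1: np.append. Wrapping new_row in a list makes it 1 x 4,
# so it lines up with the existing rows along axis 0.
appended = np.append(study_minutes, [new_row], axis=0)

# Option 2: build a new array from the existing rows plus the new one.
rebuilt = np.array([study_minutes[0], study_minutes[1], new_row])
```

Both options return a new array; the original `study_minutes` keeps its size, which is why the result has to be reassigned.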
## Indexing
* You can use an indexing shortcut by separating dimensions with a comma.
* You can index using a `list` or `np.array`. Values will be pulled out at that specific index. This is known as fancy indexing.
* Resulting array shape matches the index array layout. Be careful to distinguish between the tuple shortcut and fancy indexing.
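The indexing notes might look something like this in practice. This is a minimal sketch; `grid` and `values` are hypothetical arrays chosen so the results are easy to check by eye.

```python
import numpy as np

# Hypothetical 4 x 5 grid holding 0..19, so each value reveals its position.
grid = np.arange(20).reshape(4, 5)

# Comma shortcut: the index is the tuple (3, 4), meaning row 3, column 4.
shortcut = grid[3, 4]
chained = grid[3][4]   # same element, using two sets of brackets

# Fancy indexing: a list (or array) of indices pulls out those positions.
values = np.array([10, 20, 30, 40, 50])
picked = values[[0, 2, 4]]

# The result takes the shape of the index array, here 2 x 2.
layout = values[np.array([[0, 1], [3, 4]])]

# A list is not a tuple: grid[[3, 1]] selects rows 3 and 1 (shape 2 x 5),
# while grid[3, 1] selects the single element at row 3, column 1.
two_rows = grid[[3, 1]]
one_value = grid[3, 1]
```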
All right, before we get started here,
0:00
I thought I'd share my
notes as a quick refresher.
0:02
We talked about creation and we saw a new
way to build a random grouping of numbers.
0:05
And we used RandomState to let us seed the
randomness in a way that was repeatable.
0:10
I have the same random values as you do.
0:14
That is super handy.
0:16
And we also learned that you can
append a row in a couple of ways.
0:18
There's even more that
we haven't seen here.
0:21
You can use the np.append method.
0:23
You need to make sure
that it's the same shape.
0:26
Remember we did that little hack
where we wrapped it in a list.
0:28
And you can create and
0:31
reassign a new array by including existing
arrays as part of the iterable right?
0:33
So you can throw the new
array in there and
0:37
reassign it cuz you can't change the size.
0:39
And we looked at indexing.
0:42
And there's that nice indexing shortcut
for the multidimensional array,
0:43
remember where you can use the comma.
0:46
So you can say like 3,4 and
that's really row three, column four.
0:48
And instead of having to use the hard
brackets you can use the commas and
0:53
it creates a tuple automatically.
0:55
And you can also index using a list or
another np array and the values will be
0:57
pulled out in that specific index,
and that's known as fancy indexing.
1:02
The resulting array comes back and
it's the same shape as what you asked for.
1:06
But it's very important to
remember to look for lists or
1:10
arrays versus just using
a tuple with a comma.
1:13
All right, so now where were we?
1:18
All right,
we wanted to look at our study log and
1:20
find days that were just about an hour,
but not quite.
1:23
So let's get back down into
where we did all that work, so.
1:28
So here's the study minutes,
I'm gonna get rid of this last one here.
1:31
To delete a cell, Escape, and then D, D.
1:35
Here we go, so we have our study
minutes array is all written out.
1:38
And remember we were using this fake_log.
1:41
So let's start with that fake_log,
1:44
cuz that definitely has some
values that we know are under 60.
1:45
And remember,
that's what we're looking for
1:50
because they don't count
towards the challenge.
1:51
The concept being there, that is if we
saw those ones that almost made it,
1:53
maybe they'd provide a little inspiration
for us to stick with it for that next day.
1:58
So the np array object is pretty powerful.
2:02
Just about every comparison
operator has been overridden.
2:07
So let's take that fake_log object that
we're using, cuz it's one-dimensional.
2:11
So it's a one-dimensional array, and
it's filled with 100 random values.
2:15
So we'll do fake_log, and check this out.
2:19
I am looking for
values that are less than 60,
2:25
because you know there
are 60 minutes in an hour.
2:27
So I can just write that.
2:29
So fake_log < 60.
2:31
And what will happen is, oh,
it's not defined,
2:33
this might happen to you sometimes.
2:38
So this is good,
I'm glad that this happened.
2:40
So what we can do is we can go ahead and
run Kernel, and say Restart & Run All.
2:41
And that popped up our help
cuz we left the help up there.
2:52
So I'm gonna go ahead and close this.
2:54
And so
what happened is we've got this fake_log.
2:55
And what will happen is you
remember that we have these values.
2:59
So the first one that's there is
this fourth, so the fourth value.
3:03
So if we go False, False, False, True and
then looks like again at the eighth there,
3:07
so there's some more false,
false, false, true.
3:12
So what's happening is that it's comparing
every single one of these values and
3:14
it's showing us true where it is.
3:19
Every value is represented.
3:21
And any place that we see a True,
3:23
it means that it is true
that it's less than 60.
3:25
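Here is a small sketch of that comparison. The eight-value `fake_log` is a hypothetical stand-in (the one in the video holds 100 random values), picked so that the fourth and eighth entries are the ones under 60.

```python
import numpy as np

# Hypothetical stand-in for fake_log.
fake_log = np.array([75, 90, 61, 45, 120, 60, 88, 30])

# The comparison runs element by element and returns a Boolean array
# of the same shape. Note that 60 itself is not less than 60.
under_an_hour = fake_log < 60

print(under_an_hour.tolist())
# [False, False, False, True, False, False, False, True]
```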
And that probably doesn't
seem all that handy.
3:29
Well, that is until you find out
that you can do fancy indexing with
3:32
a Boolean array.
3:36
The way that it works is that as long
as the Boolean array lines up with your
3:38
other array,
any value where True exists will be kept.
3:43
So, here check this out.
3:46
So this is what we want, right?
3:48
We want to say,
anything from the fake_log,
3:49
we will use that Boolean
array as a fancy index.
3:55
There we go, we pulled out
every value that was True.
4:00
That's exactly what we are looking for,
right, these are all not quite 60.
4:05
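Using the Boolean array as an index could look like this. Again, the short `fake_log` is an assumed example, not the data from the video.

```python
import numpy as np

# Hypothetical stand-in for fake_log.
fake_log = np.array([75, 90, 61, 45, 120, 60, 88, 30])

# Build the Boolean array, then use it as the index.
mask = fake_log < 60
almost = fake_log[mask]

# Most of the time you will see it written in one step.
also_almost = fake_log[fake_log < 60]

print(almost.tolist())
# [45, 30]
```

Only the positions where the mask is True are kept, in their original order.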
Pretty cool, right?
4:10
We did that filtering all without a loop.
4:11
You could totally accomplish this same
thing by saying something like a list
4:14
comprehension, or even something similar
like this, this really simple loop.
4:19
So say results equals this,
let's iterate through each of the values.
4:22
So for value in fake_log, if, here we go.
4:26
If the value is less than 60, then
we're gonna say results.append(value).
4:31
And then just to get back exactly
the same thing we'll just use it.
4:38
We'll say np.array(results), right?
4:41
So there's a loop that we had to write,
and obviously we got back the same thing.
4:43
But using a Boolean array index is orders
of magnitude faster than this for loop.
4:49
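For comparison, here are the loop, a list comprehension, and the Boolean index side by side. This sketch only checks that all three agree on a small hypothetical `fake_log`; it does not measure the speed difference.

```python
import numpy as np

# Hypothetical stand-in for fake_log.
fake_log = np.array([75, 90, 61, 45, 120, 60, 88, 30])

# The explicit loop.
results = []
for value in fake_log:
    if value < 60:
        results.append(value)
looped = np.array(results)

# The same filter as a list comprehension.
comprehended = np.array([value for value in fake_log if value < 60])

# The Boolean array index: no Python-level loop at all.
masked = fake_log[fake_log < 60]
```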
And look at the code difference too.
4:55
Something you might be wondering is what
happens with multidimensional arrays,
4:57
like our study_minutes array.
5:02
Well the good news is, it just works.
5:04
So if we say study_minutes < 60,
you'll see back
5:07
that we get an array, a Boolean array that
is of the exact same shape as our array.
5:12
So that's 3 by 100.
5:17
And of course,
we can use that array as an index.
5:21
So let's do that as well, so
we can say study_minutes,
5:26
where the study_minutes is less than 60.
5:30
Boom, now notice that we're
returned a one dimensional array.
5:35
Not our original two-dimensional array,
it's all of the values that match.
5:42
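A sketch of the same idea in two dimensions. The 3 x 4 `study_minutes` here is a made-up, smaller version of the 3 by 100 array from the video.

```python
import numpy as np

# Hypothetical 3 x 4 log: three rounds, four days each.
study_minutes = np.array([
    [90,  45,  0, 120],
    [60,  58, 75,   0],
    [30, 150,  0,  61],
])

# The Boolean array has exactly the same shape as the original.
mask = study_minutes < 60
print(mask.shape)
# (3, 4)

# Indexing with it flattens the matches into one dimension, row by row.
matches = study_minutes[mask]
print(matches.tolist())
# [45, 0, 58, 0, 30, 0]
```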
Now we could rewrite this as a nested for
5:46
loop of the same type
that we did before, right.
5:48
Like we could loop through each round and
then loop through each day and
5:51
add to our results.
5:55
But we don't need to do that because
this is done all for us without a loop.
5:56
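For the curious, the nested loop version might look something like this, using a hypothetical 3 x 4 `study_minutes`. It gives the same answer as the Boolean index, just with more code.

```python
import numpy as np

# Hypothetical 3 x 4 log: three rounds, four days each.
study_minutes = np.array([
    [90,  45,  0, 120],
    [60,  58, 75,   0],
    [30, 150,  0,  61],
])

results = []
for round_minutes in study_minutes:   # each row is one round
    for minutes in round_minutes:     # each entry is one day
        if minutes < 60:
            results.append(minutes)
looped = np.array(results)

# The loop-free equivalent.
masked = study_minutes[study_minutes < 60]
```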
That's kind of gross, that's
a bunch of zeros, right?
6:02
If we're looking to motivate ourselves and
we really don't wanna see these zeros.
6:06
What we really wanna see is anything
that's less than 60 minutes and
6:10
greater than 0.
6:16
That gets minutes from days where
we worked a little bit at least.
6:17
So we want to make two
Boolean index arrays.
6:21
Like we wanna make this
study_minutes array, this one.
6:24
We wanna make that array,
the study_minutes where it's less than 60.
6:28
And we also wanna have another
one where the index array is
6:32
study_minutes greater than 0.
6:36
And then we actually want to have the
results where it's a combination of those
6:39
anded together.
6:43
You could actually compare arrays
together element by element,
6:44
which is what we want to do.
6:48
So, I'm gonna come back here.
6:50
Let's just manually, we'll go ahead and
6:51
we'll manually create an array,
a Boolean array of False, True, True.
6:54
And to compare, we use the bitwise
operator for and, the ampersand.
7:00
Now this is not the and
keyword, it's an ampersand.
7:07
Now a common mistake is, [LAUGH] to
forget and use the and keyword, and
7:11
we'll explore what happens
there in a bit.
7:14
And then I'll create another Boolean array
that we can compare it to, so np.array,
7:17
and we'll put in True, False, True.
7:21
So what happens is we get a brand new
array with each element anded together.
7:27
So remember,
when you're checking Boolean logic,
7:33
both sides need to be true
to be considered true.
7:35
So, looking here we have False and
True, and that's False,
7:39
and then we have True and False.
7:44
And that of course is False as well
because they're not both True, and
7:46
then we have True and
True, definitely True.
7:50
So if we go ahead and we run this,
we'll see that we get back a single
7:54
array with the values anded together,
False, False, True, just like we saw.
7:59
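The two hand-built Boolean arrays from this step, anded together. The variable names `first` and `second` are just labels for this sketch; `np.logical_and` is shown as the spelled-out function that gives the same result.

```python
import numpy as np

first = np.array([False, True, True])
second = np.array([True, False, True])

# & compares the arrays element by element.
both = first & second
print(both.tolist())
# [False, False, True]

# The function form gives the same result.
same = np.logical_and(first, second)
```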
So we could use this result as
a Boolean index array, right?
8:04
Do you see how we can just build
the Boolean index array together?
8:10
Values that we want to chain together with
all of the other conditions in a series of
8:13
ands and ors?
8:17
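Ors work the same way with the pipe character, and the tilde flips a Boolean array. A small sketch with a hypothetical `values` array; each comparison is wrapped in parentheses so it is evaluated before the `|`.

```python
import numpy as np

# Hypothetical minutes for five days.
values = np.array([0, 30, 58, 60, 90])

# | is the element-by-element "or".
zero_or_full = (values == 0) | (values >= 60)
print(values[zero_or_full].tolist())
# [0, 60, 90]

# ~ is the element-by-element "not", so this keeps everything else.
partial = values[~zero_or_full]
print(partial.tolist())
# [30, 58]
```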
Before we use it, I do wanna show you what
happens if you forget to use the bitwise
8:18
and, as the resulting error is
a little confusing at first.
8:22
So depending on how many times you
have joined logical expressions,
8:25
your muscle memory might actually
accidentally type the and keyword here.
8:28
So let's do that,
let's put the and keyword here.
8:33
Yikes, ValueError, and
8:34
it's saying the truth value of an array
with more than one element is ambiguous.
8:39
So, what it's trying to do is it's trying
to figure out a truthiness of this, and
8:44
that's what and does.
8:49
It creates a truthiness, and it's
assuming that we wanna have a scalar
8:51
value, which is not what we want,
we wanna compare element by element.
8:55
So if you did wanna get a scalar value,
9:00
if you wanted to see that everything
was True, you would use a.all(), and
9:02
that returns a Boolean, or
a.any() if there's any True in there at all.
9:05
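Here is roughly what that mistake looks like, with the error caught so the script keeps running. The names `a` and `b` are placeholders for any two Boolean arrays.

```python
import numpy as np

a = np.array([False, True, True])
b = np.array([True, False, True])

# The and keyword asks for a single truth value for the whole array,
# which NumPy refuses to guess.
try:
    a and b
except ValueError as err:
    message = str(err)
    print(message)

# If a single Boolean really is what you want, ask for it explicitly.
everything_true = a.all()   # False: not every element is True
anything_true = a.any()     # True: at least one element is True

# For element-by-element logic, use the bitwise operator.
combined = a & b
```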
All that to say,
just use the bitwise operation.
9:08
So just go ahead,
use a bitwise operation.
9:11
I just thought I'd preemptively warn
you about this, as it happens a lot,
9:14
more in the teacher's notes.
9:18
[LAUGH] So let's build up our index,
so we wanna have study_minutes,
9:20
Where the study_minutes,
9:26
Are < 60 & study_minutes > 0,
9:31
right, that's what we're looking for.
9:35
But we wanna take caution to make sure
that we're careful about the order of
9:41
operations.
9:44
This & here binds more tightly
than the less than.
9:46
So what gets evaluated first is 60 &
study_minutes.
9:49
And again, we're gonna run into
the truthy problem that we saw before.
9:52
So, we don't want that.
9:57
So let's put parentheses in place
just to make sure we've got the order
9:58
of operations correct.
10:02
And voila, there we have it.
10:08
A brand new array containing entries
that represent values from our
10:11
study_minutes array,
that are less than 60 and greater than 0.
10:16
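A sketch of both versions, with and without the parentheses. The 3 x 4 `study_minutes` is a hypothetical stand-in for the real log.

```python
import numpy as np

# Hypothetical 3 x 4 log: three rounds, four days each.
study_minutes = np.array([
    [90,  45,  0, 120],
    [60,  58, 75,   0],
    [30, 150,  0,  61],
])

# Without parentheses, & grabs 60 and study_minutes first. What is left
# is a chained comparison, which needs a single truth value and fails.
try:
    study_minutes[study_minutes < 60 & study_minutes > 0]
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison is evaluated first, then anded.
almost = study_minutes[(study_minutes < 60) & (study_minutes > 0)]
print(almost.tolist())
# [45, 58, 30]
```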
That's pretty cool, right?
10:21
And you can see, you can pretty
much read that more or less, right?
10:22
You'll get used to remembering
to use the parens and
10:25
ampersand, but
I guarantee you'll forget sometimes.
10:28
Now, one thing we really
should consider is this.
10:32
Even though we did those minutes, these
are minutes here that we spent some time.
10:36
They don't actually count for
completing the challenge.
10:40
The challenge is to do
at least an hour a day.
10:42
So in reality,
we really should set all of these to zero.
10:45
If deleting these minutes doesn't
motivate me, I don't know what will,
10:51
especially this 58 minutes.
10:55
Now even though this index statement,
this study_minutes[
10:59
study_minutes < 60],
11:06
now even though that
creates a brand new array,
11:09
if you assign to it, you can do an update.
11:14
And if we look now,
we look at our third row there,
11:19
we'll see that we have some zeros
in where they were not before.
11:24
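The update could be sketched like this, once more on a hypothetical 3 x 4 `study_minutes`. When the Boolean index sits on the left side of an assignment, NumPy writes into the matching positions of the original array rather than into a copy.

```python
import numpy as np

# Hypothetical 3 x 4 log: three rounds, four days each.
study_minutes = np.array([
    [90,  45,  0, 120],
    [60,  58, 75,   0],
    [30, 150,  0,  61],
])

# Every day under an hour gets zeroed out, in place.
study_minutes[study_minutes < 60] = 0

print(study_minutes.tolist())
# [[90, 0, 0, 120], [60, 0, 75, 0], [0, 150, 0, 61]]
```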
You guys look at those,
11:30
all that time didn't count
because I didn't reach that hour.
11:31
No, now of course that
time did actually count.
11:35
I was learning, but
it didn't count towards the challenge.
11:39
And I'll tell you what, this 100 days
of code challenge totally motivates me.
11:43
So losing that time definitely will
keep me focused in the future.
11:47
It reminds me that I just
need to stick with it,
11:50
I want to complete this challenge.
11:53
Speaking of challenges, I'd like to again
challenge you to capture your thoughts
11:56
on Boolean array indexing
in your notebook.
12:00
Remember to think through the possible
gotchas that we walked through.
12:03
Like, accidentally using the and
keyword or forgetting to use parentheses.
12:06
If you've ever done SQL programming
before, that might have felt familiar.
12:10
Capture those thoughts a bit.
12:14
Also, now is a good time to take
a moment and review your notebook.
12:16
Is everything in there clear?
12:20
If not, please hit up the community and
ask your questions.
12:21
If you are looking to
solidify your knowledge,
12:25
I highly recommend attempting
to answer someone else's questions.
12:27
I can't recommend it enough,
by taking the time to explain a concept,
12:31
you will uncover new knowledge.
12:35
Give it a shot, it won't disappoint.
12:37
So far what we've been doing
is returning a new array.
12:39
But you can actually return a view
of the data that you can manipulate.
12:43
Let's take a look at data views and
some more powerful slicing features next.
12:46