Allow me to introduce you to a powerful new data structure, the array.
Arguably, one of the most important things
to get familiar with in NumPy is the array
0:01
data structure that it introduces.
0:05
Arrays are around in most
programming languages so
0:08
you probably heard the term
floating around before.
0:10
Now don't worry if you haven't,
0:12
you'll be super familiar with
the concept here shortly.
0:13
Python actually has an implementation
in its standard library.
0:16
Arrays are a way of organizing your data.
0:19
An array is a data structure that allows
you to store multiple items in a single
0:21
variable.
0:26
Which, yes, wise Pythonista, sounds
an awful lot like a list, doesn't it?
0:26
But there are some crucial
differences between a list and
0:31
an array, the array is much
more restrictive than a list.
0:34
Which, I realize,
doesn't sound like a great selling point.
0:38
But, in fact,
it's actually these restrictions that make
0:41
arrays the go-to solution for
most Python projects involving data.
0:43
The list, as you know,
can contain just about anything.
0:48
And by anything, I mean the types of
elements in there don't matter, right?
0:51
A single list can contain booleans, and
strings, and any random old object.
0:56
Arrays aren't like that.
1:00
Every single item in the array
must be of the same data type.
1:02
That's the first major difference
between the list and an array.
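Here's a quick sketch of that first difference, assuming NumPy is installed and imported as np. The variable names are made up for illustration. Hand NumPy a mix of number-like values and it settles on one data type for all of them.

```python
import numpy as np

# A list happily mixes types.
mixed_list = [True, "Whatevs", 4.0, object()]
list_types = {type(item).__name__ for item in mixed_list}
print(len(list_types))  # 4 different types in one list

# An array settles on a single dtype for every element.
coerced = np.array([4, 3.286, True])
print(coerced.dtype)  # float64
print(coerced[2])     # 1.0, the boolean was converted to a float
```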
1:05
Now, the second one is that
the array's length cannot be changed.
1:09
Its size is immutable.
1:12
You can't add items to the array.
1:14
Or remove from it.
1:16
You can change the value of individual
items, but the size cannot be changed.
1:17
And it must be defined when it's created.
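A small sketch of that second difference, again assuming NumPy imported as np and using made-up values. Individual items can be changed, but there's no list-style append on the array. The np.append function looks like it grows the array, but it actually hands back a brand new one.

```python
import numpy as np

scores = np.array([4.0, 3.286, 3.5])

# Changing a value in place is fine.
scores[1] = 3.3

# There is no list-style append method on the array itself.
has_append = hasattr(scores, "append")
print(has_append)  # False

# np.append builds a new array; the original keeps its size.
longer = np.append(scores, 4.0)
print(scores.size, longer.size)  # 3 4
```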
1:22
When we work in these higher
level languages like Python,
1:24
we don't really need to think about how
things are working under the covers.
1:27
That's because the language
does a wonderful job
1:31
of abstracting that away from us.
1:34
Right?
1:36
What do you say we take a peek
behind the curtain real quick?
1:37
When I create a new variable named age and
set it to say 29, I wish.
1:41
Well what really has been abstracted away,
is that Python has to go to the memory
1:46
on the computer and
find some free space to store that number.
1:51
Storing that number is, actually,
another abstraction, isn't it?
1:55
It's really just binary, right?
1:58
It's a series of ones and
zeros that gets put in there.
2:00
And then later, when we go to access it
again, our variable name references that
2:02
memory location so it can be used to
go to and read that value from memory.
2:07
And because it knows the size required for
its data type, it can grab the right
2:11
number of 1s and 0s, to have it represent
the original value in our program, 29.
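To make the ones and zeros a little more concrete, here's a simplified sketch in plain Python. It's only an illustration of the idea: Python's own integer objects carry extra bookkeeping, so this is not literally how the interpreter lays out age in memory.

```python
age = 29

# The same value written out as ones and zeros.
print(bin(age))  # 0b11101

# Packed into a single byte, roughly how it could sit in memory.
raw = age.to_bytes(1, byteorder="little")
print(raw)  # b'\x1d'

# Knowing the size (one byte), we can read the bits back into the number.
restored = int.from_bytes(raw, byteorder="little")
print(restored)  # 29
```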
2:16
I feel like I have aged
a year just explaining that.
2:21
There are lower level languages like C,
2:24
where you actually have to do
all the memory management.
2:26
You have to go find the space yourself,
put the value in there and
2:29
clean it up when you
don't need it anymore.
2:32
Now I'm certain that you could
totally do that if you had to.
2:34
But this work has been
abstracted away from you, so
2:37
you can focus on more important things.
2:40
One thing you probably don't think about
often is how awesome these Python lists
2:43
actually are.
2:46
You create them, you append stuff to them,
you remove stuff, you insert stuff.
2:48
That stuff can be any size, so every time
something is added, the same flow of
2:51
finding the proper space with enough room
to hold that data needs to happen.
2:56
And there's even more to
the list abstraction here.
3:01
Each of those whatever-size objects
can all be iterated through, or
3:03
accessed by index.
3:08
It's pretty impressive when you stop
to think about what we've been taking
3:09
for granted.
3:12
There's a lot of effort that is hidden
behind that abstraction to make things
3:13
seamless.
3:17
That effort actually does have a cost.
3:18
Namely, speed.
3:21
It's a lot of overhead, and
3:23
it wouldn't be needed if only you knew
how much space each element required.
3:24
As well as the total number of elements
that would always remain the same.
3:29
Hey, those restrictions sound
pretty familiar, don't they?
3:34
Those are the restrictions I was
talking about that arrays have.
3:37
When the array is defined or
declared, the only type of
3:41
element that it can store is defined, as
well as how many elements will ever exist.
3:44
Therefore, the space to store the whole
array structure is known at creation time.
3:49
What this does,
3:54
is it allows for each element to be
stored next to the following one.
3:55
The elements are contiguous, or
3:59
one right after the other,
with no space in between them.
4:01
This makes finding exactly where an item
lives in memory a very easy math equation.
4:04
Much quicker than accessing
an item in a list.
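That easy math equation can be sketched like this, assuming NumPy imported as np. The array name and values here are just for illustration. Because every element takes the same number of bytes, the position of item number i is simply i times the item size.

```python
import numpy as np

grades = np.array([4.0, 3.286, 3.5, 4.0])

# Each step to the next element is exactly one item size, in bytes.
print(grades.strides)  # (8,)

index = 2
byte_offset = index * grades.itemsize  # 2 * 8 = 16

# Read one float64 straight out of the raw bytes at that offset.
raw = grades.tobytes()
value = np.frombuffer(raw, dtype=grades.dtype, count=1, offset=byte_offset)[0]
print(value)  # 3.5
```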
4:09
Let's go build an array.
4:13
Okay, I'm gonna give us
a little more space.
4:15
I'm gonna click view, and
I'm gonna toggle the header.
4:18
That feels better.
4:20
Let's start by building a list that
we can use for a quick refresher.
4:22
So I want to store some
floating point numbers, okay?
4:26
How about we store my
high school GPA scores?
4:32
For those of you out there
outside where I grew up,
4:36
which is the United States,
GPA stands for Grade Point Average.
4:39
And we are, or at least were at the time,
a little over obsessed with them.
4:43
Now it's an average of all your grades
in all subjects on a scale from one,
4:49
being the lowest,
to four being the highest.
4:54
Okay so let's declare a new list,
4:57
called gpas_as_list.
5:01
And we'll use a hard bracket literal,
for our list.
5:06
I'm doing this for high school.
5:10
Now this is the part of
school before college.
5:12
It's called secondary school elsewhere.
5:14
Okay, so what did I earn?
5:16
I don't remember really, I've forgotten
it cuz I didn't really use these things.
5:19
So our brains tend to forget information
that we don't use, by design.
5:25
Well, I know I started off strong.
5:30
So I'm going to give myself a,
5:32
let's give me a 4.0.
5:36
And then let's see.
5:40
Math started to get a little weird.
5:41
Remember geometry and whatnot.
5:43
Man, and chemistry.
5:45
I must have dropped a bit.
5:47
Let's do 3.2.
5:49
Well shucks, I'll give myself some more
precision there, so let's do 3.286.
5:52
That's better.
5:57
And then we had some trigonometry.
It was still pretty rough.
5:59
But I started building
out that math foundation.
6:04
So let's say 3.5.
6:06
Let's not even talk about senior
year's physics and calculus.
6:09
Okay, fine, I'll add it, and
6:14
I can because a list's size can be mutated,
right?
6:17
So I can just append to it.
6:23
Let's say 4.0.
6:26
I did have a hard time with calculus, but
eventually it all started making sense.
6:29
Once I got that graphing
calculator working.
6:34
And I haven't thought about
that sweet TI-82 since then.
6:37
I'm gonna make a little note
here about this list, too.
6:41
So we'll put a comment in here.
6:43
We'll say,
can have elements appended to it.
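Typed out, the cell so far probably looks something like this. The values are the ones mentioned above. The exact layout of the cell is a guess.

```python
# Can have elements appended to it
gpas_as_list = [4.0, 3.286, 3.5]
gpas_as_list.append(4.0)
print(gpas_as_list)  # [4.0, 3.286, 3.5, 4.0]
```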
6:45
So on top of appending, I can also
insert into the list wherever I want.
6:51
And, also, we can have multiple
data types in this list, right?
6:57
So, let's do gpas_as_list.insert
because we can put them anywhere.
7:05
And at the first position,
we're gonna insert "Whatevs",
7:10
cuz that's a string in there,
so that's kinda strange.
7:15
And I can pop it right out, right?
7:18
So, I can say, Can have items removed.
7:19
So we can say gpas_as_list.pop(1).
7:23
And I'm gonna run this and we'll see
that we got back this last line here,
7:29
this pop, "Whatevs", right?
7:34
So we can stick whatever in there.
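Here's a sketch of the insert and pop steps. The index passed to insert isn't spelled out, so index 1 is an assumption here, chosen because pop(1) is what returns the string.

```python
gpas_as_list = [4.0, 3.286, 3.5, 4.0]

# Can have multiple data types
# (index 1 is an assumption, matching the pop(1) call below)
gpas_as_list.insert(1, "Whatevs")

# Can have items removed
removed = gpas_as_list.pop(1)
print(removed)       # Whatevs
print(gpas_as_list)  # [4.0, 3.286, 3.5, 4.0]
```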
7:36
So let's make sure we got everything
looking like we want, awesome.
7:38
Now let's create a NumPy array.
7:45
A common way to create an array is to
pass the array function an iterable,
7:48
and we have one of those.
7:53
So let's go ahead.
7:54
We'll make a brand new array, and
7:55
we're going to call it gpas =
8:00
np.array(gpas_as_list).
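As a self-contained sketch, assuming NumPy has been imported as np (the usual convention) and the list holds the four values from earlier:

```python
import numpy as np

gpas_as_list = [4.0, 3.286, 3.5, 4.0]

# Pass the array function an iterable to build the array.
gpas = np.array(gpas_as_list)
print(type(gpas).__name__)  # ndarray
print(gpas.shape)           # (4,)
```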
8:04
So now that we have one,
let's see what we can do with it.
8:09
One of my favorite things
about Jupyter Notebooks,
8:13
is that it lets you peep
the documentation on an object.
8:15
If I just say ?gpas.
8:18
Nice, so this is an ndarray.
8:22
Some documentation here, here's some
parameters, here's dtype, the data type.
8:25
And here's some attributes for it.
8:31
So dtype, dtype describes the format
of the elements in the array.
8:33
Awesome, so let's just go ahead and
see what we have going on here.
8:37
So if we say gpas.dtype, float64.
8:40
64 there, stands for the number of bits.
8:45
Remember, the 1s and 0s that
are required to store the datatype.
8:49
Speaking of which,
looks like we also have itemsize,
8:54
the memory use of each
array element in bytes.
8:59
So we can say gpas.itemsize.
9:01
8, 8 bytes, that makes sense, right?
9:07
8 bits are in 1 byte,
and our datatype, float64, takes 64 bits.
9:11
So, 8 bytes is 64 bits.
9:18
8 times 8.
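The bits and bytes arithmetic can be checked with a short sketch like this one, assuming the same four values and NumPy imported as np. The bits_per_item name is made up for the example.

```python
import numpy as np

gpas = np.array([4.0, 3.286, 3.5, 4.0])

print(gpas.dtype)     # float64
print(gpas.itemsize)  # 8 (bytes per element)

# 8 bits in a byte, so 8 bytes is 64 bits.
bits_per_item = gpas.itemsize * 8
print(bits_per_item)  # 64
```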
9:20
All right, so
what else do we have in here?
9:22
Size, that's a great one,
so let's take a look.
9:28
So we have gpas.size, 4.
9:31
And NumPy does a pretty
good job of being Pythonic.
9:35
Which means most objects just
work like you think they should.
9:39
Like if I wanted to know the length,
9:42
I should just be able to use
the len function, right?
9:44
So we could say len(gpas).
9:47
And while not the best naming,
9:50
we can also get the number of bytes
that are required for the data.
9:52
So if I say gpas.nbytes, we'll see 32.
9:56
Which of course is just the length,
4, times the item size, which is 8.
10:03
So 32, but it's still nice to have.
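Here are those last three side by side, as a sketch with the same four values and NumPy imported as np. One thing to keep in mind: for a one-dimensional array like this, len and size agree. For arrays with more dimensions, len only counts along the first one.

```python
import numpy as np

gpas = np.array([4.0, 3.286, 3.5, 4.0])

print(gpas.size)    # 4
print(len(gpas))    # 4
print(gpas.nbytes)  # 32

# nbytes is just the number of elements times the bytes per element.
total_bytes = gpas.size * gpas.itemsize
print(total_bytes)  # 32
```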
10:10
Okay, I just threw a ton
of information at you.
10:14
So let's go ahead and
take a quick breather.
10:19
But before we do that, let's do this.
10:22
Let's do a little exercise of
writing up a nice summary.
10:25
You know,
try to capture our train of thought.
10:29
We found that reflecting
on what you just learned
10:31
will help you when you
come back to it later.
10:35
I'm gonna come up here, so I'm gonna
get in command mode by pressing escape.
10:38
See how it's blue?
10:41
I'm going to come up here, and I'm going
to go right above this gpas list here.
10:42
And I'm going to make a new cell.
10:47
So I'm going to make a new
cell above this, so A.
10:50
I'm going to press the A key,
there's a new cell above it.
10:53
I'm going to press Enter, and
I'm going to switch this to markdown.
10:55
And I'm going to write down some notes
here, so I'll say ## for second heading.
11:00
We'll say Differences between lists and
NumPy Arrays.
11:06
And during the break,
I'm gonna do some reflection.
11:14
And I'll share with you what I
jotted down in the next video.
11:17
But I'd like for
you to give that a shot, too.
11:21
Treat it free-form,
just jot down some thoughts.
11:23
And like I said, I'll share mine
with you in the next video.
11:26
Sound good?
11:29
Okay, so do that.
11:30
And when we come back,
we'll start our 100 Days of Code Study Log.
11:31