You have completed Introduction to Data Visualization with Matplotlib!
Let’s utilize a scatter plot to see what correlations, if any, there are between the sepal length and width based on the variety of iris.
Color Dict
colors = {"Iris-setosa": "#2B5B84", "Iris-versicolor": "g", "Iris-virginica": "purple"}
Correlations
- Positive Correlation: as one variable increases, so does the other. Height and shoe size are an example; as height increases, so does shoe size.
- Negative Correlation: as one variable increases, the other decreases. Time spent studying and time spent on video games are negatively correlated; as your time studying increases, time spent on video games decreases.
- No Correlation: there is no apparent relationship between the variables. Video game scores and shoe size appear to have no correlation; as one increases, the other one is not affected.
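To make the three cases concrete, here is a minimal sketch that computes Pearson's correlation coefficient in plain Python. The helper name `pearson_r` and all of the sample numbers are made up for illustration; they are not part of the course data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's r for two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up numbers, chosen only to illustrate each case.
heights = [150, 160, 170, 180, 190]
shoe_sizes = [36, 38, 41, 43, 46]     # rises with height
study_hours = [1, 2, 3, 4, 5]
gaming_hours = [5, 4, 3, 2, 1]        # falls as study time rises
game_scores = [10, 30, 20, 30, 10]    # no linear trend against height

positive = pearson_r(heights, shoe_sizes)        # close to +1
negative = pearson_r(study_hours, gaming_hours)  # close to -1
no_corr = pearson_r(heights, game_scores)        # close to 0
```

A value near +1 indicates a positive correlation, a value near -1 a negative one, and a value near 0 no linear relationship.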
We left off with our data in a list,
just waiting to be used.
0:00
Let's utilize a scatter plot to
see what correlations, if any,
0:04
there are between the sepal length and
width, based on the variety of iris.
0:07
[SOUND] Recall that scatter plots are used
to show how much one variable is impacted
0:12
by another, or its correlation.
0:16
We use scatter plots to show
relationships between values.
0:18
In this case, our sepal length and width.
0:22
The scatter plot allows us to quickly
visualize the distribution of the data and
0:24
notice any outliers.
0:28
We can see if there’s a positive,
negative, or
0:30
nonexistent correlation between our
data based on the scatter plot results.
0:33
Let's jump back into our Python code and
develop our chart.
0:38
Let's get started from
the previous video's code and
0:42
rename this notebook iris_scatter.
0:46
Since we'll be starting
our work with plots now,
0:58
we'll need to import matplotlib.pyplot
into our project.
1:02
import matplotlib.pyplot as plt.
1:06
Let's create a dict for
1:12
our marker colors that we can use as
we loop through our list information.
1:13
The colors allow us to see the different
iris classes more easily, and
1:18
visualize a third
variable in our data set.
1:22
I'll paste that dict in here.
1:25
I have included a copy of
it in the teacher's notes.
1:27
Our colors then are the blue hue for
setosa.
1:38
Green, that short code, for versicolor.
1:42
And purple for virginica.
1:45
Our list of iris data also includes an
extra item that we don't need at the end,
1:48
so let's pop that off.
1:52
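A minimal sketch of that step, assuming the list was built by reading the CSV file row by row and that the extra item is an empty row left by a trailing blank line. The video does not say what the item is, so treat that as an assumption. The name `iris_data` and the sample rows are illustrative.

```python
# Illustrative rows in the iris CSV layout:
# sepal length, sepal width, petal length, petal width, class.
iris_data = [
    ["5.1", "3.5", "1.4", "0.2", "Iris-setosa"],
    ["7.0", "3.2", "4.7", "1.4", "Iris-versicolor"],
    ["6.3", "3.3", "6.0", "2.5", "Iris-virginica"],
    [],  # assumed leftover from a trailing blank line in the file
]

# pop() with no argument removes and returns the last item of the list.
removed = iris_data.pop()
```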
Now we'll want to loop through our array,
and assign our x and
1:57
y-values to the sepal length and width.
2:01
These are located in the first and
second columns of our array, respectively.
2:04
We can use a function in the itertools
library called groupby that allows us to
2:09
easily do that.
2:14
Let's add that import first and
I'll show you the code.
2:15
From itertools import groupby.
2:23
If you haven't used itertools,
it is a module that provides functions for
2:30
efficient looping.
2:34
Check the teacher's notes for
additional information.
2:36
So to start this, for species,
2:38
group in groupby,
2:45
Each group is an iterator, so
you can only go over it one time.
2:54
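Here is a small sketch of how `groupby` behaves, assuming each row keeps the species name in its fifth column (index 4). The sample rows are illustrative. Note that `groupby` only groups consecutive rows with the same key, so the rows need to already be ordered by species.

```python
from itertools import groupby

# Illustrative rows, already ordered by species.
rows = [
    ["5.1", "3.5", "1.4", "0.2", "Iris-setosa"],
    ["4.9", "3.0", "1.4", "0.2", "Iris-setosa"],
    ["7.0", "3.2", "4.7", "1.4", "Iris-versicolor"],
    ["6.3", "3.3", "6.0", "2.5", "Iris-virginica"],
]

counts = {}
for species, group in groupby(rows, key=lambda row: row[4]):
    members = list(group)   # first pass consumes the group
    leftover = list(group)  # second pass finds nothing left
    counts[species] = (len(members), len(leftover))
```

Because the second pass over `group` comes back empty, it is safest to copy a group into a list whenever more than one column needs to be read from it.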
And then we'll get our sepal length.
3:09
It's gonna be the float value.
3:16
Sepal widths, similar.
3:32
Then we assign that to plt.scatter.
3:51
sepal_lengths for our x-value, sepal_widths for
our y-value there.
3:55
We'll assign it a marker size of 10.
4:05
c, for the color, will come from our
colors dict, keyed by the species.
4:10
And we'll label based on species as well.
4:17
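Put together, the loop might look like the sketch below. It is an approximation of what is typed in the video, not a verbatim copy: the sample rows, the `rows` variable, and the `key` function are assumptions, and the `matplotlib.use("Agg")` line is only there so the sketch runs outside a notebook.

```python
from itertools import groupby

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

colors = {"Iris-setosa": "#2B5B84", "Iris-versicolor": "g", "Iris-virginica": "purple"}

# Illustrative rows, already ordered by species.
iris_data = [
    ["5.1", "3.5", "1.4", "0.2", "Iris-setosa"],
    ["4.9", "3.0", "1.4", "0.2", "Iris-setosa"],
    ["7.0", "3.2", "4.7", "1.4", "Iris-versicolor"],
    ["6.4", "3.2", "4.5", "1.5", "Iris-versicolor"],
    ["6.3", "3.3", "6.0", "2.5", "Iris-virginica"],
    ["5.8", "2.7", "5.1", "1.9", "Iris-virginica"],
]

for species, group in groupby(iris_data, key=lambda row: row[4]):
    rows = list(group)  # copy the one-shot group so both columns can be read
    sepal_lengths = [float(row[0]) for row in rows]  # first column
    sepal_widths = [float(row[1]) for row in rows]   # second column
    # One scatter call per species: s is the marker size, c the color.
    plt.scatter(sepal_lengths, sepal_widths, s=10, c=colors[species], label=species)
```

Calling `plt.scatter` once per species gives each variety its own color and its own legend entry.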
Now, before we call plt.show,
let's add a plot title, axes labels,
4:22
and legend to our chart here
to add context to our data.
4:27
This is an important thing to remember.
4:31
Always label your axes,
legends, and charts.
4:34
plt.title, Fisher's Iris Data Set.
4:38
We'll give that a fontsize of 12.
4:47
Bring that up a little bit.
4:54
Our xlabel.
4:58
These are our sepal
lengths in centimeters.
5:01
We’ll assign that a fontsize of 10.
5:07
For our ylabel,
these are our sepal widths.
5:11
Again, in centimeters, and
we'll give that a fontsize of 10 as well.
5:16
We'll call plt.legend, and
5:27
we'll give this a location
in the upper right.
5:31
Here we are setting the legend location
to be displayed in the upper right
5:37
of the chart.
5:40
But we could display it in the upper left,
upper center, lower left, etc.
5:41
Since there aren't any data points
being displayed in the upper right,
5:46
that seems like a good position.
5:50
Now we just call plt.show.
5:52
And run our cell.
5:58
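The labeling steps can be sketched as follows. The exact label strings are assumptions based on the narration, and the two `plt.scatter` calls use illustrative points so the example stands on its own. The `matplotlib.use("Agg")` line is only needed when running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Illustrative points standing in for the per-species scatter calls.
plt.scatter([5.1, 4.9], [3.5, 3.0], s=10, c="#2B5B84", label="Iris-setosa")
plt.scatter([7.0, 6.4], [3.2, 3.2], s=10, c="g", label="Iris-versicolor")

plt.title("Fisher's Iris Data Set", fontsize=12)
plt.xlabel("Sepal length (cm)", fontsize=10)  # assumed wording
plt.ylabel("Sepal width (cm)", fontsize=10)   # assumed wording
plt.legend(loc="upper right")

ax = plt.gca()  # keep a handle on the axes for later inspection
plt.show()      # displays the chart in a notebook
```

Other accepted `loc` values include "upper left", "upper center", and "lower left".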
We can see some patterns
here in our sepal data.
6:05
Iris-setosa is a pretty good grouping in
the upper left quadrant of our chart.
6:07
There are some outliers though.
6:12
The other two varieties seem
to be clumped together and
6:14
intermixed with some
even greater outliers.
6:17
Our plot looks a bit small here, though.
6:20
Let's assign a size to our figure
to make it a bit easier to see.
6:22
To do that, go up here,
6:26
right under input_file,
that's kind of a standard spot for it.
6:30
We attach something to the figure object,
figsize.
6:36
7.5, 4.25 seems to work pretty well,
and we can run our cell again.
6:44
There, that's better.
6:54
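One common way to set the figure size is to pass `figsize` to `plt.figure` before plotting, as sketched below. Whether the video uses exactly this call is an assumption. The width and height are given in inches, and the plotted points are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Create the figure first so that later plt.* calls draw onto it.
fig = plt.figure(figsize=(7.5, 4.25))  # (width, height) in inches

# Illustrative points; the real chart plots one scatter per species.
plt.scatter([5.1, 4.9, 4.7], [3.5, 3.0, 3.2], s=10, c="#2B5B84", label="Iris-setosa")
```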
From an analysis standpoint, we could draw
some conclusions based on this chart.
6:57
It appears that all three iris
varieties have a positive correlation
7:01
between sepal length and width.
7:05
Iris-setosa has a better defined positive
correlation than the other varieties.
7:06
Scatter plots are, of course,
only one way to explore our data.
7:13