A box plot summarizes the spread of data grouped by a categorical variable. Another way to visualize the distribution is the violin plot, which is like a mix of a box plot and a kernel density estimate (KDE). It is analogous to the box plot: it shows the median and the first and third quartiles, plus the shape of the distribution.

#### Box plot

`sns.catplot(kind='box', data=pokemon, x='Type', y='Attack', aspect=2)`

#### Violin plot

`sns.catplot(kind='violin', data=pokemon, x='Type', y='Attack', aspect=2)`
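
#### Runnable sketch

The two calls above assume the `pokemon` DataFrame from the course notebook is already loaded. As a minimal sketch for trying them outside the notebook, the block below builds a small stand-in DataFrame. Its column names match the calls above, but the rows are made up for illustration and are not the course data.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the course's `pokemon` DataFrame (made-up values).
pokemon = pd.DataFrame({
    "Type": ["Normal"] * 5 + ["Fire"] * 5,
    "Attack": [5, 55, 75, 105, 110, 52, 64, 84, 100, 130],
})

# One box per Type; aspect=2 makes each facet twice as wide as it is tall.
box_grid = sns.catplot(kind="box", data=pokemon, x="Type", y="Attack", aspect=2)

# Same arguments with kind="violin": a KDE outline around a small box plot.
violin_grid = sns.catplot(kind="violin", data=pokemon, x="Type", y="Attack", aspect=2)
```

Only the `kind` argument differs between the two calls, which is why the second plot can be made by copying the first cell and changing one word.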

What is the distribution of Attack for
each type?
0:00

The catplot function has another
subfamily of plots that will help us
0:04

visualize the distribution of
data with a categorical variable.
0:08

We'll make some box plots to look at that.
0:12

sns.catplot, and
0:16

this time we'll call the kind box,
0:19

data=pokemon, x='Type',
0:25

and y='Attack'.
0:31

And again, let's fix the aspect so
0:35

that the x-axis labels
are better spaced apart for
0:39

ease of reading, aspect=2.
0:44

Recall that a box plot gives us
a summary of the spread of data.
0:48

By using the catplot function,
we are able to get the spread of data for
0:53

each type of Pokemon all on one plot.
0:58

Notice that the diamond
markers represent outliers.
1:00

And where there's a line,
instead of a box plot,
1:04

that means that there's only one
observation for that type of Pokemon.
1:08

For each of these box and whisker plots,
we have a five-number summary.
1:12

The line in the middle of the box
represents the median value, or
1:17

the central tendency of Attack points.
1:21

Then we have the first and
third quartiles, and
1:24

then the whiskers, which represent
the max and minimum values, excluding outliers.
1:28
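
With Seaborn's default settings, each whisker stops at the most extreme observation that lies within 1.5 times the interquartile range of the box, and anything beyond that is drawn as an outlier marker. Here is a minimal sketch of that rule using NumPy. The `attack` values are made up for illustration.

```python
import numpy as np

# Made-up Attack values for a single type; 190 is deliberately extreme.
attack = np.array([5, 55, 75, 105, 110, 190])

# First and third quartiles and the interquartile range (IQR).
q1, q3 = np.percentile(attack, [25, 75])
iqr = q3 - q1

# Limits beyond which a point counts as an outlier (1.5 * IQR rule).
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the limits.
inside = attack[(attack >= lower_limit) & (attack <= upper_limit)]
whisker_low, whisker_high = inside.min(), inside.max()

# Everything outside the limits would be drawn as an outlier marker.
outliers = attack[(attack < lower_limit) | (attack > upper_limit)]

print(whisker_low, whisker_high, outliers)
```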

Another way of visualizing the
distribution is by using the violin plot.
1:34

The violin plot is like a mix of a box and
whisker plot and a KDE.
1:39

Violin plots are analogous to box plots,
but recall that the KDE
1:44

lets us make inferences about the data
based on a probability curve.
1:49

We can easily make a violin
plot by copying cell 20,
1:54

and changing the kind
parameter from box to violin.
1:59

Notice that the violin plot includes
part of the box and whiskers that
2:09

are found in the box plot, the median and
the first and third quartiles.
2:14

So it provides a similar
summary of the spread of data.
2:19

That's one of the joys of using Seaborn.
2:23

There are different plot types that can
give us similar findings for our data.
2:25

Let's take a look at our question again.
2:29

What is the distribution of Attack for
each type of Pokemon?
2:32

For this example, I'll answer it for
the Normal Type of Pokemon.
2:36

According to our box and whiskers plot,
2:41

the minimum Attack points
lie between 0 and 10,
2:45

let's say 5, while the maximum
goes all the way up to 110.
2:51

The median Attack points for
Normal Type Pokemon look to be about 75.
2:58

And the first and third quartiles
3:04

look to be around 55 and 105.
3:09

Now let's record our observations
in a new markdown cell.
3:13

For the Normal Type Pokemon,
3:25

the minimum Attack is 5,
3:33

the maximum is 110,
3:40

median is 75.
3:47

The first and
3:51

third quartiles are,
3:54

say, 55 and 105,
3:59

respectively.
4:04

Awesome, now try practicing using
the Defense stats of the Pokemon.
4:14
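
Values read off a plot are estimates, so it can help to check them numerically. The sketch below uses pandas `groupby` and `describe` to compute the five-number summary per type. The DataFrame is a made-up stand-in for the course's `pokemon` data. Its Normal rows were chosen to mirror the estimates recorded above and are not the real values.

```python
import pandas as pd

# Hypothetical stand-in for the course's `pokemon` DataFrame (made-up values).
pokemon = pd.DataFrame({
    "Type": ["Normal"] * 5 + ["Fire"] * 5,
    "Attack": [5, 55, 75, 105, 110, 52, 64, 84, 100, 130],
})

# describe() reports count, mean, std, min, quartiles, and max for each group;
# keep only the five-number summary columns.
summary = pokemon.groupby("Type")["Attack"].describe()
five_number = summary[["min", "25%", "50%", "75%", "max"]]

print(five_number.loc["Normal"])
```

Swapping `"Attack"` for a Defense column would give the same summary for the practice exercise, assuming the dataset names that column `Defense`.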
