Use Pandas to import the dataset and call .head() to get a preview of the data.

#### Importing the dataset:

```
import pandas as pd

pokemon = pd.read_csv('pokemon_40.csv')
pokemon.head()
```
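Beyond `.head()`, a few other Pandas calls can help you get familiar with a new data set. The sketch below is only an illustration: it reads a tiny inline stand-in instead of `pokemon_40.csv`, and the column names (`Name`, `Type`, `HP`, `Attack`, `Defense`) and the values are assumptions based on the video's description, not copied from the course file.

```python
import io

import pandas as pd

# Hypothetical stand-in for pokemon_40.csv: column names and values
# are assumptions for illustration, not taken from the course file.
csv_text = """Name,Type,HP,Attack,Defense
Bulbasaur,Grass,45,49,49
Squirtle,Water,44,48,65
Pikachu,Electric,35,55,40
"""

# read_csv accepts a file path or any file-like object.
pokemon = pd.read_csv(io.StringIO(csv_text))

print(pokemon.shape)       # (number of rows, number of columns)
print(pokemon.dtypes)      # which columns hold text and which hold numbers
print(pokemon.describe())  # summary statistics for the numeric columns
```

With the real file, you would pass `'pokemon_40.csv'` to `pd.read_csv` instead of the inline text.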

#### Questions about the data:

What is the relationship between Attack and HP?

What is the distribution of Attack?

What is the relationship between Attack and Type?

What is the distribution of Attack for each Type?

What is the average (mean) Attack for each Type?

What is the count of Pokemon for each Type?
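The last two questions can also be answered numerically before any plotting. This is a minimal sketch, assuming the data set has columns named `Type` and `Attack`; the rows below are made up for illustration and are not from `pokemon_40.csv`.

```python
import pandas as pd

# Made-up sample rows; the column names are assumed to match the course data.
sample = pd.DataFrame({
    "Type": ["Water", "Water", "Grass", "Grass", "Electric"],
    "Attack": [48, 52, 49, 61, 55],
})

# Average (mean) Attack for each Type.
mean_attack = sample.groupby("Type")["Attack"].mean()

# Count of Pokemon for each Type.
type_counts = sample["Type"].value_counts()

print(mean_attack)
print(type_counts)
```

The course answers these questions with plots; the grouped numbers are one way to double-check what a plot appears to show.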

Now we will import our data set.
0:00

Use Pandas to import the data set and
call head to get a preview of the data.
0:04

In cell 3, we'll say pokemon
0:09

= pd.read_csv('pokemon_40.csv').
0:14

Next line, pokemon.head(), run the cell.
0:26

Let's examine together the data
given to us by the head method
0:35

to become familiar with it.
0:40

Each row represents a Pokemon.
0:42

Pokemon is a game where
players battle monsters.
0:49

Every monster or Pokemon has a name,
a categorical type,
0:53

like water, grass or electric,
and some numerical statistics.
0:58

These numerical statistics include HP,
1:05

which stands for health points,
attack, and defense.
1:09

There are other statistics, too,
but this data set is simplified so
1:14

that even if you are not
familiar with the game,
1:18

you will be able to perform
some statistical analysis.
1:21

If you'd like to manually
examine more of the data set,
1:25

you can open pokemon_40.csv in a new
tab to look at all 40 observations.
1:30

Let's ask some questions
about our data set.
1:41

In this stage of the course,
1:45

I will be asking questions about
the attack statistics of these Pokemon.
1:46

Then I will perform exploratory
data analysis with different
1:51

kinds of plots in order to find
answers to these questions.
1:56

After each plot,
I'll challenge you to explore the data for
2:00

the Pokemon's defense statistics.
2:04

Here are my initial questions.
2:07

What is the relationship
between Attack and HP?
2:11

What is the distribution of Attack?
2:16

What is the relationship
between Attack and Type?
2:19

What is the distribution of Attack for
each Type?
2:23

What is the average, or
mean, Attack for each Type?
2:28

And what is the count of Pokemon for
each Type?
2:33

Notice that a lot of these questions ask
about relationships between numerical and
2:38

categorical data.
2:44

Categorical data means data that
is words instead of numbers.
2:45

The categorical data for
these questions is the type of Pokemon.
2:49

This is one of the main
strengths of Seaborn.
2:55

Unlike Matplotlib, which is optimized for
2:58

creating plots with
strictly numerical data,
3:01

we can use Seaborn to analyze data sets that
contain both categorical and numerical data.
3:04
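As a rough illustration of plotting a numerical column against a categorical one, here is a minimal Seaborn sketch. It assumes columns named `Type` and `Attack` and uses made-up rows rather than the course data; `stripplot` is just one of several Seaborn functions that accept a categorical axis, and it is not necessarily the plot used later in the course.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import pandas as pd
import seaborn as sns

# Made-up sample rows; the column names are assumed to match the course data.
sample = pd.DataFrame({
    "Type": ["Water", "Water", "Grass", "Grass", "Electric"],
    "Attack": [48, 52, 49, 61, 55],
})

# Categorical Type on the x-axis, numerical Attack on the y-axis.
ax = sns.stripplot(data=sample, x="Type", y="Attack")
ax.set_title("Attack by Type (illustrative data)")

# Save to a file; in a notebook the plot would display inline instead.
ax.figure.savefig("attack_by_type.png")
```

In a Jupyter notebook you can drop the `matplotlib.use("Agg")` line and the `savefig` call, since the plot is shown directly under the cell.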
