Determine which author in the dataset has written the most pages and more.

Welcome back.
0:00

In this video we're going to tackle
some questions around pages.
0:01

Let's add them now.
0:04

Markdown, and here we go.
0:08

Who wrote the most pages?
0:12

Another markdown,
what's an author's average page count?
0:18

And one more.
0:32

How many books have been written
with less than 200 pages?
0:35

Let's tackle who wrote the most pages.
0:45

First, we need to find all of
the unique authors in the dataset
0:52

since many authors wrote multiple books.
0:57

We're going to run books.
1:00

Where authors.
1:03

Are unique.
1:07

And you can see it gives us an array
of all the different authors.
1:12

Now we need to work out
how to get the sum for
1:17

all of the pages related
to a specific author.
1:20

Let's set this equal to a variable
right now, all_authors.
1:25
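The step just described can be sketched roughly as follows. The tiny `books` DataFrame is a made-up stand-in for the course dataset, and the column names `authors` and `num_pages` are taken from what is said in the video rather than from the actual file, so treat them as assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; titles and page counts are invented.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "authors": ["J.K. Rowling", "Stephen King", "J.R.R. Tolkien", "Stephen King"],
    "num_pages": [300, 400, 700, 500],
})

# unique() returns an array with each author listed once,
# in the order they first appear in the column.
all_authors = books["authors"].unique()
print(all_authors)
```

Stephen King appears twice in the sample data but only once in `all_authors`.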

And then let's do books.loc,
1:33

where books, authors, and
1:38

let's just do Stephen King.
1:43

Cuz I know he's a relatively
famous author, and
1:48

I know he's written multiple books, so
I feel like he's a good example to use.
1:52

Num_pages.
1:59

Okay, so we can see all of the IDs,
and then all of the page counts, or
2:03

the number of pages for all of the books
that have the author of Stephen King.
2:08

So it filtered the books down to
the ones where authors
2:13

equals Stephen King.
2:17

And it's only returning
the number of pages column.
2:19

So we can see it's quite a lot.
2:23

Now, a fun thing we can do here at the end
that makes our lives a lot easier.
2:25

We just add .sum, and
it will sum it all up for us.
2:29

So we get a total of 1,800.
2:34

No sorry, we get a total of
18,219 pages for Stephen King.
2:38
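A minimal sketch of the filter-then-sum step, assuming the same `authors` and `num_pages` column names. The sample rows are invented, and the exact indexing style typed in the video may differ from the `.loc[mask, column]` form used here.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; titles and page counts are invented.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "authors": ["J.K. Rowling", "Stephen King", "J.R.R. Tolkien", "Stephen King"],
    "num_pages": [300, 400, 700, 500],
})

# The boolean mask keeps only Stephen King's rows;
# "num_pages" keeps only the page-count column.
king_pages = books.loc[books["authors"] == "Stephen King", "num_pages"]
print(king_pages)

# .sum() adds up the page counts that survived the filter.
king_total = king_pages.sum()
print(king_total)
```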

Now, let's think this through.
2:46

We know how to get all of our authors, and
2:48

we know how to get a single author's
page count. To see who has the most,
2:51

we're going to need to compare
all of the authors' page totals, and
2:57

then see who has the highest value.
3:02

There are a few different
ways to tackle this.
3:05

One way is to create a max variable,
3:07

I'm going to put it up here at the top,
and set it equal to zero.
3:11

And then we can loop through our
authors to calculate their page total.
3:18

Compare it to this max value.
3:23

And if it's larger,
then we can update the value.
3:26

And let's also hold the author's name
as well so we know who we end up with.
3:30

And I'm going to do top_author, and
I'm going to set it equal to None for
3:35

now so
that we can set it as an author's name.
3:40

So let's turn this into a loop.
3:44

So we need to get all of our authors and
now we need to loop through them all.
3:47

So for author in all_authors.
3:52

We're going to do, tab this over, and
3:58

this is going to be our
total_pages equals.
4:02

And instead of Stephen King,
we need to pass in our author so
4:08

that we get to each author
as it loops through.
4:13

Awesome, and
then next we need to check if their
4:22

total_pages is greater than
the current max value.
4:27

If it is, then the max needs to
now be set equal to total_pages so
4:35

that they now have the top spot.
4:40

And our top author now is going
to be set equal to that author.
4:45

And then let's print out the max value.
4:53

And let's print out the author or
the top_author.
4:57

Actually, it doesn't matter cuz
they will be the same thing.
5:01

And then at the end, outside of our for
5:05

loop, I'm going to print the max again,
5:10

and print the author.
5:15

And this is just so
we can see as the for loop is running,
5:20

which authors kind of
take over the top spot,
5:24

the leaderboard and then at the end,
who came out on top.
5:28

And I think this one
needs to be top_author.
5:34

That was my mistake.
5:40

Let's run it again.
5:42

Okay, so we can see it's running and
we got J.K. Rowling, and
5:44

then another form of J.K. Rowling cuz
sometimes it's not splitting them up but
5:47

that's okay for
what we're doing right now.
5:52

And then we got J.R.R. Tolkien, and
then we got Stephen King, and
5:55

then Stephen King ended up being
our top author with 18,219 pages.
5:59

Awesome.
6:04
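The loop described above might look something like this sketch. The data is invented, and the running maximum is named `max_pages` here instead of `max` (the name used in the video) so that Python's built-in `max` is not shadowed. The `groupby` lines at the end are not from the video; they show one of the other ways to get the same answer.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; titles and page counts are invented.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "authors": ["J.K. Rowling", "Stephen King", "J.R.R. Tolkien", "Stephen King"],
    "num_pages": [300, 400, 700, 500],
})

all_authors = books["authors"].unique()

max_pages = 0      # highest page total seen so far
top_author = None  # author who currently holds that total

for author in all_authors:
    # Page total for the author in this iteration
    total_pages = books.loc[books["authors"] == author, "num_pages"].sum()
    if total_pages > max_pages:
        max_pages = total_pages
        top_author = author
        # Printed only when the lead changes, like a running leaderboard
        print(max_pages, top_author)

# Final winner, printed once the loop has finished
print(max_pages, top_author)

# Alternative (not the video's approach): let pandas total the pages per author.
totals = books.groupby("authors")["num_pages"].sum()
print(totals.idxmax(), totals.max())
```

With the real dataset, the loop runs one filter per unique author, so the `groupby` version is likely to be noticeably faster.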

Now our next question,
what's an author's average page count?
6:05

We can use the same count code from above.
6:09

So.
6:12

We got our total pages.
6:15

I'm just going to copy this
6:17

row here, and paste it.
6:21

And I'm going to do their pages.
6:25

And then the same thing as before,
6:29

I'm just going to use
Stephen King as our example.
6:31

Just cuz he just won the top
number of pages written.
6:37

So we've got their number of
pages; now we need to know the number of
6:45

books that they've written.
6:49

So their books, we can do
6:51

books where the authors is
6:56

equal to Stephen King.
7:02

And we can do.
7:07

Okay, so we can wrap this in
parentheses and then do value_counts.
7:11

And it looks like we have some Trues and
Falses for when that is equal.
7:19

And let's return just this first value.
7:25

So we can see that they have 40
books where the author ends up being
7:27

Stephen King.
7:32

So this will be their books.
7:35

Now for a bit of math.
7:41

Their average_pages is
7:43

their pages divided by their books.
7:48

And then we can print their average pages,
7:56

and we get about 455 pages per book.
8:02
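Here is a rough sketch of the average calculation, again on invented data. The variable names `their_pages`, `their_books`, and `average_pages` follow the narration. How the video picks the count out of the `value_counts` result is not clear from the audio, so looking it up by the label `True` is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; titles and page counts are invented.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "authors": ["J.K. Rowling", "Stephen King", "J.R.R. Tolkien", "Stephen King"],
    "num_pages": [300, 400, 700, 500],
})

author = "Stephen King"

# Total pages across all of this author's books
their_pages = books.loc[books["authors"] == author, "num_pages"].sum()

# The comparison yields True/False per row; value_counts tallies each,
# and the True entry is the number of books by this author.
their_books = (books["authors"] == author).value_counts()[True]

average_pages = their_pages / their_books
print(average_pages)
```

Changing the `author` string is all it takes to check someone else, as long as the name matches the dataset's spelling exactly.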

And then all you have to do if you wanna
see a different author is just switch out
8:10

the author's name to someone else.
8:14

I could do J.R.R. Tolkien,
make sure I spell that right.
8:16

I did not, I-E-N.
8:23

And make sure I have the right
number of periods and things, cool.
8:27

Copy it, and paste it,
and run it, 737 pages.
8:33

Wow.
8:40

I can't imagine writing
that many pages for a novel.
8:41

Lastly, we have how many books have
been written with less than 200 pages?
8:45

I wanna give you this to
try on your own first.
8:50

Pause me and see what you come up with,
then unpause me and see what I wrote.
8:53

Okay, so we need to filter our books
9:00

where the number of
pages is less than 200.
9:06

Cool.
9:14

And then to figure out how many there are,
9:15

we can actually just use
len to get the length.
9:18

And it looks like there are 2,898 books
with less than 200 pages in our dataset.
9:23
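One possible version of that solution, sketched on invented data with the assumed `num_pages` column:

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; titles and page counts are invented.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D", "Book E"],
    "num_pages": [320, 150, 199, 200, 45],
})

# Keep only the rows where the page count is strictly below 200
short_books = books[books["num_pages"] < 200]

# len() on a DataFrame gives its number of rows
short_count = len(short_books)
print(short_count)
```

Note that the 200-page book is excluded, since the comparison is strictly less than 200.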

Nice job, Pythonistas.
9:30
