Review one solution to the data cleaning challenge.
Ready to see a solution?
0:00
Let's tackle this.
0:02
You can see I created a single function.
0:04
I called it clean_data and
I'm passing in our data.
0:06
Don't forget at the top here,
0:10
I'm already importing our data from
data.py.
0:12
I created a new list called cleaned and
0:16
I'm going to return it when all this stuff
is completed and everything is cleaned.
0:19
And then I'm just calling the function
while passing in our data and
0:24
printing that out to the console so
0:29
we can make sure that we're
doing everything correctly.
0:32
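As a rough sketch of the starting point described so far: the real file imports the list from data.py, so the small `data` list below is only a hypothetical stand-in, and its email and date values are made up for illustration.

```python
# Hypothetical stand-in for the list that app.py imports from data.py.
# The field names follow the video; the email and date values are invented.
data = [
    {
        "email": "warren@example.com",
        "name": "Warren Bates",
        "date_joined": "2020-01-15",
        "admin": "False",
        "id": "1",
    },
]


def clean_data(data):
    cleaned = []  # will hold one cleaned dictionary per user
    return cleaned


# Nothing is cleaned yet, so this prints an empty list.
print(clean_data(data))
```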
So the first thing I'm going
to do is our data is a list.
0:36
So if I loop through all
the items in the list,
0:41
I'm going to get each
individual dictionary.
0:44
So let's see what that looks like.
0:47
I'm gonna say for user in data.
0:49
And then just to see how things go,
0:52
I'm going to print user so
we can see this in the terminal and
0:55
I'm gonna pull this all the way up just so
we have plenty of space.
1:00
This is app.py, there we go.
1:06
Okay, so you can see, [SOUND] I'm
getting each individual dictionary
1:09
that's inside of our list and then I'm
returning the clean list at the end,
1:15
which is why there's
an empty list at the bottom.
1:21
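The loop at this stage might look roughly like the following. The two sample records are assumptions standing in for data.py; only the keys come from the video.

```python
# Hypothetical sample records; the real ones live in data.py.
data = [
    {"email": "warren@example.com", "name": "Warren Bates",
     "date_joined": "2020-01-15", "admin": "False", "id": "1"},
    {"email": "megan@example.com", "name": "Megan Amendola",
     "date_joined": "2021-03-02", "admin": "True", "id": "2"},
]


def clean_data(data):
    cleaned = []
    for user in data:
        # Each user is one dictionary from the list.
        print(user)
    return cleaned


# Prints each dictionary, then the still-empty cleaned list.
print(clean_data(data))
```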
So if I hit clear,
I can pull this back down.
1:25
Okay, so we know that we're
accessing each individual user, so
1:31
now let's go through and
let's fix each one.
1:35
I'm going to create a fixed
variable and set it equal to, oops,
1:39
not a list, an empty dictionary,
make sure to use those curly brackets.
1:42
And then I'm going to go through and I'm
just gonna go through from top to bottom.
1:47
So if I look at our data.py, I'm gonna
do email, name, date_joined, admin,
1:51
id, I'm gonna do it in the same order.
1:54
That just makes sense to me.
1:57
So first, email is one of
the ones that we're not changing.
1:59
So I'm gonna do fixed.
2:04
We're gonna create a new key called email,
and
2:06
we're gonna set that equal
to the value of user["email"].
2:11
And now just to show you how this works,
I'm gonna copy this,
2:17
just so
you remember how dictionaries work.
2:21
And I'm gonna print out all of
our values for the email address.
2:25
So remember, this is gonna grab each user,
which is from each dictionary.
2:29
And it's going to grab the email key,
and it's gonna return us the value.
2:37
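Copying an unchanged field is just a key lookup and an assignment. A minimal sketch, using a made-up `user` dictionary in place of a record from data.py:

```python
# Hypothetical record; the email value is invented for illustration.
user = {"email": "warren@example.com", "name": "Warren Bates"}

fixed = {}  # curly brackets create an empty dictionary, not a list

# Look up the value stored under "email" and store it under the same key.
fixed["email"] = user["email"]

print(fixed["email"])
```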
So let me run this real quick. Let me hit
Save, pull this up again a little bit.
2:45
Actually I can push up arrow to
get back to our last command and
2:51
there you go, you can see we
get all of the emails, perfect.
2:56
So we've got our email fixed,
next is going to be the name.
3:03
If I pop into our data.py, we see we have name.
3:09
Now we know in our directions,
number one here we need to split
3:13
the full name into two fields,
first name and last name.
3:18
And a little hint here,
I use the word split on purpose because we
3:23
can use Python's split
functionality to do that for us.
3:28
So I'm gonna do fixed.
3:33
We're gonna add a new key called first_name.
3:35
I set that equal to and
I'm gonna do user["name"]
3:40
and then we're going to split this,
we're gonna call split.
3:46
And we're gonna split it on the space.
3:50
So an empty string.
3:53
I'm gonna put one space inside of it.
3:55
We're gonna split on the space.
3:57
Oops, I need to be outside
the parentheses and
4:00
then we want the first part that's
returned, remember index starts at zero.
4:03
So let's check out this
right here in the console.
4:08
I'm gonna scroll this up so
we can make sure we can see it,
4:13
in our console at the same time.
4:16
So I'm gonna do python3
to go into our shell.
4:17
And I'm just gonna create a fake name,
I'll do my own Megan Amendola.
4:20
And now if I'm going to split it,
name.split,
4:27
space, close it and let's see what we get.
4:32
Okay, so when we run split,
we're going to get a list of two items
4:39
because that is what we will
get when we split on a space.
4:43
If I had put my full name in there,
it would give us three things in
4:48
the list because it would give me first,
middle, and last name.
4:52
But because we only have one space,
4:57
it's going to split this
string into two pieces.
4:59
So remember index of zero and
index of one.
5:02
So when I call this, I can then call index
of zero to get just this first part.
5:06
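The shell experiment above can be reproduced with a few lines. The variable name `parts` is just an illustrative choice:

```python
name = "Megan Amendola"

# Splitting on a single space returns a list of the pieces.
parts = name.split(" ")
print(parts)     # ['Megan', 'Amendola']

# Indexing starts at zero, so index 0 is the first name.
print(parts[0])  # Megan
print(parts[1])  # Amendola
```

A name with a middle name would produce three items, so index 1 would no longer be the last name.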
If that seems confusing to you,
5:13
you can instead create a variable
called split_name or you know,
5:15
whatever you want to call it, and
then you can, oops, let's try it again.
5:20
Then you can copy.
5:26
I hate when things start
to go a little funky.
5:30
Let's try again.
5:32
There we go, you could copy this part and
5:35
save it here and then you would just do,
5:39
split_name, oops.
5:44
split_name[0].
5:48
Somehow got to two.
5:54
There we go.
5:55
So you can see if that is a little
bit easier for you to understand,
5:57
calling the split up here which
will give you a list saved
6:02
as your split name or value,
and then calling the first one.
6:08
And then essentially we do the exact
same thing, change this to last.
6:13
And then this would be,
index of one to get the last name.
6:21
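Both styles described here produce the same result. A small sketch, assuming a record whose name contains exactly one space:

```python
user = {"name": "Warren Bates"}  # hypothetical record

# Option 1: save the split list in a variable, then index it.
fixed = {}
split_name = user["name"].split(" ")
fixed["first_name"] = split_name[0]
fixed["last_name"] = split_name[1]

# Option 2: split and index in a single expression.
inline = {}
inline["first_name"] = user["name"].split(" ")[0]
inline["last_name"] = user["name"].split(" ")[1]

print(fixed)
print(inline)
```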
So if that's much easier for
you to understand,
6:26
absolutely you can do it that way.
6:29
If you want to do it
the other way, like this,
6:32
you can too and
then you can just delete that variable.
6:38
Either way totally up to you.
6:43
Just gonna leave it that way since
that's the way we had it at the end there.
6:46
Exit our shell, and run a clear.
6:51
Okay so we have our email and
first name and
6:56
last name all completed,
let's see what is next.
6:59
Date_joined, this is another one
that stays exactly the same.
7:04
So I'm gonna just copy this, and
7:07
I'm just gonna change
this to date_joined and
7:11
date_joined, perfect another one done.
7:16
Next is admin and this is going to
be switching it to a Boolean value.
7:22
Now again, I'm gonna show us
something here in the Python shell.
7:29
So if we have a string,
let's just call it admin anyways,
7:36
and let's set it to the string "False", and
if we call bool on admin
7:42
to convert it,
it's always gonna give us True,
7:48
because it's not changing this
from a string into True or False,
7:52
it's changing it and it's saying,
hey, this string has something in it,
7:57
therefore, it is True, cuz that's how
Boolean values work with strings.
8:03
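The behavior described here is easy to check in the shell. In Python, any non-empty string is truthy, whatever text it contains:

```python
admin = "False"  # a string, not a Boolean

# bool() only checks whether the string is empty.
print(bool(admin))       # True
print(bool(""))          # False

# Comparing against the text itself gives the answer we actually want.
print(admin == "True")   # False
```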
So instead, what we're going to have
to do is do a little comparison,
8:08
a little if statement here.
8:14
So if user["admin"] == "True",
8:17
then we need to make
8:24
the fixed["admin"] = True.
8:28
And then we can do,
8:34
oops, else fixed["admin"]
8:39
equals False, okay?
8:45
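A sketch of that if statement, assuming the admin field is stored as the string "True" or "False". The comparison is against the string "True" because a string never equals the Boolean True. The `users` and `results` names are only there to make the example runnable.

```python
# Hypothetical records holding just the admin field as strings.
users = [{"admin": "True"}, {"admin": "False"}]

results = []
for user in users:
    fixed = {}
    if user["admin"] == "True":
        fixed["admin"] = True
    else:
        fixed["admin"] = False
    results.append(fixed)

print(results)  # [{'admin': True}, {'admin': False}]
```

The same result could be written in one line as `fixed["admin"] = user["admin"] == "True"`, since the comparison already evaluates to a Boolean.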
That tackles our admin and
the last one we need to tackle is ID.
8:51
Now this one we can use
one of the built-in functions,
8:58
so let's do fixed["id"] = int,
9:03
which is how you convert
something to an integer.
9:07
And we can do user["id"].
9:13
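A quick illustration of the conversion, using a made-up id value:

```python
user = {"id": "42"}  # hypothetical record; the id is stored as a string

fixed = {}
# int() converts a string of digits into an integer.
fixed["id"] = int(user["id"])

print(fixed["id"])        # 42
print(type(fixed["id"]))  # <class 'int'>
```

Note that int() raises a ValueError if the string is not a valid whole number.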
Okay, so that tackles all of our fields.
9:17
Now that that's complete,
we need to append this new
9:21
dictionary to our empty list here
at the top, our cleaned list.
9:25
So I'm gonna do cleaned.append(fixed),
9:30
which will append the entire
dictionary that we just built out here.
9:34
And then at the end, once we've finished
looping through all of our users and
9:39
cleaning them, it's gonna return our list
for us so we should be all complete.
9:44
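Putting the steps together, the finished function might look something like this. The sample `data` list is an assumption standing in for the import from data.py, with made-up email and date values, and the admin check compares against the string "True".

```python
# Hypothetical stand-in for the list imported from data.py.
data = [
    {"email": "warren@example.com", "name": "Warren Bates",
     "date_joined": "2020-01-15", "admin": "False", "id": "1"},
    {"email": "megan@example.com", "name": "Megan Amendola",
     "date_joined": "2021-03-02", "admin": "True", "id": "2"},
]


def clean_data(data):
    cleaned = []
    for user in data:
        fixed = {}
        fixed["email"] = user["email"]                     # unchanged
        fixed["first_name"] = user["name"].split(" ")[0]   # text before the space
        fixed["last_name"] = user["name"].split(" ")[1]    # text after the space
        fixed["date_joined"] = user["date_joined"]         # unchanged
        if user["admin"] == "True":                        # string to Boolean
            fixed["admin"] = True
        else:
            fixed["admin"] = False
        fixed["id"] = int(user["id"])                      # string to integer
        cleaned.append(fixed)
    return cleaned


print(clean_data(data))
```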
It's already set up here for us to
print it and see it in the console, so
9:50
let me pull this up and
let's check our work.
9:53
All right, so we can see we have a list
here, you can see the close there.
10:00
And then inside we have,
see where it ends.
10:06
Individual dictionary items,
so that's perfect.
10:10
Email is still a string,
we have first name as Warren,
10:14
last name as Bates, awesome.
10:18
So our split name worked.
10:20
Date_joined is still the same.
10:22
Admin is now a Boolean value,
10:25
you can see it doesn't have the string
quotes up next to it anymore.
10:27
And id is now a number.
10:31
Awesome job,
I hope you had fun practicing this skill.
10:34
Keep playing around and keep having fun.
10:37