Cleaning and Preparing Data (1:31) with Alyssa Batula
Preparing data for analysis is vital for working with and translating your data.
Finding a Dataset
- Kaggle competition datasets
- National Health and Nutrition Examination Survey
- The Star Wars API (SWAPI)
- Scraping for Craft Beers: A Dataset Creation Tutorial
Welcome to the last stage of this course.
We've covered a lot of material.
So let's go over it one more time before you move on.
We've learned the basics of data cleaning and
ways to make sure that our data is correct and accurate.
If you feel like cleaning your data set takes a long time,
remember that it's normal.
Properly cleaning your data takes a lot of work, but
it's vital to the quality of your analysis.
We covered eight types of data errors or problems:
formatting errors, incorrect data types, nonsensical data entries,
duplicate entries, missing data, saturated data, systematic and
individual errors, and confidential information.
You should have an idea of how to start cleaning up these problems when you see
them in your data.
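As a reminder of what spotting a few of these problems can look like, here is a minimal sketch using only the Python standard library. The sample records, the field names, the `find_problems` helper, and the 0 to 120 age range are all assumptions made up for illustration; they are not the code used in the course.

```python
# Hypothetical sample records; the field names and values are illustrative only.
rows = [
    {"id": "1", "age": "34", "weight_kg": "70.5"},
    {"id": "2", "age": "", "weight_kg": "82.0"},       # missing data
    {"id": "3", "age": "-5", "weight_kg": "64.2"},     # nonsensical entry
    {"id": "3", "age": "-5", "weight_kg": "64.2"},     # duplicate entry
    {"id": "4", "age": "forty", "weight_kg": "91.3"},  # incorrect data type
]


def find_problems(rows):
    """Return (row_index, problem) tuples for a few common issues in 'age'."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        # An exact repeat of an earlier row counts as a duplicate entry.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            problems.append((index, "duplicate entry"))
            continue
        seen.add(fingerprint)

        age = row["age"].strip()
        if age == "":
            problems.append((index, "missing data"))
            continue
        try:
            age_value = int(age)
        except ValueError:
            problems.append((index, "incorrect data type"))
            continue
        # Assumed plausible range; adjust to what is normal for your dataset.
        if not 0 <= age_value <= 120:
            problems.append((index, "nonsensical entry"))
    return problems


problems = find_problems(rows)
for index, problem in problems:
    print(f"row {index}: {problem}")
```

The checks only flag rows; deciding whether to fix, keep, or remove a flagged row still depends on knowing the dataset.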
We also talked about how important it is to understand your dataset.
A lot of errors can go unnoticed if you're not sure what your data should look like,
or what's normal.
It's also important when deciding which data should be kept in your dataset, and
what should be removed.
And since we’re using Python,
we talked about ways we can automate parts of our data analysis to help it go faster.
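One way to picture that kind of automation is a small function that applies the same cleaning steps to every row. This is only a sketch under assumptions: `clean_value`, `clean_rows`, and the sample rows are hypothetical names and data, and a real dataset would likely need more steps than these.

```python
def clean_value(value):
    """Trim whitespace and collapse repeated spaces (a simple formatting fix)."""
    return " ".join(value.split())


def clean_rows(rows):
    """Apply the same cleaning to every row, then drop exact duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        tidy = {key: clean_value(value) for key, value in row.items()}
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(tidy)
    return cleaned


# Hypothetical raw rows with inconsistent spacing; not from a real dataset.
raw = [
    {"name": "  Sam   Example ", "city": "Springfield"},
    {"name": "Sam Example", "city": "Springfield "},
    {"name": "Pat Sample", "city": " Riverton"},
]
cleaned = clean_rows(raw)
for row in cleaned:
    print(row)
```

Because the formatting fix runs before the duplicate check, the first two rows collapse into one; writing the steps as functions means the same cleaning can be rerun whenever new data arrives.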
That’s all for this course, but there’s always something new to learn.
In the teacher’s notes, you can download Markdown-commented versions
of the code we used throughout this course.
You can use these to remind yourself what we’ve covered in the lessons.
There are also links to resources for learning more about data cleaning and
Now there is one more quiz and then you are done with this course.
Congratulations and good luck.