Cleaning and Preparing Data
Coming July 2018
- Data Analysis
About this Course
We rely on data to answer important questions, whether we are trying to make the best business decisions or determine the effectiveness of a new medical treatment. But our analyses are only as accurate as the data we are using, and incorrect or “dirty” data can lead to incorrect conclusions and assumptions. Data preparation, also called “cleaning” or “scrubbing”, is an important part of ensuring our analyses are accurate and useful.
What you'll learn
- Cleaning and scrubbing data
- Potential problems within datasets
- Understanding your dataset
- Handling bad data
“Clean” and “Dirty” Data
Welcome! In this stage, you will learn why having a properly cleaned dataset is important and some of the problems you may encounter when cleaning a dataset. We will also take our first look at the data we will be using throughout this course.
Handling Bad Data
Now that we know a little bit about our dataset and the data cleaning process, we will take a closer look at some common issues using our example dataset. Sometimes these issues can be fixed, while other times it’s best to remove the data from our analyses. We can even write programs to help us automate some of the data preparation process, saving time and effort.
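As a rough illustration of automating this kind of cleanup, here is a minimal Python sketch. It is not taken from the course materials: the records, the field names (`name`, `age`), and the `clean_rows` helper are all hypothetical. It repairs what can be repaired (stray whitespace, inconsistent capitalization) and drops rows whose problems cannot be fixed (a missing or non-numeric age).

```python
# Hypothetical records: field names and values are invented for illustration.
raw_rows = [
    {"name": " alice ", "age": "34"},    # fixable: whitespace and capitalization
    {"name": "Bob", "age": ""},          # missing age: cannot be repaired
    {"name": "Carol", "age": "twenty"},  # non-numeric age: cannot be repaired
    {"name": "Dave", "age": " 41"},      # fixable: leading whitespace
]


def clean_rows(rows):
    """Repair fixable values and drop rows that cannot be repaired."""
    cleaned = []
    for row in rows:
        # Fixable issues: trim whitespace and normalize capitalization.
        name = row["name"].strip().title()
        age_text = row["age"].strip()

        # Unfixable issues: an empty name or an age that is not a whole number.
        if not name or not age_text.isdigit():
            continue

        cleaned.append({"name": name, "age": int(age_text)})
    return cleaned


cleaned_rows = clean_rows(raw_rows)
```

In this sketch, only the first and last rows survive. Whether a bad row should be repaired or removed depends on the dataset and the question being asked, so the rules above are assumptions rather than a general recipe.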
Selecting Relevant Data
While it may seem like more data is always better, usually we only want to look at the information that’s relevant to the question we are trying to answer. In this stage, we will look at different ways of choosing the most applicable data.
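One simple way to picture this is to keep only the rows and columns that bear on the question. The Python sketch below uses made-up sales records and a hypothetical `select_relevant` helper, neither of which comes from the course. It assumes the question is "how many units were sold in the East region each year?"

```python
# Hypothetical sales records: invented for illustration only.
records = [
    {"region": "East", "year": 2017, "units": 120, "notes": "promo"},
    {"region": "West", "year": 2017, "units": 95, "notes": ""},
    {"region": "East", "year": 2016, "units": 80, "notes": "n/a"},
]


def select_relevant(rows, columns, predicate):
    """Keep rows for which predicate(row) is true, and only the named columns."""
    return [
        {column: row[column] for column in columns}
        for row in rows
        if predicate(row)
    ]


# Only East-region rows matter here, and only the year and unit count.
east_units = select_relevant(
    records,
    columns=["year", "units"],
    predicate=lambda row: row["region"] == "East",
)
```

Here the `notes` column and the West-region row are set aside because they do not help answer the question, not because they are bad data.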
Alyssa is an electrical engineer turned software developer at Cox Automotive Inc. She uses data analysis and machine learning to automate processes and help make computers smarter.