What is Data Cleaning? (3:51) with Alyssa Batula
In this video, we discuss what it means to clean, or scrub, a dataset and why it's an important step in data analysis.
- Data Cleaning -- The process of fixing or removing incorrect, incomplete, and irrelevant data from a dataset. Also called data cleansing, preparing, or scrubbing.
- Example -- A single observation, case, or member of a dataset, usually a row in a table.
- Feature -- A descriptive or measurable characteristic of an example in a dataset, usually a column in a table. Also called a variable.
- Raw Data -- Data that has been collected but not cleaned. Also called source, primary, or atomic data.
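To make these terms concrete, here is a minimal sketch in plain Python. The dataset, the feature names (`name`, `age`, `favorite_color`), and the `clean` function are all hypothetical and chosen only for illustration. Each dictionary is one example, each key is a feature, and `raw_data` is the raw data before cleaning. The sketch assumes that `favorite_color` is irrelevant to the analysis and that removing bad examples is acceptable; in practice you might fix or fill in values instead.

```python
# Hypothetical raw data: each dict is one example (row),
# and each key is a feature (column). Values arrive as text.
raw_data = [
    {"name": "Ada", "age": "36", "favorite_color": "blue"},
    {"name": "Grace", "age": "", "favorite_color": "green"},   # incomplete: age is missing
    {"name": "Linus", "age": "-5", "favorite_color": "red"},   # incorrect: age is negative
    {"name": "Alan", "age": "41", "favorite_color": "purple"},
]


def clean(examples):
    """Return cleaned copies of the examples, leaving the raw data untouched."""
    cleaned = []
    for example in examples:
        age_text = example["age"].strip()
        # Remove incomplete examples (no age recorded).
        if not age_text:
            continue
        age = int(age_text)
        # Remove incorrect examples (an age cannot be negative).
        if age < 0:
            continue
        # Drop the feature assumed to be irrelevant and store age as a number.
        cleaned.append({"name": example["name"], "age": age})
    return cleaned


cleaned_data = clean(raw_data)
print(cleaned_data)
# [{'name': 'Ada', 'age': 36}, {'name': 'Alan', 'age': 41}]
```

Building a new list rather than editing `raw_data` in place keeps the original records available, so the cleaning step can be rerun or revised later.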
Why the ‘Boring’ Part of Data Science is Actually the Most Interesting
Data Science: A Kaggle Walkthrough Pt 1: Introduction
Data Science: A Kaggle Walkthrough Pt 2: Understanding the Data
Data Science: A Kaggle Walkthrough Pt 3: Cleaning Data