Before we continue, we should formally define some of the terms I've been using to describe machine learning, and then break them down further with more examples.
Vocabulary and Definitions
- Example: A single element in a dataset
- Feature: One characteristic of an example
[MUSIC]
0:00
Toward the end of these lessons,
we're going to use Python and
0:05
the scikit-learn project to
write our own classifier.
0:08
But before we continue, we should formally
define some of the terms I've been using
0:12
to describe machine learning and
0:17
then break them down
further with more examples.
0:19
Speaking of examples, an example
is a single element in a dataset.
0:23
Sometimes you might hear an example
referred to as a sample,
0:29
but it means the same thing.
0:35
If your data is formatted in a table,
0:37
an example might be
a single row in the table.
0:40
A dataset is composed of many examples.
0:45
And in general,
0:48
each example helps improve the confidence
of your model's predictions.
0:49
Say for instance, you're running a movie
studio and you want to try to forecast
0:55
how much money a movie might make,
so that you can set a budget.
1:00
Your dataset would probably
be examples of older movies.
1:04
So what about those older
movies might you include?
1:09
Each part of an example
is called a feature.
1:13
A feature is one
characteristic of an example.
1:17
Again, if you formatted
your data in a table,
1:22
each feature might be a single column.
1:25
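The table idea can be sketched in plain Python. This is only an illustration, not the course's own code: the movie titles, the numbers, and the `column` helper are all made up here, and a list of dictionaries stands in for the table.

```python
# Hypothetical dataset: every title and number below is invented for illustration.
# Each dictionary is one example (one row of the table).
# Each key is one feature (one column of the table).
movies = [
    {"title": "Movie A", "genre": "action", "budget": 100, "box_office": 350},
    {"title": "Movie B", "genre": "comedy", "budget": 40, "box_office": 90},
    {"title": "Movie C", "genre": "drama", "budget": 25, "box_office": 60},
]

# The number of examples is the number of rows.
num_examples = len(movies)

# The feature names are the column headers.
feature_names = list(movies[0].keys())


def column(dataset, feature):
    """Return one feature (one column) across every example in the dataset."""
    return [example[feature] for example in dataset]


genres = column(movies, "genre")
print(num_examples, feature_names, genres)
```

Reading a single dictionary gives you one example, and calling `column` gives you one feature across all of the examples.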
In the case of predicting a movie's box
office performance, your older examples of
1:29
movies might include things like
their total box office sales,
1:34
the budget, the genre, the release date, and
1:38
maybe more advanced features,
like a star power calculation,
1:41
which could take all the actors in each
movie and calculate a weighted average of
1:46
their typical box office performance
in other movies they've been in.
1:50
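A star power calculation like this could be sketched as a weighted average. The transcript does not say how the weights are chosen, so this sketch assumes each actor comes with a hypothetical weight (for example, how prominent their role is). The `star_power` function and the numbers are illustrative only.

```python
def star_power(cast):
    """Weighted average of each actor's typical box office performance.

    cast is a list of (typical_box_office, weight) pairs. The weights are
    an assumption of this sketch, e.g. larger for lead roles.
    """
    total_weight = sum(weight for _, weight in cast)
    if total_weight == 0:
        # No actors (or all weights zero): no star power to report.
        return 0.0
    weighted_sum = sum(value * weight for value, weight in cast)
    return weighted_sum / total_weight


# Hypothetical cast: a lead with weight 3 and a supporting actor with weight 1.
cast = [(200.0, 3.0), (100.0, 1.0)]
score = star_power(cast)  # (200*3 + 100*1) / (3 + 1) = 175.0
print(score)
```

The result is a single number per movie, so it can sit in the table as one more feature column alongside the budget and genre.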
A dataset might contain good and
bad features,
1:56
and some features that are more
important than others.
2:00
For example,
you might find that the genre and
2:04
release date are more
important than the budget.
2:06
So your model could weigh
those features more heavily.
2:10
A feature that might be completely
irrelevant is the movie's title.
2:14
Sure, a movie needs a title, and you might
be able to come up with a machine learning
2:19
model that can determine what
makes a good or bad movie title.
2:23
But in most cases,
it's probably too subjective and
2:28
inconsequential to weigh it against
other more quantifiable features.
2:31
Something like the box office performance
of a movie is very difficult to predict.
2:37
And it includes a huge number
of factors that are nearly
2:42
impossible to simulate perfectly.
2:45
But that's why a model is
nothing more than that:
2:48
a model, or
a simplification of the problem.
2:51
It's just one tool that can be used
in combination with other approaches
2:55
to arrive at a solution.
3:00