Heads up! To view this whole video, sign in with your Courses account or enroll in your free 7-day trial. Sign In Enroll
Preview
Start a free Courses trial
to watch this video
How Does Netflix Apply Big Data Tools to Solve these Problems?
3:59 with Craig Dennis and Jared SmithLet's take a look at how Netflix solves their issues.
Terms
New Terms:
- Docker -- Docker is an open-source project that automates the deployment of applications inside software containers, and runs on top of Linux, Windows, Mac, and other operating systems.
- Anomaly Detection -- Anomaly detection is the process of looking for abnormalities in data to discover potentially interesting insights, ranging from security incidents to service failures.
- Node.js -- Node.js is a runtime for the JavaScript language that allows it to be run on the server independently of a browser, and is widely used across many companies and projects as way to use JavaScript as a general purpose and widely deployable language.
Learn More
Related Discussions
Have questions about this video? Start a discussion with the community and Treehouse staff.
Sign upRelated Discussions
Have questions about this video? Start a discussion with the community and Treehouse staff.
Sign up
Netflix applies many of the tools and
0:00
frameworks to the domains of big
data that we have discussed,
0:02
to solve the problems that they have
at operating on such a massive scale.
0:05
From utilizing processing frameworks, to
storage systems, to infrastructure tools,
0:09
Netflix exemplifies many of the actual
use cases of big data tools.
0:14
So let's dive into three particular
examples of how Netflix applies tools that
0:18
we've discussed, to solve their problems.
0:22
So let's take a look at the Data Pipeline.
0:25
Netflix ingest data using Apache Kafka.
0:28
It has two sets of Kafka clusters.
0:31
The first cluster is at the front end, and
0:34
gets its messages from clients
that produce messages.
0:36
Which is basically every instance
of the Netflix application.
0:39
It sends those messages to the back end
services, using the second Kafka cluster.
0:43
Which consumes messages, and
sends them to services, like Apache,
0:48
Spark, and Hadoop, for processing and
other Netflix custom application.
0:51
The front end Kafka instance also
passes data to storage layers,
0:56
such as Amazon S3, and
to the search tool, Elasticsearch.
1:01
This is used for both storing data in
short and long term storage on Amazon web
1:05
services, as well as for searching
across all events using Elasticsearch.
1:10
To ensure resiliency
of the Kafka clusters,
1:15
there is a replication factor of 2 for
all the messages in the system.
1:18
Which means that they will
always have a backup message for
1:22
every message in the system.
1:25
Messages are retained
in the Kafka system for
1:27
up to 24 hours, even after being sent
downstream or to storage layers.
1:29
In total, Netflix operates 36 Kafka
clusters, which consume an average
1:34
of 700 billion messages per day, and
up to 8 million events per second.
1:40
Netflix uses machine learning
to produce recommendations for
1:46
each of their users based on what
videos they have watched and liked.
1:49
So let's look at the problem of
serving those recommendations.
1:53
Each of the Netflix
clients produce messages,
1:57
which are adjusted by a Kafka cluster,
and then sent to the back end for
2:00
processing offline with Apache Hadoop,
and Amazon web services.
2:04
Netflix also uses Apache Spark,
with the Spark's streaming extension,
2:09
as well as their own
version of Apache Storm,
2:13
to produce new recommendations
with sub minute latency.
2:16
When their systems need to retrieve
prior results from the machine learning
2:20
algorithms, the back end fetches
the results from a combination of MySQL,
2:24
Cassandra, and Netflix's own caching
engine, which they have also open sourced,
2:28
EVCache, which stands for
Ephemeral Volatile Cache.
2:34
In order to run these services
at such a massive scale,
2:39
Netflix utilizes various other
infrastructure tools and frameworks.
2:42
A major open source infrastructure tool
in heavy use as Netflix is Apache Mesos.
2:46
Mesos is a resource management
service that can be used to
2:51
run clusters of machines,
schedule jobs and services, and much more.
2:54
It runs thousands of containers and
2:59
services at Netflix that help to keep the
back end and user facing services online.
3:01
Now, specifically, it runs a mix of Batch,
Stream Processing, and
3:06
Service Style Workloads.
3:11
Their use cases for
MESOS include Real Time Anomaly Detection,
3:12
Training and Model Building Batch Jobs,
3:17
Machine Learning Orchestration,
and Node.js Based Microservices.
3:19
Netflix has built several internal big
data tools for resource management.
3:24
Some examples are Mantis, which is
a reactive stream processing engine,
3:29
which processes up to 8
million events per second.
3:33
Titus is a docker container job
management and execution platform.
3:36
Meson is a general purpose workflow
orchestration & scheduling framework
3:40
that powers hundreds of concurrent
Machine Learning pipelines.
3:45
Netflix uses a wide range of big data
tools across their entire stack, and,
3:49
therefore, are such a great example
of how far an organization can
3:53
take the things that we've
learned about in this course.
3:56
You need to sign up for Treehouse in order to download course files.
Sign upYou need to sign up for Treehouse in order to set up Workspace
Sign up