Heads up! To view this whole video, sign in with your Courses account or enroll in your free 7-day trial. Sign In Enroll
Preview
Start a free Courses trial
to watch this video
Normalization eliminates the potential for update anomalies and the enforcement of data integrity. But what does that mean?
Review further explanations and step by step guides to normalization.
- Example of a step by step normalization
- Database Normalization: A Step-By-Step-Guide With Examples
- Ridiculously Unnormalized Database Schemas – Part One
As with most IT topics, there are many books available on the subject. This is a good book for beginning database designers: Beginning Database Design Solutions, by Rod Stephens
After being introduced to the process
of database normalization,
0:00
you might be wondering why
go to all that trouble?
0:04
Think about a spreadsheet with
millions of rows, maybe sales data.
0:08
Each of those rows contain some
sort of descriptive value,
0:12
maybe it's the product name.
0:16
All the storage space used to store
the text of those product names over and
0:18
over again can really add up.
0:22
If you were to condense the list of
product names down to a discrete
0:25
distinct list and
store it separately from the main dataset,
0:29
you would drastically reduce the amount
of disk storage space required.
0:34
Instead of storing
the product names over and
0:39
over again,
you can store the products IDs.
0:41
Now, imagine that inefficiency
repeated over and over again for
0:45
the other columns in the spreadsheet.
0:49
Such as the product category,
store location and so on.
0:52
Normalizing each of these lists
of text becomes crucial to
0:55
reducing disk space usage.
0:59
In the 1970s and 80s when companies
were fast investing in the technology to
1:01
computerize their business systems,
this forward thinking design
1:06
approach literally saved millions
of dollars in hardware costs.
1:10
This storage cost have come down
dramatically since then, but
1:15
we're storing more and
more data than ever before.
1:18
So normalization is still a requirement.
1:21
Let's think about our original
large spreadsheet again.
1:24
We see all the products listed out
along with the rest of the cells data,
1:28
let's say one of
the product name's changes.
1:33
We need to change it.
1:36
Look up all the other places
would need to make a change too.
1:37
We hope we'd better not miss one.
1:40
What happens if someone
queries this dataset before
1:42
we are completed with our updates?
1:46
They could possibly see two different
spellings for the same product.
1:48
The data is inconsistent
which is never good.
1:52
In our normalized state,
we can just change it in one place,
1:55
in the products table.
2:00
Changing one row of a thousands is,
2:02
as you can imagine,
a more optimal database operation.
2:05
The problem with poorly structured
data that I just described
2:09
is called an update anomaly.
2:13
This is a huge plus to normalization,
eliminating the potential for
2:15
update anomalies and
the enforcements of data integrity.
2:19
It's also more efficient for an
applications to make an update to one row
2:24
in a table opposed to millions of rows.
2:28
In the world of split second
application response times
2:31
that time difference is critical to
provide a positive user experience.
2:35
By now, you should be starting to see the
importance of good database normalization.
2:40
There is a lot of great information online
regarding normalization, be sure to check
2:46
the teacher's notes for some links and
suggested reading to learn more.
2:50
You need to sign up for Treehouse in order to download course files.
Sign up