Normalization eliminates the potential for update anomalies and enforces data integrity. But what does that mean?
Review these further explanations and step-by-step guides to normalization:
- Example of a step by step normalization
- Database Normalization: A Step-By-Step-Guide With Examples
- Ridiculously Unnormalized Database Schemas – Part One
As with most IT topics, there are many books available on the subject. A good book for beginning database designers is Beginning Database Design Solutions by Rod Stephens.
After being introduced to the process of database normalization, you might be wondering: why go to all that trouble?

Think about a spreadsheet with millions of rows, maybe sales data. Each of those rows contains some sort of descriptive value, maybe the product name. All the storage space used to store the text of those product names over and over again can really add up. If you were to condense the list of product names down to a distinct list and store it separately from the main dataset, you would drastically reduce the amount of disk storage space required. Instead of storing the product names over and over again, you can store the product IDs. Now, imagine that inefficiency repeated over and over for the other columns in the spreadsheet, such as the product category, store location, and so on. Normalizing each of these lists of text becomes crucial to reducing disk space usage.

In the 1970s and 80s, when companies were investing heavily in the technology to computerize their business systems, this forward-thinking design approach literally saved millions of dollars in hardware costs. Storage costs have come down dramatically since then, but we're storing more data than ever before, so normalization is still a requirement.

Let's think about our original large spreadsheet again. We see all the products listed out along with the rest of the cells' data. Let's say one of the product names changes, and we need to update it. We would have to look up all the other places where the change needs to be made too, and we'd better not miss one. What happens if someone queries this dataset before we've completed our updates? They could see two different spellings for the same product. The data is inconsistent, which is never good.
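The split described above can be sketched with a small in-memory SQLite example. The table and column names here are illustrative, not from the course:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized: the product name text repeats on every sales row.
cur.execute("CREATE TABLE sales_flat (sale_id INTEGER, product_name TEXT, amount REAL)")
cur.executemany("INSERT INTO sales_flat VALUES (?, ?, ?)",
                [(1, "Widget Deluxe", 9.99),
                 (2, "Widget Deluxe", 9.99),
                 (3, "Gadget Mini", 4.50)])

# Normalized: store each distinct name once, reference it by ID.
cur.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
cur.execute("CREATE TABLE sales (sale_id INTEGER, product_id INTEGER, amount REAL)")
cur.execute("INSERT INTO products (name) SELECT DISTINCT product_name FROM sales_flat")
cur.execute("""INSERT INTO sales
               SELECT s.sale_id, p.product_id, s.amount
               FROM sales_flat s JOIN products p ON p.name = s.product_name""")

# The distinct list of names is much shorter than the original column.
distinct_names = cur.execute("SELECT COUNT(*) FROM products").fetchone()[0]
total_rows = cur.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(distinct_names, total_rows)  # 2 distinct names vs 3 sales rows
```

With millions of sales rows, storing a small integer ID instead of a repeated text string is where the disk savings come from.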
In our normalized state, we can just change it in one place: in the products table. Changing one row out of thousands is, as you can imagine, a more optimal database operation. The problem with poorly structured data that I just described is called an update anomaly. This is a huge plus of normalization: eliminating the potential for update anomalies and enforcing data integrity. It's also more efficient for an application to update one row in a table as opposed to millions of rows. In a world of split-second application response times, that time difference is critical to providing a positive user experience.

By now, you should be starting to see the importance of good database normalization. There is a lot of great information online regarding normalization; be sure to check the teacher's notes for some links and suggested reading to learn more.
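The one-place update described here can be demonstrated the same way. This is a sketch with made-up table and product names, not the course's own schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE sales (sale_id INTEGER, product_id INTEGER)")
cur.execute("INSERT INTO products VALUES (1, 'Widget Delux')")  # misspelled name
cur.executemany("INSERT INTO sales VALUES (?, 1)", [(i,) for i in range(1000)])

# Fixing the spelling touches exactly one row, not a thousand.
cur.execute("UPDATE products SET name = 'Widget Deluxe' WHERE product_id = 1")
changed = cur.rowcount
print(changed)  # 1

# Every sale immediately reflects the corrected name via the join.
names = cur.execute("""SELECT DISTINCT p.name FROM sales s
                       JOIN products p ON p.product_id = s.product_id""").fetchall()
print(names)  # [('Widget Deluxe',)]
```

Because the thousand sales rows only hold a product ID, there is no window in which a query could see two different spellings of the same product.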