


Hashing

2:31 with Kenneth Love

What goes into a great hash?

Teacher's Notes

Hashing is a one-way street; data that has been hashed cannot be unhashed. Some hashing algorithms, though, have been compromised through what's known as hash collisions. This is where two different inputs generate the same hash. More damaging are chosen-prefix collision attacks, where attackers can use a known part of a document to generate a hash that matches another, valid document. Using this, they can spoof encryption keys and other symmetrically encrypted documents.

To combat both of these, we add salts to our hashes and use hashing algorithms that either have no known collisions (yet) or stress the machines used to compute them. Both of these techniques can dissuade attackers.

Shattered

I misspoke in the video and said the SHA-1 collision detection came out of Adobe. It was actually discovered at Google by Project Zero/Google Research Security. You can find out more here.


One way to keep data safely stored is to make it unreadable and unrecoverable. 0:00

Now that might not make much sense but 0:04

it all depends on what you need to do with the data. 0:06

You probably want to keep your users' data in a format that they can read, but 0:08

do you need to do that with their password? 0:11

No, not really. 0:12

Hashing is a wide area with a lot of applications, such as checksums, 0:15

validating file contents, lossy compression, and message authentication. 0:18

One of the most common, though, is to reliably encode data. 0:22

This is done with something known as a hash function, which takes in data and, 0:25

using one or more algorithms, turns that data into a hash. 0:28

The same data in the same hash function will produce the same hash every time. 0:32
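As a rough illustration of that determinism, here is a minimal sketch using Python's standard `hashlib` module. The choice of SHA-256 and the helper name `sha256_hex` are assumptions for the example; the video does not name a specific function at this point.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (hypothetical helper)."""
    return hashlib.sha256(data).hexdigest()

first = sha256_hex(b"hello world")
second = sha256_hex(b"hello world")
changed = sha256_hex(b"hello world!")

print(first == second)   # same input, same hash
print(first == changed)  # a one-character change produces a different hash
print(len(first))        # SHA-256 hex digests are always 64 characters long
```

Note that the digest has a fixed length no matter how long the input is, which is part of why the original data cannot be read back out of it.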

Seems like a great way to safely store your sensitive data? 0:36

It certainly can be, but if you need to get that data back, you're out of luck. 0:39

Hashing is a one way street and data that's been hashed can't be unhashed. 0:43

This is a feature, though, since we can use hashing on data that we really 0:47

shouldn't be able to read but need to verify. 0:50

Take passwords for example. 0:52

As a service like Amazon or Twitter, I should never know what your password is. 0:54

I should take it, hash it, and store that hash in my database. 0:57

Then when you want to log in, I take the password you give me, hash it the same way 1:01

and compare the resulting hash to what I have stored in the database. 1:04

If they're the same, then great, you're you. 0:07

If they're not, well, I'm not gonna let you in. 1:09
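The store-then-compare flow just described could be sketched like this. Everything here is illustrative: the in-memory `stored_hashes` dictionary stands in for a database, the function names are hypothetical, and plain SHA-256 is used only to keep the flow visible. A bare, fast hash like this is not a recommended way to store real passwords.

```python
import hashlib
import hmac

# Hypothetical in-memory "database" mapping usernames to stored hashes.
stored_hashes = {}

def hash_password(password: str) -> str:
    """Hash the password; the plaintext itself is never stored."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    """Take the password, hash it, and store only the hash."""
    stored_hashes[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    """Hash the supplied password the same way and compare with the stored hash."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest compares in constant time to avoid leaking timing information.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "correct horse")
print(login("alice", "correct horse"))  # matching hash: let the user in
print(login("alice", "wrong guess"))    # different hash: refuse
```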

You might have just thought of another problem with hashing. 1:12

If you and I both share the same password on a site and I try to log in to 1:14

your account, our hashes would match; I just got into your account. 1:17

Thankfully, most hashing algorithms have a feature known as a salt. 1:21

A salt is a bit of unique data, like maybe the Unix timestamp of when we signed up, 1:24

that's added to every hash and changes the output. 1:28

Now, even though we have the same characters in our password, 1:31

since we each have a different salt, the hash comes out differently. 1:33
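A minimal sketch of salting, under some assumptions: it uses a random 16-byte salt from `os.urandom` rather than the sign-up timestamp mentioned in the video, and `hashlib.pbkdf2_hmac` as the hash function. The iteration count and the function names are illustrative choices, not values from the course.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor for the example; tune for real hardware

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# Two users pick the same password but get different salts.
salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")

print(digest_a == digest_b)                            # different salts, different hashes
print(verify_password("hunter2", salt_a, digest_a))    # right password, right salt
```

The salt is stored alongside the digest; it does not need to be secret, only different for each user.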

There are a lot of hash functions out there with various degrees of 1:37

difficulty to them. 1:39

By degree of difficulty, I mean how hard the resulting hash is to break or reverse. 1:40

Many of them would require more computing time 1:45

than an attacker would consider reasonable. 1:47

There are some hash functions though that are now considered insecure. 1:49

The MD5 algorithm, for example, was a very commonly used hash function in the earlier 1:52

days of the modern World Wide Web. 1:57

It's now considered to be severely compromised though, 1:59

as it doesn't require a large amount of computing power to find collisions between 2:01

two inputs within hours, minutes or even seconds. 2:04

A collision is where different inputs generate the same hash. 2:08
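To make the idea concrete without attacking a real algorithm, here is a toy sketch that deliberately weakens SHA-256 by keeping only its first 4 hex characters (16 bits). With so few possible outputs, a brute-force search finds two different inputs with the same "hash" almost immediately. The truncation, the `tiny_hash` name, and the `input-N` strings are all assumptions made for the demonstration; this is not how MD5 or SHA-1 collisions were actually found.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> str:
    """Toy hash: only the first 4 hex characters (16 bits) of SHA-256."""
    return hashlib.sha256(data).hexdigest()[:4]

def find_collision():
    """Try inputs in order until two different ones share a tiny_hash value."""
    seen = {}
    for i in count():
        candidate = f"input-{i}".encode("utf-8")
        h = tiny_hash(candidate)
        if h in seen:
            return seen[h], candidate, h
        seen[h] = candidate

a, b, shared = find_collision()
print(a, b, shared)  # two different inputs and the truncated hash they share
```

The search is guaranteed to finish because there are only 65,536 possible 4-character outputs. A full-length modern hash has vastly more possible outputs, which is what makes this kind of search impractical against it.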

Recently, Adobe announced that they had discovered collisions in the also popular 2:11

SHA1 hash function. 2:16

If you're still using MD5 or SHA1, 2:17

update them immediately to a better hash function like SHAKE or Argon2. 2:20
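As a small sketch of what using SHAKE can look like, Python's standard `hashlib` exposes `shake_256`, an extendable-output function where the caller chooses the digest length. The message and lengths below are arbitrary examples. Argon2 is not part of Python's standard library, so it would need a third-party package and is not shown here.

```python
import hashlib

message = b"some file contents"

# SHAKE lets the caller pick the output length, given in bytes.
shake_32 = hashlib.shake_256(message).hexdigest(32)  # 32 bytes -> 64 hex characters
shake_64 = hashlib.shake_256(message).hexdigest(64)  # 64 bytes -> 128 hex characters

print(len(shake_32), len(shake_64))
# A shorter SHAKE output is a prefix of a longer one for the same input.
print(shake_64.startswith(shake_32))
```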

So, what if you need to safely store data but also need to be able to read it? 2:25

We'll talk about encryption in the next video. 2:28
