Databases Querying Relational Databases Joining Table Data with SQL Review and Practice

SQL self join

I have perhaps more of a career advice question: I have a graduate school colleague working at Sony PlayStation, who advised me that the proper use of self join querying is valuable in his work.

Is there a video, or another good resource, that covers self joins?

When I attempted learning the self join, last year on w3schools, I found it was quite difficult, and so I would love a great resource.

Thank you very much!

2 Answers

Brendan Whiting
MOD
Brendan Whiting
Front End Web Development Treehouse Moderator 84,129 Points

In case this is helpful, let me try and explain what's going on in the W3Schools example:

SELECT A.CustomerName AS CustomerName1, B.CustomerName AS CustomerName2, A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
AND A.City = B.City 
ORDER BY A.City;

So, the question they're trying to answer is "What customers live in the same city as each other?". They have a Customers table that has a CustomerName and a City column. So all the information they need is in this one table, they don't need any other tables. In order to do. this, they need to alias the table as two different aliases - FROM Customers A, Customers B - otherwise there's no way for us to put the same table on both the left and the right side of this comparison.

What this is going to output is a list of all possible pairs of customers who live in the same city as each other. So if Seth, Mark, and Brendan all lived in the same city, the possible combinations that it would generate would be:

Seth Mark
Seth Brendan
Mark Seth
Mark Brendan
Brendan Seth
Brendan Mark

Some of these are redundant. Flip them around and it's the same pair. But solving that problem is maybe for later...

Notice that in this part of the query WHERE A.CustomerID <> B.CustomerID it's basically using <> to mean "not equals". So I don't want to pair the customer with themselves, I want to pair them with someone else in the same city.

So Brendan, the number of rows is looking like a permutation calculation. This means if a city had ten rows of people, there would be 90 different pairs (45 unique ones). Interesting.

Brendan Whiting
Brendan Whiting
Front End Web Development Treehouse Moderator 84,129 Points

In this case, yes, but I think there are other use cases for self joins that are not permutations. Can't think of any off the top of my head though.

seth aruby
seth aruby
7,621 Points

Here's something that may help https://stackoverflow.com/questions/3362038/what-is-self-join-and-when-would-you-use-it

To me this example is odd in that maybe you wouldn't structure the database this way. It seems that it would be more appropriate to have an employee table with employee subtype tables such as a supervisor table, officer level table and then a self join would not be necessary. So perhaps your db design should be such that Self Joins shouldn't be necessary?? Perhaps that is why Treehouse doesn't address Self Joins?? But I'm very new to database design concepts and best practices so take with many large grains of salt.

Thanks Seth, I appreciate you putting that thought forward. That's a very good hypothesis indeed. Yes, I'd love to hear any other database people chime in.