
Aaron Selonke
10,323 Points

Why use Join() to check for duplicates?

Why did Carling use Join() to check for duplicate birds between the two lists? Why not use one of the set operators, like Union()?

3 Answers

Steven Parker
155,584 Points

An inner join will only return rows where the key fields match. This makes it perfect for finding duplicates.

A Union will return all the unique rows from both sets, effectively concealing any duplication.
:sparkles:
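
To make the difference concrete, here is a minimal sketch. The Bird class is a hypothetical stand-in with only a CommonName property (the course's class has more members), and the DuplicateDemo helper names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal Bird; an assumption for this sketch only.
public class Bird
{
    public string CommonName { get; set; } = "";
}

public static class DuplicateDemo
{
    // Inner join on CommonName: only names present in BOTH lists are returned.
    public static List<string> FindDuplicateNames(IEnumerable<Bird> birds, IEnumerable<Bird> importedBirds)
    {
        return birds.Join(importedBirds,
                          b => b.CommonName,       // key from the existing list
                          ib => ib.CommonName,     // key from the imported list
                          (b, ib) => b.CommonName) // result for each matching pair
                    .ToList();
    }

    // Union of the names: every distinct name from either list,
    // with nothing to indicate which names appeared in both.
    public static List<string> UnionNames(IEnumerable<Bird> birds, IEnumerable<Bird> importedBirds)
    {
        return birds.Select(b => b.CommonName)
                    .Union(importedBirds.Select(ib => ib.CommonName))
                    .ToList();
    }

    public static void Main()
    {
        var birds = new List<Bird> { new Bird { CommonName = "Robin" }, new Bird { CommonName = "Crow" } };
        var importedBirds = new List<Bird> { new Bird { CommonName = "Crow" }, new Bird { CommonName = "Heron" } };

        Console.WriteLine(string.Join(", ", FindDuplicateNames(birds, importedBirds))); // Crow
        Console.WriteLine(string.Join(", ", UnionNames(birds, importedBirds)));         // Robin, Crow, Heron
    }
}
```

The join result is exactly the overlap, while the union result gives no way to tell that "Crow" was in both lists.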

Samuel Ferree
31,707 Points

Is there some reason we couldn't just add the birds with:

//Add imported birds, where the list of bird names does not contain their name
birds.AddRange(importedBirds.Where(ib => !birds.Select(b => b.CommonName).Contains(ib.CommonName)));

It seems like all this joining, flattening, and creating new anonymous types isn't really necessary.

Luis Marsano
19,774 Points

Could be less efficient, because birds.Select(b => b.CommonName) is re-evaluated for each item of importedBirds. However, evaluating that expression once and reusing its value (var birdNames = birds.Select(b => b.CommonName).ToList(); and !birdNames.Contains(ib.CommonName)) eliminates the recomputation. The .ToList() matters: a LINQ query is lazily evaluated, so storing the query alone would still re-run it on every Contains call. As long as importedBirds has no repeated names of its own, this yields the same result (due to referential transparency, discussed in an earlier video about functional programming). Your code is simpler and clearer (as good code should be) because it decomposes the problem into simple logical expressions and composes them in a straightforward fashion. That's the beauty of functional programming.
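
A minimal sketch of that idea, assuming a hypothetical Bird class with just a CommonName property; the ImportDemo class and AddNewBirds method are illustrative names, not from the course. It swaps the list of names for a HashSet, which is my own choice to make each lookup cheap.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal Bird; an assumption for this sketch only.
public class Bird
{
    public string CommonName { get; set; } = "";
}

public static class ImportDemo
{
    // Adds every imported bird whose name is not already in birds.
    public static void AddNewBirds(List<Bird> birds, IEnumerable<Bird> importedBirds)
    {
        // Materialize the known names once, instead of re-running the
        // Select for every imported bird.
        var knownNames = new HashSet<string>(birds.Select(b => b.CommonName));

        // ToList() runs the filter to completion before birds is modified.
        var newBirds = importedBirds
            .Where(ib => !knownNames.Contains(ib.CommonName))
            .ToList();

        birds.AddRange(newBirds);
    }

    public static void Main()
    {
        var birds = new List<Bird> { new Bird { CommonName = "Robin" }, new Bird { CommonName = "Crow" } };
        var importedBirds = new List<Bird> { new Bird { CommonName = "Crow" }, new Bird { CommonName = "Heron" } };

        AddNewBirds(birds, importedBirds);
        Console.WriteLine(string.Join(", ", birds.Select(b => b.CommonName))); // Robin, Crow, Heron
    }
}
```

Because knownNames is a snapshot taken before the import, a name that appears twice in importedBirds would be added twice; deduplicate the imported list first if that can happen.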

Elshad Shabanov
2,890 Points

Why don't we use Except to find the birds in the imported list that are not in our birds list? Like this:

importedBirds.Except(birds).Distinct();

Patrick Castle
Pro Student 14,312 Points

This can work, but only if you tell Except how to compare two Bird objects by one or more of their properties, for example by passing an IEqualityComparer<Bird> or by overriding Equals and GetHashCode on Bird. Without that, the default comparison for a class is reference equality, so every Bird instance counts as unique and nothing gets excluded.

The documentation here will provide more info on this: https://msdn.microsoft.com/en-us/library/bb300779(v=vs.110).aspx
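
As a rough sketch of what such a comparer could look like: Bird here is a hypothetical minimal class with only CommonName, and BirdNameComparer and ExceptDemo are names I made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal Bird; an assumption for this sketch only.
public class Bird
{
    public string CommonName { get; set; } = "";
}

// Treats two birds as equal when their CommonName matches.
public sealed class BirdNameComparer : IEqualityComparer<Bird>
{
    public bool Equals(Bird x, Bird y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.CommonName == y.CommonName;
    }

    // Must agree with Equals: equal names produce equal hash codes.
    public int GetHashCode(Bird bird)
    {
        return bird.CommonName == null ? 0 : bird.CommonName.GetHashCode();
    }
}

public static class ExceptDemo
{
    // Imported birds whose name is not already in birds.
    public static List<Bird> NewBirds(IEnumerable<Bird> birds, IEnumerable<Bird> importedBirds)
    {
        return importedBirds.Except(birds, new BirdNameComparer()).ToList();
    }

    public static void Main()
    {
        var birds = new List<Bird> { new Bird { CommonName = "Robin" }, new Bird { CommonName = "Crow" } };
        var importedBirds = new List<Bird> { new Bird { CommonName = "Crow" }, new Bird { CommonName = "Heron" } };

        // Default comparison is by reference, so nothing is excluded.
        Console.WriteLine(importedBirds.Except(birds).Count()); // 2

        // With the comparer, only the genuinely new bird remains.
        Console.WriteLine(string.Join(", ", NewBirds(birds, importedBirds).Select(b => b.CommonName))); // Heron
    }
}
```

Except already returns a set of distinct elements under the comparer it is given, so a trailing Distinct() should not be needed.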