
revisiting 5 dictionaries challenges (an intermediate Python solution)

Hello. I wanted to put my dictionary skills to work in these challenges. Like Chris Freeman said:

It's great to revisit old challenges with new skills

And so I felt like the fourth challenge, most_courses, was the hardest, and I have several questions about it.

  • Does anyone know of a good way to code this using reduce and/or filter?

  • I tried moving the final 3 lines of most_courses into a single line; can anyone do this?

  • Also, in my console I had success by ending most_courses with a yield instead of a return, in the event that the teachers are tied. I didn't think it would pass the Treehouse challenge, since I'd need a list() on the outside of the function call... unless anyone knows some other way?

Thanks in advance!

teachers.py
# The dictionary will look something like:
# {'Andrew Chalkley': ['jQuery Basics', 'Node.js Basics'],
#  'Kenneth Love': ['Python Basics', 'Python Collections']}
#
# Each key will be a Teacher and the value will be a list of courses.
#
# Your code goes below here.
from operator import add
from functools import reduce

def num_teachers(dict):
    return len(dict.keys())

def num_courses(dict):
    return reduce(add, map(len, dict.values()))

def courses(dict):
    uglylist = [value for value in dict.values()]
    flat_list = sum(uglylist, [])
    return flat_list

def most_courses(dict):
    dict_teacher_numcourses = {teacher: len(list_courses) for teacher, list_courses in dict.items()}
    for teacher in dict_teacher_numcourses.keys():
        if dict_teacher_numcourses[teacher] == max(dict_teacher_numcourses.values()):
            return teacher

def stats(dict):
    return [[teacher, len(list_courses)] for teacher, list_courses in dict.items()]

2 Answers

Chris Freeman
Treehouse Moderator 68,441 Points

Hi Mark, challenge accepted! Here's a single-line solution to Task 4:

def most_courses(dict):
    # dict_teacher_numcourses = {teacher: len(list_courses) for teacher, list_courses in dict.items()}
    # for teacher in dict_teacher_numcourses.keys():
    #     if dict_teacher_numcourses[teacher] == max(dict_teacher_numcourses.values()):
    #         return teacher
    # 
    # Using list.sort()
    # list_teacher_numcourses = [(len(list_courses), teacher) for teacher, list_courses in dict.items()]
    # list_teacher_numcourses.sort(reverse=True)
    # return list_teacher_numcourses[0][1]
    #
    # Using sorted()
    # return sorted([(len(list_courses), teacher)
    #                for teacher, list_courses in dict.items()], reverse=True)[0][1]
    #
    # Using max
    return max((len(list_courses), teacher) for teacher, list_courses in dict.items())[1]

You can see my iterations through various solutions in the comments. At first I used a list comprehension to build a list of (course count, teacher) tuples that can then be sorted by number of courses. After a reverse sort, the entry with the most courses is list item 0, and the teacher's name is item 1 of that tuple.

The builtin sorted() is used instead of list.sort() because list.sort() operates in place and returns None, hence two additional lines would be needed: assign the list to a variable, sort it, and then return an index into the list. Since I was looking for only the maximum value, the builtin max() works even better.

Then, to reduce further, replace the list comprehension's square brackets with parens and you get a generator expression! Like a function that uses yield, it produces items one at a time instead of building the whole list first.
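The difference between list.sort() and sorted() can be seen in a minimal sketch. The sample dictionary is modeled on the challenge's comment block; the third Kenneth Love course is a hypothetical addition so that there is a clear winner.

```python
# Sample data modeled on the challenge; 'Flask Basics' is a hypothetical
# extra course added here only to avoid a tie.
teachers = {
    'Andrew Chalkley': ['jQuery Basics', 'Node.js Basics'],
    'Kenneth Love': ['Python Basics', 'Python Collections', 'Flask Basics'],
}

pairs = [(len(course_list), teacher) for teacher, course_list in teachers.items()]

# list.sort() sorts in place and returns None, so nothing can be chained onto it.
in_place_result = pairs.sort(reverse=True)

# sorted() returns a new list, so the indexing can follow in the same expression.
top_teacher = sorted(
    ((len(course_list), teacher) for teacher, course_list in teachers.items()),
    reverse=True,
)[0][1]

print(in_place_result)  # None
print(top_teacher)      # Kenneth Love
```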

Keep in mind, as the data sets get bigger, it is important to consider the size of the solution in memory. Solutions that cause the data to be fully expanded into a dict or list comprehension can become huge memory hogs.
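As a rough illustration of the memory point, the sketch below compares the size of a fully built list with the size of a generator object. The dataset is hypothetical, and sys.getsizeof only measures the container itself (not the tuples it refers to), so the real gap is even larger than the numbers suggest.

```python
import sys

# Hypothetical large dataset: 100,000 teachers with two courses each.
big = {f'teacher_{i}': ['course_a', 'course_b'] for i in range(100_000)}

# The list comprehension builds every tuple up front.
as_list = [(len(course_list), name) for name, course_list in big.items()]

# The generator expression builds nothing until something iterates over it.
as_gen = ((len(course_list), name) for name, course_list in big.items())

list_bytes = sys.getsizeof(as_list)  # grows with the number of entries
gen_bytes = sys.getsizeof(as_gen)    # small, independent of the data size

# Both feed max() the same values, so the result is identical.
best_from_list = max(as_list)
best_from_gen = max(as_gen)

print(list_bytes > gen_bytes)           # True
print(best_from_list == best_from_gen)  # True
```

The source dictionary still lives in memory either way; what the generator expression avoids is the second, intermediate copy.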

Regarding reduce, one of the reasons it was moved from a builtin function into functools is that it wasn't found to be superior to using a for loop. According to the Python 3 release notes:

"Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable."
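For comparison, here is a sketch of the course total written three ways, plus one possible reduce-based take on most_courses. The name most_courses_reduce is made up for this illustration, and it is only one of several ways to use reduce here.

```python
from functools import reduce

teachers = {
    'Andrew Chalkley': ['jQuery Basics', 'Node.js Basics'],
    'Kenneth Love': ['Python Basics', 'Python Collections'],
}

# reduce: the trailing 0 is the initial value, so an empty dict gives 0.
total_reduce = reduce(
    lambda total, course_list: total + len(course_list), teachers.values(), 0
)

# Explicit for loop with an accumulator.
total_loop = 0
for course_list in teachers.values():
    total_loop += len(course_list)

# Builtin sum() over a generator expression.
total_sum = sum(len(course_list) for course_list in teachers.values())


def most_courses_reduce(teacher_dict):
    """Hypothetical reduce-based most_courses; raises TypeError on an empty dict."""
    # Keep whichever (teacher, courses) pair has more courses.
    # On a tie the earlier pair is kept, unlike the max() of tuples approach.
    best = reduce(
        lambda a, b: b if len(b[1]) > len(a[1]) else a, teacher_dict.items()
    )
    return best[0]


print(total_reduce, total_loop, total_sum)  # 4 4 4
print(most_courses_reduce(teachers))        # Andrew Chalkley
```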

Yes, that's a great, readable solution. Thanks for explaining it so thoroughly in your comments too, Chris!

So, you wrote a for-loop comprehension (is that what it's called? i.e., a single-line for-loop?). Or is that the generator comprehension you're talking about?

And you put this single-line for-loop into the max() function? I didn't know max could return a tuple. How does max know to take the max on the number, and not the teacher's name? (Is the number index 0, and name index 1, respectively?)

The Python 3 release notes on functools.reduce() were really helpful! When Guido says:

"Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable."

Do you think he's referring to using a for-loop with a variable initialized outside the loop? e.g.:

sum = 0
for i in iterable:
    # pseudo-code
    sum += i

or...

pylist = []
for i in iterable:
    # pseudo-code
    pylist.append(i)

So many questions :) I appreciate your insight, Chris! Very very much!

Chris Freeman
Treehouse Moderator 68,441 Points

The "for-loop comprehension" is a generator expression, which looks like a list comprehension except that it is bounded by parens instead of square brackets. You may ask where the parens are in the max() statement. As a shortcut, note that "Generator expressions always have to be written inside parentheses, but the parentheses signalling a function call also count."

When I used "generator comprehension", I meant to say "generator expression". The previous comments and answers have been corrected.

# list comprehension, returns list of (int, name) tuples:
[(len(list_courses), teacher) for teacher, list_courses in dict.items()]
# generator, returns iterable of (int, name) tuples:
((len(list_courses), teacher) for teacher, list_courses in dict.items())
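The rule about function-call parentheses can be tried out with a small sketch. The course_counts dictionary is hypothetical sample data.

```python
# Hypothetical sample data: teacher name -> number of courses.
course_counts = {'Andrew Chalkley': 2, 'Kenneth Love': 2}

# Sole argument: the call's own parentheses double as the generator's.
total = sum(count for count in course_counts.values())

# With a second argument, the generator expression needs its own parentheses.
largest = max((count for count in course_counts.values()), default=0)

# default= is what max() returns when the iterable is empty.
largest_of_empty = max((count for count in {}.values()), default=0)

print(total, largest, largest_of_empty)  # 4 2 0
```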

The max() function expects an iterable. It compares each item in the iterable to the previous maximum. If the items being compared are themselves sequences, such as tuples, it compares the first element (in this case, the number of courses) of the current max and of the next item. If those are the same, it compares the next element (in this case, the teacher's name) of both. Since I specified the tuple as (len(list_courses), teacher), the lengths are compared first.

By using the generator expression as an argument, the max() function iterates over the generator's output and returns the maximum item, whether it is a simple object or something more complicated.

In the end, a single tuple is returned from the max() function. Since the second item is the desired target, indexing with [1] gives the teacher's name. The index [0] would have returned the maximum length.
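The element-by-element comparison described above can be checked directly. This is a small sketch using the sample dictionary from the challenge; the key= variant at the end is an alternative not used in the answer, shown only to contrast how ties are resolved.

```python
# Tuples compare element by element, left to right.
assert (3, 'Andrew Chalkley') > (2, 'Kenneth Love')  # 3 > 2 decides it
assert (2, 'Kenneth Love') > (2, 'Andrew Chalkley')  # counts tie, then 'K' > 'A'

teachers = {
    'Andrew Chalkley': ['jQuery Basics', 'Node.js Basics'],
    'Kenneth Love': ['Python Basics', 'Python Collections'],
}

# max() returns one whole tuple; unpack it instead of indexing with [0] and [1].
count, name = max(
    (len(course_list), teacher) for teacher, course_list in teachers.items()
)

# Alternative: a key function compares only the counts, so on a tie
# max() returns the first maximal item it encountered.
name_by_key = max(teachers, key=lambda teacher: len(teachers[teacher]))

print(count, name)   # 2 Kenneth Love
print(name_by_key)   # Andrew Chalkley
```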

I see. That explanation makes so much sense now! So in the event of a tie, as there is in this dictionary (both Kenneth and Andrew teach two courses), their names serve as a tie-breaker, inadvertently. "K" > "A", so Kenneth wins.

Thanks so much for clarifying, Chris!
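If the tie should be reported rather than broken by name, one option is a generator function along the lines of the yield idea from the original question. This is only a sketch: teachers_with_most_courses is a hypothetical helper, and the Treehouse challenge itself expects a single name, so it would not pass the challenge as written.

```python
def teachers_with_most_courses(teacher_dict):
    """Yield every teacher tied for the highest course count (hypothetical helper)."""
    if not teacher_dict:
        return  # nothing to yield for an empty dict
    # Compute the top count once, then yield each teacher who matches it.
    top = max(len(course_list) for course_list in teacher_dict.values())
    for teacher, course_list in teacher_dict.items():
        if len(course_list) == top:
            yield teacher


teachers = {
    'Andrew Chalkley': ['jQuery Basics', 'Node.js Basics'],
    'Kenneth Love': ['Python Basics', 'Python Collections'],
}

# The caller wraps the generator in list() to collect all tied names.
tied = list(teachers_with_most_courses(teachers))
print(tied)  # ['Andrew Chalkley', 'Kenneth Love']
```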

Hi Chris. I can't thank you enough for how much you've helped me. This generator expression conversation has led my curiosity to the next challenge-review I hit in tuples, and I quote what you've said. I just felt you deserved the credit. Again, thanks!