regex sets trailing comma

This one is challenging. I passed this challenge earlier this summer, but I can't pass it now.

SHOULD RETURN: "kenneth@teamtreehouse.com"

but it erroneously returns a trailing comma at the end: "kenneth@teamtreehouse.com,"

Thank you!

sets_email.py
import re

# Example:
# >>> find_emails("kenneth.love@teamtreehouse.com, @support, ryan@teamtreehouse.com, test+case@example.co.uk")
# ['kenneth.love@teamtreehouse.com', 'ryan@teamtreehouse.com', 'test+case@example.co.uk']

def find_emails(string):
    return re.findall(r'\w+\W?\w*@\w+.\w+.?\w*', string)

# Got       ['kenneth@teamtreehouse.com,', 'andrew+gotcha@teamtreehouse.com,', 'exa.mple@example.co.uk'], 
# expected  ['kenneth@teamtreehouse.com', 'andrew+gotcha@teamtreehouse.com', 'exa.mple@example.co.uk']. 

2 Answers

Chris Freeman
Treehouse Moderator 68,423 Points

Challenging question! The issue is in the part of the pattern that captures the optional second period and domain extension. In ".?\w*" the period is unescaped, so it acts as a wildcard: the expression matches any single optional character followed by zero or more word characters. This is fine in the third match, where that character happens to be a period, but it lets the comma through in the first two matches.

The fix is to make the period literal rather than a wildcard. Precede it with a backslash to match a literal period: "\.?\w*"
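
A minimal sketch of that fix, applied to the pattern from the question. The sample string is an assumption, reconstructed from the three addresses in the "Got / expected" output above:

```python
import re

def find_emails(string):
    # Same pattern as in the question, except the optional second period
    # is escaped (\.?), so it can only match a literal "." and never a comma.
    return re.findall(r'\w+\W?\w*@\w+.\w+\.?\w*', string)

# Assumed test input, rebuilt from the challenge output in the question.
sample = ("kenneth@teamtreehouse.com, andrew+gotcha@teamtreehouse.com, "
          "exa.mple@example.co.uk")

print(find_emails(sample))
# ['kenneth@teamtreehouse.com', 'andrew+gotcha@teamtreehouse.com', 'exa.mple@example.co.uk']
```

Note that the first period (between `\w+` and `\w+`) is still a wildcard here; escaping it as well would make the pattern stricter.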

If you’re looking for a more compact and readable solution, try using character sets to list groups of valid characters, such as:

  • [.+\w]+
  • [.\w]+
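
One way the two sets above might be combined is sketched here. Joining them with an "@" is an assumption about how they were meant to be used, not necessarily the intended challenge solution. Inside a set, "." and "+" are literal characters, so no backslash is needed:

```python
import re

def find_emails(string):
    # [.+\w]+  one or more periods, plus signs, or word characters (before the @)
    # [.\w]+   one or more periods or word characters (the domain)
    return re.findall(r'[.+\w]+@[.\w]+', string)

sample = ("kenneth.love@teamtreehouse.com, @support, "
          "ryan@teamtreehouse.com, test+case@example.co.uk")

print(find_emails(sample))
# ['kenneth.love@teamtreehouse.com', 'ryan@teamtreehouse.com', 'test+case@example.co.uk']
```

Because a comma is in neither set, the match stops before it. One caveat of this sketch: a period that ends a sentence right after an address would be captured as part of the domain.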

Thanks Chris!! So when you say:

the period means any optional character

does that mean anything at all, including letters, numbers, etc.?

Chris Freeman
Treehouse Moderator 68,423 Points

Yes, in regex, a “.” (period) means any character (except a newline, by default). So “.?” means any optional character. And “.*” means any number of any character. A literal period needs the backslash.
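
A small sketch of the difference, using made-up strings chosen only for illustration:

```python
import re

text = "a.b a,b axb"

# Unescaped period: matches any single character (except a newline by default).
wildcard = re.findall(r'a.b', text)

# Escaped period: matches only a literal ".".
literal = re.findall(r'a\.b', text)

# ".?" makes that "any character" optional, so a plain "ab" matches as well.
optional = re.findall(r'a.?b', "ab a.b a,b")

print(wildcard)  # ['a.b', 'a,b', 'axb']
print(literal)   # ['a.b']
print(optional)  # ['ab', 'a.b', 'a,b']
```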

Brendan Whiting
Front End Web Development Techdegree Graduate 84,735 Points

I got it to work by adding the 'word boundary' \b at the end of the regex.

def find_emails(string):
    return re.findall(r'\w+\W?\w*@\w+.\w+.?\w*\b', string)
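
A sketch of why the word boundary helps, comparing the pattern with and without it. The sample string is an assumption rebuilt from the challenge output in the question. With \b at the end, a match cannot finish between a comma and a space (neither is a word character), so the regex engine backtracks, leaves the ".?" empty, and ends the match right after "com":

```python
import re

# Assumed test input, rebuilt from the challenge output in the question.
sample = ("kenneth@teamtreehouse.com, andrew+gotcha@teamtreehouse.com, "
          "exa.mple@example.co.uk")

# Original pattern: the wildcard ".?" swallows the comma.
without_boundary = re.findall(r'\w+\W?\w*@\w+.\w+.?\w*', sample)

# With \b: the match must end at a word boundary, so the comma is given back.
with_boundary = re.findall(r'\w+\W?\w*@\w+.\w+.?\w*\b', sample)

print(without_boundary)
# ['kenneth@teamtreehouse.com,', 'andrew+gotcha@teamtreehouse.com,', 'exa.mple@example.co.uk']
print(with_boundary)
# ['kenneth@teamtreehouse.com', 'andrew+gotcha@teamtreehouse.com', 'exa.mple@example.co.uk']
```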