regex sets trailing comma
This is challenging. I passed this challenge earlier this summer, but I can't pass it now. My regex finds the emails, but it erroneously returns a trailing comma at the end of the first two matches:
import re

# Example:
# >>> find_email("firstname.lastname@example.org, @support, email@example.com, firstname.lastname@example.org")
# ['email@example.com', 'firstname.lastname@example.org', 'email@example.com']

def find_emails(string):
    return re.findall(r'\w+\W?\w*@\w+.\w+.?\w*', string)

# Got ['firstname.lastname@example.org,', 'email@example.com,', 'firstname.lastname@example.org'],
# expected ['email@example.com', 'firstname.lastname@example.org', 'email@example.com'].
Chris Freeman, Treehouse Moderator, 68,029 Points
Challenging question! The issue is in the part of the pattern that tries to capture the optional second period and domain extension. In ".?\w*" the period is not escaped, so it acts as a wildcard: ".?" optionally matches any single character, and "\w*" then matches zero or more word characters. This is fine in the third match, but it lets the comma be accepted in the first two matches.
The fix is to make the period literal rather than a wildcard. Precede it with a backslash to look for a literal period: "\.?\w*"
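As a minimal sketch of that fix, the snippet below runs the question's pattern and an escaped version against a made-up sample string. The addresses and the names SAMPLE, ORIGINAL, and ESCAPED are assumptions for illustration, not the challenge's own test data. The escaped version also escapes the first period, which the answer does not mention; that is an extra tightening assumed here.

```python
import re

# Hypothetical sample input; not the challenge's actual test string.
SAMPLE = "kenneth.love@example.com, @support, ryan@example.com, test+case@example.co.uk"

# Pattern from the question: both periods are unescaped wildcards,
# so ".?" can swallow the comma that follows an address.
ORIGINAL = r'\w+\W?\w*@\w+.\w+.?\w*'

# Same pattern with the periods escaped so they match only a literal ".".
ESCAPED = r'\w+\W?\w*@\w+\.\w+\.?\w*'

print(re.findall(ORIGINAL, SAMPLE))
# ['kenneth.love@example.com,', 'ryan@example.com,', 'test+case@example.co.uk']

print(re.findall(ESCAPED, SAMPLE))
# ['kenneth.love@example.com', 'ryan@example.com', 'test+case@example.co.uk']
```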
If you’re looking for a denser, more readable solution, try using character sets to list the groups of valid characters.
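The code that followed in the original answer is not preserved here, so the snippet below is only one possible sketch of the character-set idea, not the answerer's own solution. The sets chosen (word characters, period, plus, and hyphen before the @; word characters, period, and hyphen after it), the name find_emails_sets, and the sample string are all assumptions.

```python
import re

def find_emails_sets(string):
    # [\w.+-]+  one or more characters allowed before the @ (assumed set)
    # [\w.-]+   one or more characters allowed after the @ (assumed set)
    # A comma is in neither set, so a match stops before it.
    return re.findall(r'[\w.+-]+@[\w.-]+', string)

# Hypothetical sample input; not the challenge's actual test string.
sample = "kenneth.love@example.com, @support, ryan@example.com, test+case@example.co.uk"

print(find_emails_sets(sample))
# ['kenneth.love@example.com', 'ryan@example.com', 'test+case@example.co.uk']
```

Because the period is inside the second set, this sketch would also keep a period that directly follows an address at the end of a sentence, so the sets may need tightening for other inputs.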
Brendan Whiting, Front End Web Development Techdegree Graduate, 84,702 Points
I got it to work by adding the 'word boundary' \b at the end of the regex.

def find_emails(string):
    return re.findall(r'\w+\W?\w*@\w+.\w+.?\w*\b', string)
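A rough illustration of why the word boundary helps, and where it may fall short, using made-up inputs (the variables spaced and unspaced are assumptions). When a comma is followed by a space, there is no word boundary after the comma, so the regex backtracks and ends the match right after the last word character. When a word character directly follows the comma, the unescaped period can still match the comma.

```python
import re

def find_emails(string):
    # \b forces the match to end at a word boundary.
    return re.findall(r'\w+\W?\w*@\w+.\w+.?\w*\b', string)

# Hypothetical inputs for illustration only.
spaced = "kenneth.love@example.com, ryan@example.com"
unspaced = "kenneth.love@example.com,ryan@example.com"

print(find_emails(spaced))
# ['kenneth.love@example.com', 'ryan@example.com']

print(find_emails(unspaced))
# ['kenneth.love@example.com,ryan']
```

So \b is enough for comma-and-space separated input like the challenge's example, while escaping the period addresses the underlying wildcard.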