Python Regular Expressions in Python Introduction to Regular Expressions Email Groups

Bratamalya Das Gupta
253 Points

Can you tell me why my solution is incorrect?

I get this error: Bummer: Didn't get the right capture. Got "teamtreehouse.com, 555-555-5555, @kennethlove".

I think the regular expression can't differentiate the email group from the twitter handle group. But why?

emails.py
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''
contacts=re.search(r'[\w, ]* (?P<email>[-\w+.]+@[\w+.]+), (?P<phone>[0-9]{3}-[0-9]{3}-[0-9]{4})',string,re.M)
twitters=re.search(r'[-+\w\d,. ]* (?P<twitter>@[\w]+)$',string,re.M)
Frank Genova
Python Web Development Techdegree Student 11,941 Points

It has been a while since I did this challenge, but I still have my code. I explicitly included the ability to capture an email address that had digits in it. For example "jimmy1989@aol.com". You might think about the possibility of digits in your twitter regex as well. You could see if this helps it pass the challenge. Right now it isn't parsing the first email correctly so that is where you need to focus.

Also, I believe that \d is the same as [0-9] for ordinary ASCII digits.
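Here is a quick sketch to check both points. It assumes Python 3 and ordinary string (not bytes) patterns. The variable names are only illustrative, and the character classes are copied from the email part of the pattern in the question. It suggests that \w already covers digits, so an address like "jimmy1989@aol.com" matches without adding \d, and that \d is a little broader than [0-9] unless re.ASCII is set.

```python
import re

# Email character classes taken from the pattern in the question.
email_pattern = re.compile(r'[-\w+.]+@[\w+.]+')

# \w already includes digits, so no extra \d is needed inside the class.
digit_email = email_pattern.search('jimmy1989@aol.com')
print(digit_email.group())

# By default \d matches any Unicode decimal digit; [0-9] is ASCII only.
arabic_indic_three = '\u0663'  # ARABIC-INDIC DIGIT THREE
unicode_d = re.fullmatch(r'\d', arabic_indic_three)
ascii_range = re.fullmatch(r'[0-9]', arabic_indic_three)
ascii_flag_d = re.fullmatch(r'\d', arabic_indic_three, re.ASCII)

print(bool(unicode_d), bool(ascii_range), bool(ascii_flag_d))
```

For the data in this challenge the difference does not matter, since every digit is an ASCII digit.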

Finally, I find these websites helpful to test/manipulate regex to understand how they can be built up.

1 Answer

Alex Koumparos
Python Web Development Treehouse Moderator 30,355 Points

Hi Bratamalya,

The issue is the distinction between the match and the capture group. In your twitters case, your regex creates a match as follows:

>>> twitters=re.search(r'[-+\w\d,. ]* (?P<twitter>@[\w]+)$',string,re.M)
>>> twitters
<re.Match object; span=(33, 78), match='teamtreehouse.com, 555-555-5555, @kennethlove'>

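Observe that the match contains every character consumed by the leading character class [-+\w\d,. ]* (hyphens, plus signs, word characters, digits, commas, periods, and spaces) before the twitter handle. It starts at "teamtreehouse.com" rather than at the beginning of the line because @ is not in that class, so the match cannot reach back past the @ in the email address. Within the match, you have a capture group called twitter, which is the part of the match representing the twitter handle: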

>>> twitters.groupdict()
{'twitter': '@kennethlove'}
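To see the two side by side, here is a small sketch that runs the same pattern against only the first line of the sample data. The names line, whole_match, and handle are just illustrative.

```python
import re

# First line of the sample data from the question.
line = ('Love, Kenneth, kenneth+challenge@teamtreehouse.com, '
        '555-555-5555, @kennethlove')

twitters = re.search(r'[-+\w\d,. ]* (?P<twitter>@[\w]+)$', line, re.M)

whole_match = twitters.group(0)     # everything the pattern consumed
handle = twitters.group('twitter')  # only the named capture group

print(whole_match)  # teamtreehouse.com, 555-555-5555, @kennethlove
print(handle)       # @kennethlove
```

The challenge checks the whole match, not the named group, which is why the error message shows the longer string.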

Since the challenge wants you to return a match object where the whole match is the twitter handle, all you need to do is remove the parts of your match that aren't part of the twitter handle. Thus:

>>> just_twitter_match = re.search(r'(?P<twitter>@[\w]+)$',string,re.M)

And since the challenge isn't asking for a capture group at all, we can even omit that:

>>> just_twitter_match_no_group = re.search(r'@[\w]+$',string,re.M)
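As a rough check, the sketch below runs both trimmed patterns against the full sample string and prints the whole match for each. It also shows, as an aside, what happens if re.M is left out. The variable names are only illustrative.

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

with_group = re.search(r'(?P<twitter>@[\w]+)$', string, re.M)
without_group = re.search(r'@[\w]+$', string, re.M)

# In both cases the whole match is now just the handle.
print(with_group.group(0))     # @kennethlove
print(without_group.group(0))  # @kennethlove

# Without re.M, $ matches only at the end of the string (or just before a
# trailing newline), so the first hit is the handle on the last line.
no_multiline = re.search(r'@[\w]+$', string)
print(no_multiline.group(0))   # @joykesten
```

The $ anchor also keeps the pattern from matching the @ inside the email address, since "@teamtreehouse" is not followed by the end of a line.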

Hope that's clear.

Cheers,

Alex