Brendan Whiting
Front End Web Development Techdegree Graduate 84,735 Points

Help with debugging regular expressions

I'm following along with this video, and my code prints just an empty list with no errors. What am I doing wrong, and how do I debug this?

Here is my code:

import re

names_file = open("names.txt", encoding="utf-8")
data = names_file.read()
names_file.close()

print(re.findall(r'''
    ([-\w ]+,\s[-\w ]+)\t           # first and last names
    ([-\w\d.+]+@[-\w\d.]+)\t         # email
    (\(?\d{3}\)?-?\s?d{3}-\d{4})\t  # phone number
    ([\w\s]+,\s[\w\s]+)\t          # job, company
    (@[\w\d]+)                 # Twitter
''', data, re.X))

Here is the text of names.txt:

Love, Kenneth kenneth@teamtreehouse.com (555) 555-5555 Teacher, Treehouse @kennethlove
McFarland, Dave dave@teamtreehouse.com (555) 555-5554 Teacher, Treehouse
Arthur, King king_arthur@camelot.co.uk King, Camelot
Österberg, Sven-Erik governor@norrbotten.co.se Governor, Norrbotten @sverik
, Tim tim@killerrabbit.com Enchanter, Killer Rabbit Cave
Carson, Ryan ryan@teamtreehouse.com (555) 555-5543 CEO, Treehouse @ryancarson
Doctor, The doctor+companion@tardis.co.uk Time Lord, Gallifrey
Exampleson, Example me@example.com 555-555-5552 Example, Example Co. @example
Obama, Barack president.44@us.gov 555 555-5551 President, United States of America @potus44
Chalkley, Andrew andrew@teamtreehouse.com (555) 555-5553 Teacher, Treehouse @chalkers
Vader, Darth darth-vader@empire.gov (555) 555-4444 Sith Lord, Galactic Empire @darthvader
Fernández de la Vega Sanz, María Teresa mtfvs@spain.gov First Deputy Prime Minister, Spanish Govt.

2 Answers

Chris Freeman
Treehouse Moderator 68,423 Points

Hey Brendan,

I was able to debug this by commenting out every line of the regex except the first and fixing that line until I got all 12 output lines. I then uncommented the next line, fixed it, and so on. Here is my solution:

import re

names_file = open("names.txt", encoding="utf-8")
data = names_file.read()
names_file.close()

# original re
# print(re.findall(r'''
#     ([-\w ]+,\s[-\w ]+)\t           # first and last names
#     ([-\w\d.+]+@[-\w\d.]+)\t         # email
#     (\(?\d{3}\)?-?\s?d{3}-\d{4})\t  # phone number
#     ([\w\s]+,\s[\w\s]+)\t          # job, company
#     (@[\w\d]+)                 # Twitter
# ''', data, re.X))

results = re.findall(r'''
    # replace + with * to catch empty last_name
    ([-\w ]*,\s[-\w ]+)\t           # first and last names
    ([-\w\d.+]+@[-\w\d.]+)\t         # email
    # added missing \ in front of second digits field
    # added trailing ? to make phone field optional
    (\(?\d{3}\)?-?\s?\d{3}-\d{4})?\t  # phone number
    # added missing . in company name 
    ([\w\s]+,\s[\w\s.]+)\t?          # job, company
    # added ? to make last field optional
    (@[\w\d]+)?                 # Twitter
''', data, re.X | re.MULTILINE)
# added 're.MULTILINE' for the multiline input; note that it only changes how ^ and $ match,
# so it has no effect unless the pattern uses those anchors

# added index printing to track which records matched and which did not
for index, item in enumerate(results):
    print(index+1, item)
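
The comment-out-and-retest approach can also be scripted. Here is a minimal sketch, assuming a two-record sample modeled on names.txt (the tab layout is my assumption, since the tabs are not visible in the post) and a hypothetical helper, first_failing_field, that adds one field pattern at a time and reports the first one that makes findall come back empty:

```python
import re

# Assumed sample: two tab-separated records modeled on names.txt.
# The exact tab layout of the real file is an assumption.
sample = (
    "Love, Kenneth\tkenneth@teamtreehouse.com\t(555) 555-5555\t"
    "Teacher, Treehouse\t@kennethlove\n"
    "Arthur, King\tking_arthur@camelot.co.uk\t\tKing, Camelot\t\n"
)

# Field patterns from the original question, in order.
fields = [
    r"([-\w ]+,\s[-\w ]+)\t",            # first and last names
    r"([-\w\d.+]+@[-\w\d.]+)\t",         # email
    r"(\(?\d{3}\)?-?\s?d{3}-\d{4})\t",   # phone number (still has the d{3} typo)
    r"([\w\s]+,\s[\w\s]+)\t",            # job, company
    r"(@[\w\d]+)",                       # Twitter
]


def first_failing_field(fields, text):
    """Hypothetical helper: index of the first field that leaves no matches, or None."""
    pattern = ""
    for index, field in enumerate(fields):
        pattern += field  # grow the pattern one field at a time
        if not re.findall(pattern, text):
            return index
    return None


print(first_failing_field(fields, sample))
```

With the original patterns this prints 2, which points at the phone field, where d{3} is missing its backslash. The helper only finds the first field that kills every match, so records with an empty phone or Twitter field still need checking against the full file.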

Josh Keenan
19,652 Points

Try adding re.M to the flags at the end of the findall call, and post again if it doesn't help!
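
For anyone unsure what re.M changes, here is a small sketch on made-up text (the sample string and variable names are mine, not from names.txt). As far as I know, the flag only affects where ^ and $ can match, so a pattern without those anchors behaves the same with or without it:

```python
import re

# Made-up three-line sample, not taken from names.txt.
text = "alpha one\nbeta two\ngamma three"

# Without re.M, ^ matches only at the very start of the string.
start_only = re.findall(r"^\w+", text)

# With re.M, ^ also matches right after each newline.
each_line = re.findall(r"^\w+", text, re.M)

# A pattern with no ^ or $ gives the same result either way.
same = re.findall(r"\w+ \w+", text) == re.findall(r"\w+ \w+", text, re.M)

print(start_only)  # ['alpha']
print(each_line)   # ['alpha', 'beta', 'gamma']
print(same)        # True
```

So re.M on its own will not turn an empty result into matches; it pays off once the pattern is anchored with ^ at the start and $ at the end of each record.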