Flore W
4,731 Points

Negation Regex: I need help with \b

Here is a reminder of Kenneth's code:

import re
names_file = open('names.txt', encoding = 'utf-8')
data = names_file.read()
names_file.close()

print(re.findall(r'''
    \b@[-\w\d.]*
    [^gov\t]+
    \b
''', data, re.X|re.I))

This returns the part of each email address from the @ onward, excluding "gov". A few examples:

['@teamtreehouse.com', '@camelot.co.uk', '@spain.']

I have tried taking out the two '\b' and it gives the result:

['@kennethlove\nMcFarland, ', '@potus44\nChalkey, Andrew']

I don't understand why taking out the \b means I get matches that include '\n', ',' and ' '. None of those are in my @[-\w\d.]* set, so why do they appear?

1 Answer

Saikat Chowdhury
3,116 Points

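Hi Flore, \b matches a word boundary: the zero-width position between a word character (\w) and a non-word character, or the start or end of the string next to a word character. It only checks where a match may begin or end and does not consume any characters itself.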
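Here is a minimal sketch of that boundary check. The sample string is made up for illustration (it is not taken from names.txt) and the pattern is a simplified one, not Kenneth's:

```python
import re

# Hypothetical sample: an email address, a tab, then a Twitter handle.
sample = "kenneth@teamtreehouse.com\t@kennethlove"

# In the email, "h" (a word character) sits right before "@", so \b matches there.
# Before the handle there is a tab, and "@" is also a non-word character,
# so there is no word boundary and the handle is skipped.
with_boundary = re.findall(r'\b@[\w.]+', sample)
without_boundary = re.findall(r'@[\w.]+', sample)

print(with_boundary)     # ['@teamtreehouse.com']
print(without_boundary)  # ['@teamtreehouse.com', '@kennethlove']
```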

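My thinking is that when you remove the two \b from your pattern, it also starts matching the Twitter handles, which begin with @. With the leading \b in place, the @ must come directly after a word character, as it does in an email address, while a Twitter handle evidently follows a non-word character such as whitespace, so it is skipped. Once a handle matches, [^gov\t]+ takes over. It is a negated set, so it matches any character other than g, o, v or a tab, and that includes '\n', ',' and ' '. Because of the + sign it keeps going onto the next line, which is why you get results such as ['@kennethlove\nMcFarland, ', '@potus44\nChalkey, Andrew'].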
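To see this on a small scale, here is a sketch that runs Kenneth's pattern with both \b removed. The two records and their tab-separated layout are an assumption made for the example, not the real names.txt:

```python
import re

# Hypothetical two-record sample (assumed layout, not the real names.txt).
data = (
    "Love, Kenneth\tkenneth@teamtreehouse.com\t@kennethlove\n"
    "Chalkey, Andrew\tandrew@teamtreehouse.com\t@chalkers"
)

# Kenneth's pattern with both \b removed.
pattern = r'''
    @[-\w\d.]*
    [^gov\t]+
'''
matches = re.findall(pattern, data, re.X | re.I)
print(matches)
# ['@teamtreehouse.com', '@kennethlove\nChalkey, Andrew',
#  '@teamtreehouse.com', '@chalkers']

# The negated set accepts newline, comma and space, but rejects
# g, o, v (in either case, because of re.I) and tab.
negated = re.compile(r'[^gov\t]', re.I)
accepted = [ch for ch in '\n, gOv\t' if negated.fullmatch(ch)]
print(accepted)  # ['\n', ',', ' ']
```

The second match runs past the handle and only stops at the tab after "Andrew", because the tab is the first character the negated set rejects.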

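If you remove the + sign, so that the set reads [^gov\t], it matches exactly one character. The match can then extend at most one character past the handle (possibly the newline itself) and cannot run into the text of the next line, even if you also remove \b from your pattern.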
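A quick way to check this is the sketch below, again on made-up sample data in an assumed tab-separated layout rather than the real names.txt:

```python
import re

# Hypothetical two-record sample (assumed layout, not the real names.txt).
data = (
    "Love, Kenneth\tkenneth@teamtreehouse.com\t@kennethlove\n"
    "Chalkey, Andrew\tandrew@teamtreehouse.com\t@chalkers"
)

# No \b and no + after the negated set: it now matches exactly one character.
single_char = re.findall(r'@[-\w\d.]*[^gov\t]', data, re.I)
print(single_char)
# ['@teamtreehouse.com', '@kennethlove\n', '@teamtreehouse.com', '@chalkers']
```

The handle match still picks up the single '\n' that follows it, but it no longer reaches the name on the next line.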

Regards, Saikat