
Mary Yang
1,769 Points

Python Regular Expressions - Display all numbers except 567

Hi, I've been searching the internet, checking the Python documentation, and watching the video and I can't figure out how to do this. I've tried several times, but I can't get it to work.

How do I display all the numbers in the string except for 567?

negate.py
import re

string = '1234567890'

def good_numbers():
  return re.findall(r'[\d]*[^567]', string)

10 Answers

Kenneth Love
STAFF
Treehouse Guest Teacher

Remember, the challenge wants you to make sure you don't have 5, 6, or 7 in the match. Don't worry about anything else, just keep those items out.

Mary Yang
1,769 Points

Kenneth Love I understand the question, I just don't know how to keep those out and return everything else. When I try I get the last few digits or I get none. I have yet to find an answer anywhere. Can you provide more insight please? I'm sure I could do it using a loop, but that would defeat the point of using regex, right?

Kenneth Love
Treehouse Guest Teacher

I can almost guarantee you're overthinking this. I know I did when I first wrote it. All you need to do is exclude those three characters. How do you exclude characters in regex? No loops required. Your pattern will be like 6 characters long.

Mary Yang
1,769 Points

[^567] returns only 1-4, not '123489'. I've tried about 1000 variations of that: adding groups; adding stuff like \d, \w, or \d. to the front, to the back, or in its own group. Combinations like [^5-7]\d don't work, and neither does stuff like \d.[^567]\d+. I've tried every combination of +.*? I can think of. I'm at a loss. I'm doing another tutorial on regex right now hoping to find a solution, but it doesn't cover how to exclude stuff in the middle of a string, only how to exclude stuff at the beginning or end. I'm about to give up and just ask on Stack Overflow. :(

edit: I know I can define two variables, one to grab 1-4, and another to grab 8-9 and then join the two strings together to get the final string, but I'm almost positive that's not the answer you're looking for. It seems too convoluted based on the videos.

edit again: OK, I think it has something to do with non-capturing groups? I don't believe we covered this in the videos. Based on what I've read, \d+(?:567)? should work, right? It's still printing out 567 though.
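A side note on that last attempt, offered as a sketch rather than an official answer: a non-capturing group (?:...) only changes what is reported as a group. It does not remove text from the match. Because \d+ is greedy, it has already consumed every digit before the optional group is tried. The variable name attempt is just for illustration.

```python
import re

string = '1234567890'

# \d+ is greedy, so it consumes all ten digits; the optional
# non-capturing group (?:567)? then matches nothing at all.
attempt = re.findall(r'\d+(?:567)?', string)
print(attempt)  # ['1234567890']

# A negated set tests one character at a time instead.
good_numbers = re.findall(r'[^567]', string)
print(good_numbers)  # ['1', '2', '3', '4', '8', '9', '0']
```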

Kenneth Love
Treehouse Guest Teacher

Can you show me exactly what you're trying? I just passed it with this:

good_numbers = re.findall(r'[^567]', string)

and with:

good_numbers = re.findall(r'[^5-7]', string)

Mary Yang
1,769 Points

Mrglglrglglglglglglgl!!!!!!

I was using match and search! /wrists.

I had passed the test back when it accepted my incorrect answer. I've since been trying to get the right answer in workspaces in a tests.py file I made. Somewhere along the way I guess I changed findall to search and then match. OMG these things!

Thank you so much for your help. I was about to throw my computer into the oven and set it to broil.
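For anyone who lands here with the same symptom, this is a minimal sketch of how the three functions differ on the same string. The [^567]+ line is only a guess at the kind of pattern that produces "only 1-4"; the variable names are illustrative.

```python
import re

string = '1234567890'
pattern = r'[^567]'

# match() only tries at the very start of the string and returns one match object.
first_by_match = re.match(pattern, string).group()

# search() scans forward, but still stops at the first hit.
first_by_search = re.search(pattern, string).group()

# With a + quantifier, search() returns the first run of allowed characters.
first_run = re.search(r'[^567]+', string).group()

# findall() returns every non-overlapping match as a list of strings.
good_numbers = re.findall(pattern, string)

print(first_by_match)   # 1
print(first_by_search)  # 1
print(first_run)        # 1234
print(good_numbers)     # ['1', '2', '3', '4', '8', '9', '0']
```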

Kenneth Love
Treehouse Guest Teacher

Absolutely nothing in the history of programming is responsible for more broiled computers than regular expressions.

table flip

Challenge Task 1 of 1, correct answer:

Create a variable named good_numbers that is an re.findall() where the pattern matches anything in string except the numbers 5, 6, and 7.

import re

string = '1234567890'

good_numbers = re.findall(r'[^567]', string)


Vittorio Somaschini
33,371 Points

Hello Mary!

First thing I have noticed is that you don't have to set up a function here as the challenge only requests a variable. So the last line would be:

good_numbers = {some code here}

The code that goes there has to find all the numbers in the string (d[,9]), apart from 567 and you had this right.

So, final code would be:

good_numbers = re.findall(r'\d[,9][^567]', string)

Hope that helps, if unclear please ask

Vittorio

Mary Yang
1,769 Points

Hi Vittorio,

Thanks for taking the time to look into this. You're correct. The only issue was that I was creating a function instead of a simple variable. I just input it and it worked. Crazy how after a while I can miss the most basic things like this.

Thanks again for your help!

Dan A
Courses Plus Student 4,036 Points

The match object for your pattern is ['890']. This answer, however, is accepted by the CC. The correct answer should result in ['1234890'], right?

Mary Yang
1,769 Points

Hi Dan A, yes it should be ['1234890']. If my code is outputting the wrong list, how come it was accepted? How should the correct code be written? I'm really struggling with some of these regex challenges.

Kenneth Love
Treehouse Guest Teacher

Thanks for finding that Dan A (and Mary Yang). I've fixed the CC so it won't take ['890'] as being valid.
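For readers wondering where ['890'] comes from, here is a hedged sketch. In the pattern \d[,9][^567], the [,9] part is a character set meaning "a comma or a 9", not a repeat count. Note also that findall() with a plain negated set returns a list of single characters rather than ['1234890']; joining them (an extra step the challenge does not ask for) gives that string.

```python
import re

string = '1234567890'

# The pattern needs: a digit, then ',' or '9', then a character that is
# not 5, 6, or 7. Only '890' fits that shape.
too_narrow = re.findall(r'\d[,9][^567]', string)
print(too_narrow)  # ['890']

# The negated set alone yields one single-character match per allowed digit.
good_numbers = re.findall(r'[^567]', string)
print(good_numbers)           # ['1', '2', '3', '4', '8', '9', '0']
print(''.join(good_numbers))  # 1234890
```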

Mary Yang
1,769 Points

Kenneth Love, can you provide any feedback on how to get the correct answer? I've been combing over the Python documentation, checking Stack Overflow, and have re-created the code about 1000 times in a workspace file I made for test code.

I'm having a very difficult time with the regex stuff. I understand it perfectly when you are talking about it, but when I try to make groups, I seem to always be grabbing all the text, none of the text, or the text isn't split into the proper groups.

This has by far been the hardest part of the Python track for me. I feel like I understand the material until I try to implement it for the tests and I bomb. I haven't passed three of the last four coding challenges.

Dan A
Courses Plus Student 4,036 Points

Mary, I still can't figure this one out as well!

Vittorio Somaschini
33,371 Points

ahahah

Kevin, could you please tell me what happens precisely if I pass in this?

good_numbers = re.findall(r'\d*[^567]', string)

Because it was my first try for this exercise and good numbers comes out like this ['1234567890'], so it gives back all the numbers. I would like to know precisely what the compiler does.

TY

Vittorio

Kenneth Love
Treehouse Guest Teacher

I'll guess that by "Kevin" you mean "Kenneth". :D

So what does r'\d*[^567]' match? First, \d* matches zero or more digits, as many as it can, so it grabs the entire string. Then the engine backs off one character so that [^567] has something to match, and the final 0 qualifies because it isn't 5, 6, or 7. That's why the whole string comes back as a single match. You really have to be careful with *.
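In more detail, here is a small sketch of how the two parts of that pattern divide up the string. The capturing groups are added only for illustration and are not part of the original pattern.

```python
import re

string = '1234567890'

# Capturing groups show which part of the string each piece matched:
# \d* takes as many digits as it can, then gives back one character
# so that [^567] can match the trailing '0'.
m = re.match(r'(\d*)([^567])', string)
print(m.groups())  # ('123456789', '0')

# Without groups, findall reports the whole match as a single item.
whole = re.findall(r'\d*[^567]', string)
print(whole)  # ['1234567890']
```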

Vittorio Somaschini
33,371 Points

OOps so sorry about that...

It looks like someone needs a break!!

ty Kenneth.

BTW that gif is awesome!!

Since this is the only forum thread that deals with the excluded (or negated) numbers 567 challenge, I just wanted to note the exact wording of the challenge (for the spidering/crawling of search engines):

"Create a variable named good_numbers that is an re.findall() where the pattern matches anything in string except the numbers 5, 6, and 7."

..as well as the link to the challenge:

http://teamtreehouse.com/library/regular-expressions-in-python/introduction-to-regular-expressions/negated-numbers

//*******************************************************

Note: I didn't find anything in the address_book.py zipped file (which you can download for this course) that could serve as a good pattern example of using the exclude character (the caret ^) by itself in an re.findall() context to exclude a set (or subset) of numbers (or numbers as a set of string characters):

import re

names_file = open("names.txt", encoding="utf-8")
data = names_file.read()
names_file.close()

#print(re.match(r'Love', data))
#print(re.search(r'Kenneth', data))
#print(re.findall(r'\(?\d{3}\)?-?\s?\d{3}-\d{4}', data))
#print(re.findall(r'\w*, \w+', data))
#print(re.findall(r'[-\w\d+.]+@[-\w\d.]+', data))
#print(re.findall(r'\b[trehous]{9}\b', data, re.I))
#print(re.findall(r'''
#    \b@[-\w\d.]*  # First a word boundary, an @, and then any number of characters
#    [^gov\t]+  # Ignore 1+ instances of the letters 'g', 'o', or 'v' and a tab.
#    \b  # Match another word boundary
#''', data, re.VERBOSE|re.I))
#print(re.findall(r"""
#    \b[-\w]+,  # Find a word boundary, 1+ hyphens or characters, and a comma
#    \s  # Find 1 whitespace
#    [-\w ]+  # 1+ hyphens and characters and explicit spaces
#    [^\t\n]  # Ignore tabs and newlines
#""", data, re.X))
line = re.compile(r'''
    ^(?P<name>(?P<last>[-\w ]*),\s(?P<first>[-\w ]+))\t  # Last and first names
    (?P<email>[-\w\d.+]+@[-\w\d.]+)\t  # Email
    (?P<phone>\(?\d{3}\)?-?\s?\d{3}-\d{4})?\t  # Phone
    (?P<job>[\w\s]+,\s[\w\s.]+)\t?  # Job and company
    (?P<twitter>@[\w\d]+)?$  # Twitter
''', re.X|re.M)

#print(line.search(data).groupdict())
for match in line.finditer(data):
    print('{first} {last} <{email}>'.format(**match.groupdict()))

Watching the video many many times wasn't very helpful either.

Personally I would think about adding a 'hint' to the challenge question about needing to use a ^ symbol in the answer (just for the 'slow to catch on' people like me).

Kenneth Love
Treehouse Guest Teacher

The challenge is directly after the video where we talk about nothing new other than creating a negated set. Sure, I can add a ^ hint, but that's basically handing you the answer.

As for the example code, line 15 of your above snippet has a negated set. The very one, in fact, that we used in the video just prior to this CC to stop catching the letters 'g', 'o', and 'v' in an email address' domain.
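As a stand-alone illustration of the same idea, here is a minimal sketch of a negated set inside a verbose pattern. The sample text is hypothetical, since the course's names.txt is not reproduced in this thread.

```python
import re

# Hypothetical sample text, not taken from the course files.
text = 'tree house'

pattern = re.compile(r'''
    [^aeiou\s]   # one character that is NOT a vowel and NOT whitespace
''', re.VERBOSE)

consonants = pattern.findall(text)
print(consonants)  # ['t', 'r', 'h', 's']
```

The leading ^ inside the brackets flips the set, so the same approach works for digits, letters, or any other characters you want to keep out.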

Philip Ondrejack
4,287 Points

I'm curious as to why this didn't work, and many variations of it...

good_numbers = re.findall(r'\d[^567]', string)

Everything that kept outputting had - what I want to call a - "ghost" 7 in it. No matter what everything kept sliding down one or rearranging. I could get 5,6 to disappear easily but not the 7.

@KennethLove, your code, good_numbers = re.findall(r'[^567]', string), did work. But why not with \d?

Kenneth Love
Treehouse Guest Teacher

r'\d[^567]' says "find any digit, and then anything that isn't 5, 6, or 7." That first \d is what'll catch random ghost 7s.

Would you mind explaining why that is?

>>> string = '1234567890'
>>> print(re.findall(r'[ˆ567]', string))
['5', '6', '7']
>>> print(re.findall(r'\d[ˆ567]', string))
['45', '67']

Kenneth Love, still don't get why I get these answers, they don't make much sense when compared to what we learned on the videos. Any hints? Thanks!

Kenneth Love
Treehouse Guest Teacher

I don't get those results.

>>> string
'1234567890'
>>> re.findall(r'[^567]', string)
['1', '2', '3', '4', '8', '9', '0']
>>> re.findall(r'\d[^567]', string)
['12', '34', '78', '90']
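
One possible explanation for the differing results, assuming the character inside the brackets in the earlier post really is the lookalike it appears to be here: ˆ (U+02C6, MODIFIER LETTER CIRCUMFLEX ACCENT) is not the same character as ^ (U+005E). Only the latter negates a set. The sketch below builds both patterns from explicit code points so the difference is visible.

```python
import re

string = '1234567890'

caret = '^'           # U+005E, the character regex treats as "negate this set"
lookalike = '\u02c6'  # U+02C6, an ordinary character that merely looks similar

print(hex(ord(caret)), hex(ord(lookalike)))  # 0x5e 0x2c6

# With the real caret, the set is negated.
negated = re.findall('[' + caret + '567]', string)
print(negated)  # ['1', '2', '3', '4', '8', '9', '0']

# With the lookalike, the set simply means "U+02C6, 5, 6, or 7".
plain = re.findall('[' + lookalike + '567]', string)
print(plain)  # ['5', '6', '7']

pairs = re.findall(r'\d[' + lookalike + '567]', string)
print(pairs)  # ['45', '67']
```

Retyping the caret on a plain keyboard layout (rather than pasting it) is usually enough to rule this out.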
Wei Shih
4,247 Points

Hi Kenneth Love, could you explain why I get this result?

>>> string = '1234567890'
>>> print(re.findall(r'\d{1}', string))
['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
>>> print(re.findall(r'\d{1}[^567]', string))
['12', '34', '78', '90']

I expected the second statement to give me the correct answer, since it should remove '5', '6', and '7' from the first statement's result. Thanks!

Kenneth Love
Treehouse Guest Teacher

Regular expressions are very literal. Look at your pattern \d{1}[^567]. That pattern says there will be two elements: a number that is one character long, and another element that's not 5, 6, or 7.

That's why you get '78' in your matches. 7 is definitely a single digit.
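To see the two-character matches position by position, here is a short sketch using re.finditer(). The pairs name is just for illustration.

```python
import re

string = '1234567890'

# Each match consumes two characters: one digit, then one character
# that is not 5, 6, or 7.
pairs = [(m.group(), m.span()) for m in re.finditer(r'\d[^567]', string)]
print(pairs)
# [('12', (0, 2)), ('34', (2, 4)), ('78', (6, 8)), ('90', (8, 10))]
```

A match cannot start at 4, 5, or 6, because the character right after each of them is 5, 6, or 7. It can start at 7, since \d accepts any digit and the following 8 passes the negated set.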

Wei Shih
4,247 Points

Oh, I see!

I overlooked that [^567] still matches another character. Thanks Kenneth!

My answer: good_numbers = re.findall(r'[0123489]', string)
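
A brief comparison, as a sketch: on a string made only of digits, listing the allowed digits gives the same result as the negated set. The two differ once other characters appear, because [^567] accepts anything that is not 5, 6, or 7. The mixed string below is a hypothetical input, not part of the challenge.

```python
import re

digits_only = '1234567890'
mixed = 'a1b5'  # hypothetical input containing non-digits

allow_list = re.findall(r'[0123489]', digits_only)
negated = re.findall(r'[^567]', digits_only)
print(allow_list == negated)  # True

negated_mixed = re.findall(r'[^567]', mixed)
allow_mixed = re.findall(r'[0123489]', mixed)
print(negated_mixed)  # ['a', '1', 'b']
print(allow_mixed)    # ['1']
```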