Regex Confusion, Good Numbers

Question

I was struggling with the "Negated Numbers" code challenge. Spoiler alert: I solved the problem. But on my first attempt, it seemed like I should be using the following code:

import re

string = '1234567890'

def good_numbers(s):
    return re.findall(r'\d[^567]', s)    # intended: all the digits except 5, 6, or 7

print(good_numbers(string))

...but when I tested it in the workspace, I kept getting weird results: ['12', '34', '78', '90']

It got even weirder when I tried omitting just a couple of numbers. [^56] for example yielded: ['12', '34', '67', '89']

If I take the [^567] out altogether, it makes a list of the individual numbers in the string, which is what I'd expect: ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']

I got the correct answer. I understand that I need to >>>SPOILER<<< omit the \d altogether. But I am at a loss to understand the behavior I described above. Can anyone clear that up for me? What am I asking Python to do here?

Answer 1 · 2020-07-06T16:47:16Z

I have a few theories but it's tricky for me to test without a link to the code challenge.

The different results suggest to me that it's a matter of inclusivity and exclusivity, that is, where regex draws the line as to which of the digits in the range to negate. But it's been a while since I did anything with regex.

As I understand it, removing the \d operator and using only the square brackets omits just the numbers inside them, rather than, say, if you used something like [5-7].
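For what it's worth, here is a minimal sketch of what seems to be going on, using the same string as in the question. The likely explanation is that \d and [^567] are two separate tokens, each consuming one character, so every match is two characters long. The variable names pairs and singles are just for illustration.

```python
import re

string = '1234567890'

# \d[^567] is two tokens: one digit, followed by one character that is
# not 5, 6, or 7. Each match is two characters long, and findall resumes
# scanning right after the end of the previous match.
pairs = re.findall(r'\d[^567]', string)
print(pairs)    # ['12', '34', '78', '90']

# [^567] on its own is a single token: any one character that is
# not 5, 6, or 7. Each match is one character long.
singles = re.findall(r'[^567]', string)
print(singles)  # ['1', '2', '3', '4', '8', '9', '0']
```

In the first pattern, '56' and '67' fail because the second character is in the excluded set, so the scan moves on until '78' fits. That would explain why 5 and 6 vanish while 7 survives.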

Answer 2 · 2021-06-25T20:14:49Z

OK, so a year after this was posted, I'm struggling with the same thing. I was able to solve the challenge with

good_numbers = re.findall(r'\s*[^567]', string)

However, I have no idea why \s would work instead of \d. If \d tells re.findall to find all the digits between 0 and 9, with the exception of [^567], why did it keep finding the entire string 1234567890?
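A possible explanation, sketched below: \s* means "zero or more whitespace characters", and since the challenge string contains no whitespace, it always matches nothing. The pattern then behaves like [^567] alone. The second input, '1 5 8', is a made-up string (not from the challenge) that shows the difference once spaces are present.

```python
import re

string = '1234567890'

# \s* matches zero whitespace characters here, so it contributes nothing.
with_s = re.findall(r'\s*[^567]', string)
without_s = re.findall(r'[^567]', string)
print(with_s)               # ['1', '2', '3', '4', '8', '9', '0']
print(with_s == without_s)  # True

# Hypothetical input with spaces: [^567] matches any character that is
# not 5, 6, or 7, including a space, so whitespace shows up in the results.
spaced = re.findall(r'\s*[^567]', '1 5 8')
print(spaced)               # ['1', ' ', ' 8']
```

So \s is not really doing the work of \d here. The negated class is doing everything, and the \s* could be dropped.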

Using re.findall(r'[\d][^890]', string), for example, I was able to ignore the latter part of the string, but never the middle part of it. I also tried r'[\d][^890][\d]*', but it would find 1234567890 again.

Can anyone explain what is going on here?
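Both of those results can be reproduced with the sketch below, which assumes the same string as in the original question. The last pattern, [0-489], is only one illustrative way to ask for "a digit that is itself not 5, 6, or 7" and is not necessarily what the challenge expects.

```python
import re

string = '1234567890'

# [\d] is the same as \d, so this is again "a digit, then one character
# that is not 8, 9, or 0". Matches come in pairs until '78' fails.
two_tokens = re.findall(r'[\d][^890]', string)
print(two_tokens)  # ['12', '34', '56']

# Here '1' satisfies [\d], '2' satisfies [^890], and the greedy [\d]*
# then swallows every remaining digit, including 8, 9, and 0.
greedy = re.findall(r'[\d][^890][\d]*', string)
print(greedy)      # ['1234567890']

# One token that is both "a digit" and "not 5, 6, or 7":
# the class [0-489] covers 0 through 4, plus 8 and 9.
single = re.findall(r'[0-489]', string)
print(single)      # ['1', '2', '3', '4', '8', '9', '0']
```

The exclusion only applies to the one character position where the negated class sits, which is why it never filters the digits matched by the other tokens.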

Alex Golden

Jonathan Grieve

Gustavo Guzmán
