Jon C
6,512 Points

Clarification on using ^ and $.

Unsure how and when to use ^ and $.

Example:

import re

string = '''Love, Kenneth: 20
Chalkley, Andrew: 25
McFarland, Dave: 10
Kesten, Joy: 22
Stewart Pinchback, Pinckney Benton: 18'''

#  This is not accepted.
players = re.search(r'''
    (?P<last_name>[\w]+),?\s
    (?P<first_name>[\w]+)[,:]?\s
    (?P<score>[\d]+)?    #  $ NOT USED
    ''',string, re.X | re.M)

# This is accepted.
players = re.search(r'''
    (?P<last_name>[\w]+),?\s
    (?P<first_name>[\w]+)[,:]?\s
    (?P<score>[\d]+)?$    #  $ USED
    ''',string, re.X | re.M)

Why is $ needed but not ^?

2 Answers

Chris Freeman
Treehouse Moderator 68,423 Points

You may have discovered a bug in the challenge checker (I'm checking with Kenneth). Meanwhile....

A search stops and returns a match as soon as the pattern matches. A caret (^) or dollar sign ($) signifies that the matching text must begin at the start of a line or finish at the end of one (or both); with re.M these anchors apply to every line, not just to the whole string. Note that, by default, pattern matching is "greedy": it consumes as many characters as it can while still matching. This is why "+" for "one or more" gets the whole word or the whole numeric group. Also at play in this pattern is the question mark (?), which signifies "zero or one" of the previous character or group.
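As a quick illustration of the anchors, greediness, and "?", here is a minimal sketch. The sample text is made up for this example and is not the challenge data.

```python
import re

text = "alpha 1\nbeta 22\ngamma"  # made-up sample, not the challenge data

# Without re.M, ^ anchors only at the very start of the string.
starts_default = re.findall(r"^\w+", text)
# With re.M, ^ and $ anchor at the start and end of every line.
starts_multiline = re.findall(r"^\w+", text, re.M)
ends_multiline = re.findall(r"\w+$", text, re.M)

# "+" is greedy: it takes the whole run of digits.
greedy_digits = re.search(r"\d+", "beta 22").group()
# "?" means zero or one, so the digit group may be absent entirely.
optional_digits = re.search(r"beta\s?(\d+)?", "beta").groups()

print(starts_default)     # ['alpha']
print(starts_multiline)   # ['alpha', 'beta', 'gamma']
print(ends_multiline)     # ['1', '22', 'gamma']
print(greedy_digits)      # 22
print(optional_digits)    # (None,)
```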

Ending the pattern with "?" says that the score field is optional. The other question marks make the comma between last_name and first_name optional, and the comma or colon between first_name and the score field optional. Thus the "NOT USED" pattern

import re

string = '''Love, Kenneth: 20
Chalkley, Andrew: 25
McFarland, Dave: 10
Kesten, Joy: 22
Stewart Pinchback, Pinckney Benton: 18'''

players = re.search(r'''
    (?P<last_name>[\w]+),?\s
    (?P<first_name>[\w]+)[,:]?\s
    (?P<score>[\d]+)?    #  $ NOT USED
    ''',string, re.X | re.M)

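returns the first line from re.search, but it can also match the false positive

('Stewart', 'Pinchback', '') with an empty score field

on the last line, which shows up when the pattern is run over the whole string (for example, with re.findall).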
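One way to see this is to compile the "NOT USED" pattern and compare re.search with re.findall. This is only a sketch: using re.compile and findall is an assumption made for illustration, and the challenge checker may run the pattern differently.

```python
import re

string = '''Love, Kenneth: 20
Chalkley, Andrew: 25
McFarland, Dave: 10
Kesten, Joy: 22
Stewart Pinchback, Pinckney Benton: 18'''

# Same pattern as the "NOT USED" version: no trailing "$".
no_dollar = re.compile(r'''
    (?P<last_name>[\w]+),?\s
    (?P<first_name>[\w]+)[,:]?\s
    (?P<score>[\d]+)?
    ''', re.X | re.M)

# search() stops at the first match, which is the first line.
first = no_dollar.search(string).groups()
print(first)  # ('Love', 'Kenneth', '20')

# findall() keeps going and exposes the bad match on the last line.
rows = no_dollar.findall(string)
for row in rows:
    print(row)
# The fifth row is ('Stewart', 'Pinchback', '') with an empty score.
```

The optional score lets the match end right after "Pinchback, ", so the rest of the last line is then matched separately as ('Pinckney', 'Benton', '18').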

Adding the end-of-line anchor "$" makes the EOL required, which forces the score field to be used in order to reach the end of the line. Note that removing the last "?" also makes the score field required.

Possible bug: Using the EOL anchor passes the challenge, but it still allows a false positive result on the last line:

('Pinckney', 'Benton', '18')

I'm checking with Kenneth on that.
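The effect of "$", and of making the score required instead, can be checked with a short sketch like the one below. The variable names and the use of findall are assumptions for illustration only.

```python
import re

string = '''Love, Kenneth: 20
Chalkley, Andrew: 25
McFarland, Dave: 10
Kesten, Joy: 22
Stewart Pinchback, Pinckney Benton: 18'''

# "$ USED" version: optional score, but the match must end at EOL.
with_dollar = re.compile(r'''
    (?P<last_name>[\w]+),?\s
    (?P<first_name>[\w]+)[,:]?\s
    (?P<score>[\d]+)?$
    ''', re.X | re.M)

# Alternative: no "$", but the score is required (last "?" removed).
required_score = re.compile(r'''
    (?P<last_name>[\w]+),?\s
    (?P<first_name>[\w]+)[,:]?\s
    (?P<score>[\d]+)
    ''', re.X | re.M)

dollar_rows = with_dollar.findall(string)
required_rows = required_score.findall(string)

print(dollar_rows[-1])               # ('Pinckney', 'Benton', '18')
print(dollar_rows == required_rows)  # True
```

Both variants drop the empty-score match, but neither captures "Stewart Pinchback" or "Pinckney Benton" as full names, because [\w]+ cannot cross a space.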


Using many optional elements can make it difficult to see all of the possible false positive matches.

One way to eliminate using "?" for optional fields is to expand the first_name and last_name groups to allow for the optional SPACE:

# This passes the challenge
players = re.search(r'''
    (?P<last_name>[\w ]+),\s  # optional space in last_name. comma now required
    (?P<first_name>[\w ]+):\s  # optional space in first_name, colon required
    (?P<score>[\d]+)    # score required; with or without '$', the result is the same for this pattern
    ''', string, re.X | re.M)

Eliminating optional fields makes the pattern much easier to read and its intent easier to absorb.
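For a rough check that the stricter pattern handles every line, including the names with spaces, here is a sketch that collects each match as a dictionary. The finditer/groupdict approach and the names strict and all_players are assumptions for illustration, not something the challenge requires.

```python
import re

string = '''Love, Kenneth: 20
Chalkley, Andrew: 25
McFarland, Dave: 10
Kesten, Joy: 22
Stewart Pinchback, Pinckney Benton: 18'''

# Required comma, colon, and score; spaces allowed inside the names.
strict = re.compile(r'''
    (?P<last_name>[\w ]+),\s
    (?P<first_name>[\w ]+):\s
    (?P<score>[\d]+)
    ''', re.X | re.M)

# One dict per line, keyed by the named groups.
all_players = [m.groupdict() for m in strict.finditer(string)]

print(len(all_players))  # 5
print(all_players[-1])
# {'last_name': 'Stewart Pinchback', 'first_name': 'Pinckney Benton', 'score': '18'}
```

Inside a character class the space is kept even under re.X, which is why [\w ] works in a verbose pattern.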

Jon C
6,512 Points

Thanks for the explanation. It's much easier to understand now.

Kourosh Raeen
23,733 Points

$ is not needed. The following code is accepted:

players = re.search(r'''
    (?P<last_name>[\w ]+),\s
    (?P<first_name>[\w ]+):\s
    (?P<score>[\d]+)?
''', string, re.X|re.M)