Welcome to the Treehouse Community

Python

Camille Ferré
3,330 Points

Confusion between re.match and dictionaries

Following the regular expression course, I am trying to figure everything out and there is one thing I'm having a hard time with.

From one of the videos, I understood that re.match and re.search only catch the first values, and that if we want to go through the whole data file we need to use re.findall. Am I understanding this correctly?

From there, I know that we can create a dictionary out of the values from the data file by using re.match and groupdict() (and we cannot use findall because it returns a list, which doesn't fit with groupdict()).

However, does this mean that my dictionary will only contain the first row of values from the data file, since match only catches the first row? Or is it just that when we print re.range we see only the first row, but it actually contains all the information?

How can I then access any value from this dictionary, for example if I want to print all the values that have the key 'name'? I have tried for key, value in dict_name, but it still returns only the values from the first row.

Thanks for the help

1 Answer

Hi Camille,

If I understand your question correctly, you are looking for a way to extract part of the matched string for every match found in the entire data file, for example the value of the 'name' group for all the matches. You are right that re.match() and re.search() stop at the first match (and re.match() only tries at the very beginning of the string), so a dictionary built from their groupdict() holds just that one match. The problem is that re.findall() returns a list of strings (or tuples of strings when the pattern has more than one group) rather than match objects, unlike re.match() and re.search(), so there is no dictionary associated with the items in the list. One solution is to use re.finditer(), which walks through the data file and yields a match object for each match; you can then call groupdict() on each match object and extract the groups you need by name.
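To make the difference in return types concrete, here is a minimal sketch. The sample data and the pattern are made up for illustration and are not taken from the course files.

```python
import re

# Hypothetical sample data: two records on separate lines
data = "name: Ada, age: 36\nname: Alan, age: 41"
pattern = r"name: (?P<name>\w+), age: (?P<age>\d+)"

# re.search() returns a single match object for the first match only
first = re.search(pattern, data)
print(first.groupdict())    # {'name': 'Ada', 'age': '36'}

# re.findall() returns plain tuples of strings, which have no groupdict()
everything = re.findall(pattern, data)
print(everything)           # [('Ada', '36'), ('Alan', '41')]

# re.finditer() yields one match object per match
matches = list(re.finditer(pattern, data))
print([m.group('name') for m in matches])    # ['Ada', 'Alan']
```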

Below is a simplified example, modified from the official Python documentation, to illustrate the point.

 >>> import re
 >>> text = "He was carefully disguised but captured quickly by police."
 >>> for m in re.finditer(r"(?P<adjective>\w+ly)", text):
 ...     print(m.groupdict()['adjective'])
 ... 
 carefully
 quickly
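Applied to a data file with one record per line, the same idea might look like the sketch below. The file contents, the group names and the pattern are all hypothetical, so adjust them to your own data. It collects one dictionary per row in a list and then prints every 'name' value.

```python
import re

# Hypothetical file contents; in practice this could come from open(...).read()
data = """Lovelace, Ada: 555-0101
Turing, Alan: 555-0102"""

# Hypothetical pattern; re.MULTILINE lets ^ and $ match at every line
line_re = re.compile(
    r"^(?P<last>[\w ]+), (?P<name>[\w ]+): (?P<phone>[\d-]+)$",
    re.MULTILINE,
)

# One dictionary per matched row, collected in a list
records = [m.groupdict() for m in line_re.finditer(data)]

# Print the 'name' value of every row
for record in records:
    print(record['name'])
```

Each dictionary in records describes a single row, which is why a dictionary built from one re.match() call only ever shows the first row. To loop over the keys and values of one such dictionary, use for key, value in record.items() rather than iterating over the dictionary directly.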

I hope this helps.

Camille Ferré
3,330 Points

Thank you Attila, that helps a lot.