Finding Repeated Characters (4:47) with Joel Kraft
Character repetition is one of the more powerful features of regular expressions. Learn how to match repeated characters.
Here are the serial numbers used in the video for easy copying and pasting:
E4763GHC 7896TOB3P L0003
Copy both the Match and the Exclude set of test strings from each exercise below into a Regex tester like regexpal or regex101. Using what you've learned so far, create a regular expression that will match all of the strings in the Match set and exclude the ones in the Exclude set.
8 pieces 7 piece 6 pieces 5 pieces 4 pieces
A piece A 12345
8 pieces 7 piece 6 pieces 5 pieces 4 pieces 2 pie slices
A piece A 12345
1234 5678 84753 78930
abcde abcde power bat!
123abc 333cats 821_Plants 769___
Often when constructing a regular expression, you'll need a way to handle repeating characters. For example, say you wanted to match several spaces at the end of a line, or numbers ten or greater. In other words, numbers with two or more numerical characters.
Two regex characters you'll use often are the asterisk and the plus symbol. Both of these symbols can match more than one character. However, while the asterisk can match zero or more characters, the plus matches at least one character.
For example, say you wanted to match toy boat or toy car. Both start with toy but end with a different sequence of characters. I'll start the regex with toy, and I can put a word character to match any character that follows. This will match the c in car or the b in boat, but to match the rest of either string, I can use an asterisk. The asterisk is saying, I want to match zero or more word characters. Notice how toy is also matched. That's because there are zero word characters after the y in toy. To exclude toy from the matches, we can use a plus instead. A word character followed by a plus matches one or more word characters. In other words, we're saying there has to be at least one word character that follows toy. And toy is now excluded from the matches.
There are also times when you want to specify an exact number of characters, like three digits in a phone's area code or the last four digits of a serial number. You can be specific about how many repetitions you want to accept by using curly braces: inside them, you enter the number of repetitions. Here the example matches three repetitions. By placing a comma after the three, any number of repetitions of three or greater will be matched. Putting a number after the comma will bound the number of repetitions. In this example, between three and five repetitions will match.
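The difference between the asterisk, the plus, and the curly-brace forms described above can be sketched in Python. This is only an illustration: the test strings are assumed (written without a space so that a word character can follow toy directly), and `re.fullmatch` is used so that each whole string must fit the pattern, which is stricter than the highlighting a regex tester shows.

```python
import re

# Assumed test strings, not taken from the video.
words = ["toyboat", "toycar", "toy"]

# \w* : zero or more word characters, so plain "toy" still matches.
star_matches = [w for w in words if re.fullmatch(r"toy\w*", w)]

# \w+ : one or more word characters, so plain "toy" is excluded.
plus_matches = [w for w in words if re.fullmatch(r"toy\w+", w)]

print(star_matches)  # ['toyboat', 'toycar', 'toy']
print(plus_matches)  # ['toyboat', 'toycar']

# Curly braces state the number of repetitions explicitly.
digits = ["12", "123", "1234", "12345", "123456"]

exactly_three = [d for d in digits if re.fullmatch(r"\d{3}", d)]    # exactly 3
three_or_more = [d for d in digits if re.fullmatch(r"\d{3,}", d)]   # 3 or more
three_to_five = [d for d in digits if re.fullmatch(r"\d{3,5}", d)]  # 3 to 5

print(exactly_three)  # ['123']
print(three_or_more)  # ['123', '1234', '12345', '123456']
print(three_to_five)  # ['123', '1234', '12345']
```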
Let's see these in use. Let's clear out these toy boat examples. Let's match a US social security number now. Social security numbers have nine digits and are formatted into three groups. The first group has three digits. The second has two. And the last has four. They're separated by hyphens. I'll enter a couple of numbers now. To avoid using real social security numbers, I'll put zeros for the first group.
Now I'll start building up the regular expression. I'll enter a digit character to begin with. I can repeat this with a plus, which you can see will freely match all the groupings of numerals below. Let's limit this to exactly three characters by replacing the plus with curly braces and putting a three inside. Now we're matching only the three-numeral sequences the parser can find. If I put a hyphen after, you can see there are only two occurrences of three digits followed by a hyphen. Now, we can specify the rest of the pattern by putting a \d with two repetitions, and another one with four. Now, if I delete a digit from one of these numbers, the pattern fails to match.
Let's see how to relax the number of repetitions we'll accept. Suppose there is a serial number we want to match that will always be between five and nine characters. And those characters can be letters or numbers. I'll just paste a few examples I made up that follow this pattern. You can type your own if you want or copy and paste from the teacher's notes. Because we want to match letters and numbers, I'll use the word character and follow it with an open curly bracket, 5,9, close curly. Now, if I erase the 3 from the end of the last number, there are four characters. And the serial number won't be a match. If I add a character to the second one, there will be 10 characters and we'll only get a partial match.
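As a rough sketch, the two patterns built up above could be tried in Python like this. The social security numbers are invented placeholders with zeros in the first group, the serial numbers are the ones from the teacher's notes, and `first_match` is a hypothetical helper added only for this illustration.

```python
import re

# Three digits, hyphen, two digits, hyphen, four digits.
ssn_pattern = re.compile(r"\d{3}-\d{2}-\d{4}")

# Invented sample numbers; the last one is missing a digit.
ssn_samples = ["000-12-3456", "000-98-7654", "000-12-345"]
ssn_found = [s for s in ssn_samples if ssn_pattern.search(s)]
print(ssn_found)  # ['000-12-3456', '000-98-7654']

# Between five and nine word characters (letters, digits, underscore).
serial_pattern = re.compile(r"\w{5,9}")

def first_match(text):
    """Hypothetical helper: return the first matched substring, or None."""
    m = serial_pattern.search(text)
    return m.group() if m else None

print(first_match("E4763GHC"))    # 'E4763GHC'  (8 characters)
print(first_match("L0003"))       # 'L0003'     (5 characters)
print(first_match("L000"))        # None        (only 4 characters)
print(first_match("7896TOB3PX"))  # '7896TOB3P' (partial: first 9 of 10)
```

The last call shows the partial match mentioned in the video: the pattern greedily takes nine characters and leaves the tenth unmatched.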
If I then erase the nine from the regex, any serial number five characters or more will be matched. I'll just type a few more characters so you can see they will all be included in the match.
Get some practice with these in the exercises in the teacher's notes below. Next, let's look at how to match anything except a specific character.
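The open-ended form of the serial number pattern, with the nine erased, can be sketched in Python as follows. The extra characters appended to the serial number are made up for illustration.

```python
import re

# Five or more word characters, with no upper bound.
open_ended = re.compile(r"\w{5,}")

# A serial number from the teacher's notes with assumed extra characters added.
long_serial = "7896TOB3PXYZ"
match = open_ended.search(long_serial)
matched_text = match.group() if match else None
print(matched_text)  # '7896TOB3PXYZ' (all 12 characters are included)

# Four characters is still below the minimum of five.
too_short = open_ended.search("L000")
print(too_short)  # None
```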