Finding Repeated Characters (5:47) with Alena Holligan
Character repetition is one of the more powerful features of regular expressions. Learn how to match repeated characters.
Here are the serial numbers used in the video for easy copying and pasting:
E4763GHC 7896TOB3P L0003
Copy both the Match and the Exclude set of test strings from each exercise below into regex101. Using what you've learned so far, create a regular expression that will match all of the strings in the Match set and exclude the ones in the Exclude set.
8 pieces 7 piece 6 pieces 5 pieces 4 pieces
A piece A 12345
8 pieces 7 piece 6 pieces 5 pieces 4 pieces 2 pie slices
A piece A 12345
1234 5678 84753 78930
abcde abcde power bat!
123abc 333cats 821_Plants 769___
Often when constructing a regular expression, 0:00 you'll need a way to handle repeating characters. 0:03 For example, say you wanted to match several spaces at the end of a line or 0:06 numbers 10 or greater. 0:11 In other words, numbers with two or more numerical characters. 0:14 Two regex characters we'll often use are the asterisk and the plus symbol. 0:19 These two symbols can both match more than one character. 0:26 However, while the asterisk can match zero or 0:30 more characters, the plus matches at least one character. 0:34 For example, let's say we wanted to match toy boat or toy car. 0:40 Both start with toy but end with a different sequence of characters. 0:45 I'll start the regex with toy, and 0:50 I can put a word character to match any letter that follows. 0:53 This will match the c in car or the b in boat, but 0:59 to match the rest of either string, I can use an asterisk. 1:03 This asterisk is saying I want to match zero or 1:08 more word characters. Notice how toy is also matched. 1:12 That's because there are zero word characters after the y in toy. 1:18 To exclude toy from the matches, we can use a + instead. 1:23 The word character followed by a plus matches one or more word characters. 1:28 In other words, we're saying there has to be at least one word character 1:35 that follows toy, and toy is now excluded from the matches. 1:41 There are also times when you want to specify the exact number of characters, 1:46 like three digits in a phone's area code or 1:52 the last four digits of a serial number. 1:55 You can be specific about how many repetitions you want to accept. 1:58 By using curly braces, you can enter the number of repetitions. 2:03 Here the example matches three repetitions. By placing a comma 2:08 after the three, any repetition of three or greater will be matched. 2:13 Putting a number after the comma will bound the number of repetitions.
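The difference between the asterisk, the plus, and the curly-brace forms described above can be sketched in Python's re module. This is only an illustration, not the setup used in the video: the sample strings are assumed to be written without a space (so a word character can directly follow toy), the runs of the letter a are invented, and full_matches is a hypothetical helper.

```python
import re

# Assumed sample strings, written without a space so \w can follow "toy" directly.
samples = ["toy", "toycar", "toyboat"]


def full_matches(pattern, strings):
    """Hypothetical helper: return the strings the pattern matches in full."""
    return [s for s in strings if re.fullmatch(pattern, s)]


# \w* allows zero or more word characters, so plain "toy" is matched too.
star_matches = full_matches(r"toy\w*", samples)

# \w+ requires at least one word character after "toy", so "toy" is excluded.
plus_matches = full_matches(r"toy\w+", samples)

# Invented runs of the letter "a" to show the curly-brace forms.
runs = ["aa", "aaa", "aaaa", "aaaaaa"]
exactly_three = full_matches(r"a{3}", runs)    # exactly 3 repetitions
three_or_more = full_matches(r"a{3,}", runs)   # 3 or more repetitions
three_to_five = full_matches(r"a{3,5}", runs)  # between 3 and 5 repetitions

print(star_matches, plus_matches)
print(exactly_three, three_or_more, three_to_five)
```

re.fullmatch is used here so that each string either matches as a whole or not at all; a tool like regex101 highlights partial matches instead, so its display can look different for the same pattern.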
2:20 In this example, repetitions between 3 and 5 will be matched. 2:26 Let's see these in use. Let's clear out these toy examples. 2:31 Let's match a US social security number now. 2:39 Social security numbers are nine digits and are formatted into three groupings. 2:42 The first has three digits, the second group has two, and the last has four. 2:47 They're separated by hyphens. I'll enter a couple of numbers now. 2:52 To avoid actually using real social security numbers, 2:59 I'm going to put zeros as the first group. 3:02 Now I'll start to build up the regular expression. 3:12 I'll enter the digit character to begin with. I can repeat this with a plus, 3:17 which we can see will freely match all the groupings of numerals below. 3:23 Let's limit this to match exactly three characters by 3:28 replacing the + with curly braces, putting a 3 in between. 3:32 Now we're matching only the three-digit sequences the parser can find. 3:38 If I put a hyphen after, you can see 3:43 that there are only two occurrences of three digits followed by a hyphen. 3:47 Now we can specify the rest of the pattern by putting \d, 3:53 with 2 repetitions, and a hyphen, and 3:59 then \d with 4 repetitions, and that's it. 4:04 Now if I delete a digit from one of these numbers, the pattern fails to match. 4:11 Let's see how to relax the number of repetitions we'll accept. 4:17 Suppose there's a serial number we want to match that will always be between five and 4:21 nine characters. 4:26 And those characters can only be letters or numbers. 4:27 I'll just paste a few examples that I made up that follow this pattern. 4:32 You can type your own if you want, or copy and paste from the teacher's notes. 4:38 Because we want to match letters and numbers, I could use the word character. 4:42 And then I'll follow it with curly braces, 5, 9. 4:50 Now if I erase the 3 from the end of the last number, 4:57 there are four characters and the serial number won't match.
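The two patterns built up so far can be tried outside regex101 with a short Python sketch. The social security numbers below are made-up placeholders with zeros as the first group (the exact numbers typed in the video are not given in the transcript), and the serial numbers are the three from the teacher's notes.

```python
import re

# Three digits, a hyphen, two digits, a hyphen, four digits.
ssn_pattern = re.compile(r"\d{3}-\d{2}-\d{4}")

# Placeholder numbers (assumed); the last one is missing a digit on purpose.
ssn_text = "000-12-3456 000-98-7654 000-12-345"
ssn_found = ssn_pattern.findall(ssn_text)

# Between five and nine word characters.
serial_pattern = re.compile(r"\w{5,9}")

serials = ["E4763GHC", "7896TOB3P", "L0003"]
serial_ok = [s for s in serials if serial_pattern.fullmatch(s)]

# Erasing the final 3 from "L0003" leaves four characters: too short to match.
too_short = serial_pattern.fullmatch("L000")

print(ssn_found)
print(serial_ok, too_short)
```

One caveat worth keeping in mind: the word character also matches an underscore, so \w{5,9} is slightly more permissive than "letters or numbers only". A character class such as [A-Za-z0-9]{5,9} would be a stricter alternative.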
5:02 If I add a character to the second one, 5:07 there will be 10 characters, and I'll only get a partial match. 5:10 If I then erase the 9 from the regex, 5:17 any serial number five characters or more will be matched. 5:22 I'll just type a few more characters, so 5:28 you can see that they'll be matched as well. 5:30 Get some practice with these in the exercises in the teacher's notes. 5:36 Next, let's look at how to match anything except a specific character. 5:40
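The partial match and the open-ended range described above can be reproduced with a minimal Python sketch. The ten-character serial is the second serial from the teacher's notes with an X appended, which is an assumption (the transcript does not say which character was added), and the longer serial is likewise invented.

```python
import re

bounded = re.compile(r"\w{5,9}")    # five to nine word characters
open_ended = re.compile(r"\w{5,}")  # five or more word characters

# "7896TOB3P" with one assumed extra character, giving ten characters.
ten_chars = "7896TOB3PX"

# The bounded pattern stops after nine characters: a partial match.
partial = bounded.search(ten_chars).group()

# Without the upper bound, the whole string is matched.
whole = open_ended.search(ten_chars).group()

# An invented, even longer serial is still matched in full.
longer = "E4763GHC12345"
longer_match = open_ended.fullmatch(longer)

print(partial, whole, longer_match is not None)
```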