1
00:00:00,680 --> 00:00:03,450
Often when constructing a regular expression,

2
00:00:03,450 --> 00:00:06,470
you'll need a way to handle repeating characters.

3
00:00:06,470 --> 00:00:11,530
For example, say you wanted to match several spaces at the end of a line or

4
00:00:11,530 --> 00:00:13,110
numbers ten or greater.

5
00:00:13,110 --> 00:00:17,380
In other words, numbers with two or more numerical characters.

6
00:00:17,380 --> 00:00:22,940
Two regex characters you'll use often are the asterisk and the plus symbol.

7
00:00:22,940 --> 00:00:26,570
These two symbols both can match more than one character.

8
00:00:26,570 --> 00:00:29,980
However, while the asterisk can match zero or

9
00:00:29,980 --> 00:00:34,480
more characters, the plus matches at least one character.

10
00:00:34,480 --> 00:00:38,770
For example, say you wanted to match toy boat or toy car.

11
00:00:38,770 --> 00:00:43,690
Both start with toy but end with a different sequence of characters.

12
00:00:43,690 --> 00:00:46,450
I'll start the regex with toy and

13
00:00:46,450 --> 00:00:51,170
I can put a word character to match any character that follows.

14
00:00:51,170 --> 00:00:55,060
This will match the c in car or the b in boat, but

15
00:00:55,060 --> 00:00:59,110
to match the rest of either string, I can use an asterisk.

16
00:00:59,110 --> 00:01:04,190
The asterisk is saying, I want to match zero or more word characters.

17
00:01:04,190 --> 00:01:06,690
Notice how toy is also matched,

18
00:01:06,690 --> 00:01:10,420
that's because there are zero word characters after the y in toy.

19
00:01:11,450 --> 00:01:15,900
To exclude toy from the matches, we can use a plus instead.

20
00:01:15,900 --> 00:01:20,830
A word character followed by a plus matches one or more word characters.

21
00:01:20,830 --> 00:01:21,530
In other words,

22
00:01:21,530 --> 00:01:26,310
we're saying there has to be at least one word character that follows toy.

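The difference between the asterisk and the plus described above can be tried out in code. This is a minimal sketch, assuming Python's built-in `re` module rather than the regex tool used in the video. The sample strings are also an assumption: the transcript does not show the on-screen text, so the toys are written here without a space.

```python
import re

# Assumed sample strings (not shown in the transcript).
samples = ["toyboat", "toycar", "toy"]

# \w* allows zero or more word characters after "toy".
star = re.compile(r"toy\w*")
# \w+ requires at least one word character after "toy".
plus = re.compile(r"toy\w+")

# fullmatch succeeds only if the whole string fits the pattern.
star_matches = [s for s in samples if star.fullmatch(s)]
plus_matches = [s for s in samples if plus.fullmatch(s)]

print(star_matches)  # ['toyboat', 'toycar', 'toy']
print(plus_matches)  # ['toyboat', 'toycar']
```

With the asterisk, the bare string toy still matches because zero word characters after the y is acceptable. With the plus, it drops out.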
23
00:01:26,310 --> 00:01:29,580
And toy is now excluded from the matches.

24
00:01:29,580 --> 00:01:34,260
There are also times when you want to specify an exact number of characters,

25
00:01:34,260 --> 00:01:37,000
like three digits in a phone's area code or

26
00:01:37,000 --> 00:01:39,670
the last four digits of a serial number.

27
00:01:39,670 --> 00:01:43,710
You can be specific about how many repetitions you want to accept

28
00:01:43,710 --> 00:01:47,760
by using curly braces. Inside, you can enter the number of repetitions.

29
00:01:47,760 --> 00:01:50,950
Here the example matches three repetitions.

30
00:01:50,950 --> 00:01:55,280
By placing a comma after the three, any repetition of three or

31
00:01:55,280 --> 00:01:56,870
greater will be matched.

32
00:01:56,870 --> 00:02:00,890
Putting a number after the comma will bound the number of repetitions.

33
00:02:00,890 --> 00:02:05,290
In this example, between three and five repetitions will match.

34
00:02:05,290 --> 00:02:07,130
Let's see these in use.

35
00:02:07,130 --> 00:02:09,309
Let's clear out these toy boat examples.

36
00:02:10,390 --> 00:02:14,380
Let's match a US social security number now.

37
00:02:14,380 --> 00:02:19,350
Social security numbers have nine digits and are formatted into three groups.

38
00:02:19,350 --> 00:02:22,240
The first group has three digits.

39
00:02:22,240 --> 00:02:24,390
The second has two.

40
00:02:24,390 --> 00:02:26,650
And the last has four.

41
00:02:26,650 --> 00:02:29,100
They're separated by hyphens.

42
00:02:29,100 --> 00:02:30,510
I'll enter a couple numbers now.

43
00:02:32,440 --> 00:02:36,770
To avoid using real social security numbers, I'll put zeros for

44
00:02:36,770 --> 00:02:38,540
the first group.

45
00:02:38,540 --> 00:02:41,700
Now I'll start building up the regular expression.

46
00:02:41,700 --> 00:02:44,690
I'll enter a digit character to begin with.

47
00:02:44,690 --> 00:02:47,070
I can repeat this with a plus.

48
00:02:47,070 --> 00:02:51,170
Which you can see will freely match all the groupings of numerals below.

49
00:02:52,360 --> 00:02:55,750
Let's limit this to exactly three characters

50
00:02:55,750 --> 00:03:00,950
by replacing the plus with curly braces and putting a three inside.

51
00:03:00,950 --> 00:03:05,770
Now we're matching only the three-numeral sequences the parser can find.

52
00:03:05,770 --> 00:03:08,010
If I put a hyphen after,

53
00:03:08,010 --> 00:03:13,290
you can see there are only two occurrences of three digits followed by a hyphen.

54
00:03:13,290 --> 00:03:18,954
Now, we can specify the rest of the pattern by putting a \d with two

55
00:03:18,954 --> 00:03:25,050
repetitions, and another one with four.

56
00:03:25,050 --> 00:03:31,230
Now, if I delete a digit from one of these numbers, the pattern fails to match.

57
00:03:31,230 --> 00:03:35,130
Let's see how to relax the number of repetitions we'll accept.

58
00:03:35,130 --> 00:03:37,940
Suppose there is a serial number we want to match

59
00:03:37,940 --> 00:03:41,500
that will always be between five and nine characters.

60
00:03:41,500 --> 00:03:44,780
And those characters can be letters or numbers.

61
00:03:44,780 --> 00:03:48,290
I'll just paste a few examples I made up that follow this pattern.

62
00:03:49,600 --> 00:03:53,410
You can type your own if you want or copy and paste from the teacher's notes.

63
00:03:54,440 --> 00:03:59,580
Because we want to match letters and numbers, I'll use the word character and

64
00:03:59,580 --> 00:04:01,658
follow it with an open curly bracket.

65
00:04:01,658 --> 00:04:07,030
5,9 close curly.

66
00:04:07,030 --> 00:04:13,890
Now, if I erase the 3 from the end of the last number, there are four characters.

67
00:04:13,890 --> 00:04:16,520
And the serial number won't be a match.

68
00:04:16,520 --> 00:04:20,975
If I add a character to the second one, there will be 10 characters and

69
00:04:20,975 --> 00:04:23,066
we'll only get a partial match.

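The social security number and serial number patterns built up above can be sketched in Python as follows. This assumes Python's built-in `re` module, and every sample value is hypothetical, since the numbers and serials pasted in the video do not appear in the transcript.

```python
import re

# Three digits, hyphen, two digits, hyphen, four digits.
ssn = re.compile(r"\d{3}-\d{2}-\d{4}")

ssn_ok = ssn.fullmatch("000-12-3456") is not None
# One digit deleted from the last group: the pattern no longer fits.
ssn_short = ssn.fullmatch("000-12-345") is not None

# Between five and nine word characters (letters, digits, underscore).
serial = re.compile(r"\w{5,9}")

five_chars = serial.fullmatch("A1B2C") is not None
four_chars = serial.fullmatch("A1B2") is not None

# A ten-character serial: search finds only a partial match,
# because the quantifier stops after nine characters.
partial = serial.search("ABCDE12345").group()

print(ssn_ok, ssn_short)        # True False
print(five_chars, four_chars)   # True False
print(partial)                  # ABCDE1234
```

`fullmatch` is used where the whole string has to fit the pattern, and `search` is used for the ten-character case to show which part a regex tool would highlight.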
70
00:04:23,066 --> 00:04:26,069
If I then erase the nine from the regex,

71
00:04:26,069 --> 00:04:30,620
any serial number five characters or more will be matched.

72
00:04:31,710 --> 00:04:33,630
I'll just type a few more characters so

73
00:04:33,630 --> 00:04:36,290
you can see they will all be included in the match.

74
00:04:37,290 --> 00:04:41,690
Get some practice with these in the exercises in the teacher's notes below.

75
00:04:41,690 --> 00:04:46,570
Next, let's look at how to match anything except a specific character.
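
For practice, the open-ended version of the serial number pattern, with the nine erased, might look like this in Python. As before, this is only a sketch assuming the built-in `re` module, and the serial strings are made up for illustration.

```python
import re

# Five or more word characters, with no upper bound.
open_ended = re.compile(r"\w{5,}")

# Hypothetical serial longer than nine characters.
long_serial = "ABCDE12345XYZ"
whole = open_ended.search(long_serial).group()

# Four characters is still too short.
too_short = open_ended.search("A1B2")

print(whole)      # ABCDE12345XYZ
print(too_short)  # None
```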