Excluding Characters (3:13) with Joel Kraft
Matching any character EXCEPT a specific character can be very useful at times. Learn how.
Copy both the Match and the Exclude set of test strings from each exercise below into a Regex tester like regexpal or regex101. Using what you've learned so far, create a regular expression that will match all of the strings in the Match set and exclude the ones in the Exclude set.
foxes jumping dogs
1234k 5784k 5784k
Match *only* the commas:
pears, apples, cherries, oranges,
Match only the text character strings from each line ('html', 'head', etc):
<html> <head> </head> <body> <div> </div> </body> </html>
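For checking a candidate answer outside a browser tester, the match-and-exclude idea can be sketched in Python. This is only an illustration: `check_pattern` is a hypothetical helper, not part of any tester, and the sample strings are made up rather than taken from the exercises above.

```python
import re


def check_pattern(pattern, match_set, exclude_set):
    """Return (missed, wrongly_matched) for a candidate pattern.

    missed: strings from match_set the pattern failed to match.
    wrongly_matched: strings from exclude_set the pattern matched anyway.
    """
    regex = re.compile(pattern)
    missed = [s for s in match_set if not regex.search(s)]
    wrongly_matched = [s for s in exclude_set if regex.search(s)]
    return missed, wrongly_matched


# Illustrative sets (assumed): accept strings with no digits, reject the rest.
missed, wrongly_matched = check_pattern(
    r"^[^0-9]+$", ["cat", "bird"], ["c4t", "b1rd"]
)
print(missed, wrongly_matched)
```

A pattern passes when both returned lists are empty. Without the `^` and `$` anchors, `[^0-9]` would still find a non-digit somewhere in "c4t", so the exclude set would fail.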
You can exclude any character or characters from a match. This can be useful for matching text between delimiters. For comma-separated values, for example, we could match any character that is not a comma to pull the delimited values out.

To match any character except a target character, we'll use a character set that starts with a caret. For example, to match any character except an @ symbol, you can put an @ symbol next to a caret in a character set. Add a dot to that to exclude dots from what's matched.

Let's try it out. I cleared both windows out so we can start with a new example. Let's put an email address in as a test. Say, toy@boat.com. Now I'll match any character that's not an @ symbol. I'll put square brackets with a caret, and inside I'll put an @ symbol. The striping means that each character is a separate complete match for our expression, every character except the @ symbol, that is.

If we want to match more than one character with our expression, what should we do? Do you remember? One way would be to put a + sign after it. Now we have two complete matches. The parser found three characters that were not @ symbols. Then it found an @ symbol, so toy is the first match for this expression. After the @ symbol, it found that boat.com was a repeated string of characters that weren't @ symbols. If I put a dot in the set, you see the string has three complete matches.

There are a couple of gotchas I want to point out here. Notice that a dot inside a set behaves differently than a dot outside a set. Inside, it's a literal dot, while outside, it's a special character that matches any character. This can take some getting used to, but just remember that context matters.

Next, think about what would happen if I remove the + and put an m before the character set. Do you think it will match the m at the end of the string, because m isn't followed by an @ or a dot? Let's find out. It turns out the m is not matched. This is because the m isn't followed by a character at all. The character set is telling the parser that m must be followed by a character; it just can't be an @ symbol or a dot. So if I put a character here, say a % sign, we have a match.

Regular expressions include character classes that are the inverse of the digit, word, and whitespace classes we learned earlier. If you capitalize any of them, you get the inverse. So \D, for example, matches any character that is not a digit. For example, if I erase this regex and put a \W, the @ symbol and dot are matched because they are not word characters. If I put a \D instead, each character is matched because none of them are numerals.

Check the teacher's notes for additional practice.
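The steps in the transcript can be reproduced with a short Python sketch. It assumes the test string is toy@boat.com, which is what the matches named in the video ("toy" and "boat.com") imply, and it uses Python's `re` module in place of the browser tester shown on screen.

```python
import re

# Assumed test string, inferred from the matches the video describes.
email = "toy@boat.com"

# [^@] matches exactly one character that is not "@".
singles = re.findall(r"[^@]", email)

# Adding + groups each run of consecutive non-@ characters into one match.
runs = re.findall(r"[^@]+", email)

# Inside a set the dot is literal, so [^@.] also excludes ".".
parts = re.findall(r"[^@.]+", email)

# m[^@.] requires some character after the m, so the final m does not match...
trailing_m = re.search(r"m[^@.]", email)
# ...until a character other than "@" or "." follows it.
with_percent = re.search(r"m[^@.]", email + "%")

# \W is the inverse of \w, and \D is the inverse of \d.
non_word = re.findall(r"\W", email)
non_digit = re.findall(r"\D", email)

print(runs)          # ['toy', 'boat.com']
print(parts)         # ['toy', 'boat', 'com']
print(trailing_m)    # None
print(with_percent.group())  # m%
print(non_word)      # ['@', '.']
```

A negated set still consumes one character, which is why `m[^@.]` fails at the end of the string: there is nothing left to consume after the final m.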