Excluding Characters (3:13) with Joel Kraft
Matching any character EXCEPT a specific character can be very useful at times. Learn how.
Copy both the Match and the Exclude set of test strings from each exercise below into regexpal. Using what you've learned so far, create a regular expression that will match all of the strings in the Match set and exclude the ones in the Exclude set.
foxes jumping dogs
1234k 5784k 5784k
Match *only* the commas:
pears, apples, cherries, oranges,
Match only the text character strings from each line ('html', 'head', etc):
<html> <head> </head> <body> <div> </div> </body> </html>
You can exclude any character or characters from a match.
This can be useful for matching text between delimiters.
For comma separated values, for example,
we could match any character that is not a comma to pull delimited values out.
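As a rough illustration of the comma-separated case, here is a minimal sketch using Python's built-in re module. The sample string is made up for the example and is not from the video.

```python
import re

# Hypothetical sample line; any comma-delimited text would work.
line = "red,green,blue"

# [^,]+ matches a run of one or more characters that are not a comma,
# so each delimited value comes out as its own match.
values = re.findall(r"[^,]+", line)
print(values)
```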
To match any character
except a target character, we'll use a character set that starts with a caret.
For example, to match any character except an @ symbol,
put an @ symbol right after the caret in a character set: [^@].
Add a dot to the set, [^@.], to exclude dots from what's matched as well.
Let's try it out.
I cleared both windows out so we can start with a new example.
Let's put in an email address, toy@boat.com, as the test string.
Now I'll match any character that's not an @ symbol.
I'll type square brackets with a caret inside, followed by an @ symbol: [^@].
The striping means that each character is a separate complete match for
our expression. Every character except the @ symbol, that is.
If we want to match more than one character with our expression,
what should we do?
Do you remember?
One way would be to put a + sign after the set: [^@]+.
Now we have two complete matches.
The parser found three characters that were not @ symbols.
Then it found an @ symbol, so toy is the first match for this expression.
After the @ symbol,
it found that boat.com was another run of characters that weren't @ symbols.
If I put a dot in the set, [^@.]+, you see the string now has three complete matches: toy, boat, and com.
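The same three steps can be reproduced outside the regex tester. This is a minimal sketch in Python's re module, assuming the test address is toy@boat.com as the matches described above suggest. The variable names are just for illustration.

```python
import re

email = "toy@boat.com"

# [^@] alone: every character except the @ is its own one-character match.
single_chars = re.findall(r"[^@]", email)

# [^@]+ : runs of non-@ characters, split only at the @.
runs = re.findall(r"[^@]+", email)

# [^@.]+ : runs that stop at either an @ or a dot.
runs_no_dot = re.findall(r"[^@.]+", email)

print(len(single_chars))  # 11 separate matches
print(runs)               # ['toy', 'boat.com']
print(runs_no_dot)        # ['toy', 'boat', 'com']
```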
There are a couple of gotchas I want to point out here.
Notice that a dot inside a set behaves differently than a dot outside a set.
Inside, it's a literal dot, while outside,
it's a special character that matches any single character.
This can take some getting used to, but just remember context matters.
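A small sketch of that difference, again in Python's re module. The three-character sample string is invented for the example. Note that in Python, as in most flavors, the dot outside a set matches any character except a line break by default.

```python
import re

text = "a.b"

# Outside a set, the dot is a wildcard: every character matches.
outside = re.findall(r".", text)

# Inside a set, the dot is literal: only the actual dot matches.
inside = re.findall(r"[.]", text)

print(outside)  # ['a', '.', 'b']
print(inside)   # ['.']
```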
Next, think about what would happen if I remove the + and
put an m before the character set, giving m[^@.].
Do you think it will match the m at the end of the string,
because m isn't followed by an @ or a dot?
Let's find out.
Turns out the m is not matched.
This is because the m isn't followed by a character at all.
The character set is telling the parser that m must be followed by a character,
it just can't be an @ symbol or a dot.
So if I put a character here, say a % sign, we have a match.
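This gotcha is easy to check in code. The sketch below assumes the same toy@boat.com test string and uses Python's re module. A negated set still has to consume one character, so an m at the very end of the string cannot match.

```python
import re

pattern = re.compile(r"m[^@.]")

# The only m is the last character, so there is nothing left
# for [^@.] to consume and the search finds no match.
at_end = pattern.search("toy@boat.com")

# Appending any character that is not @ or a dot gives the set
# something to match.
with_percent = pattern.search("toy@boat.com%")

print(at_end)                # None
print(with_percent.group())  # m%
```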
Regular expressions also include character classes that are the opposites of the digit,
word, and whitespace classes we learned earlier.
If you capitalize any of them, you get the inverse.
So \D, for example, matches any character that is not a digit.
For example, if I erase this regex and put a \W,
the @ symbol and dot are matched because they are not word characters.
If I put a \D instead, each character is matched because none of them are numerals.
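Both inverse classes can be tried on the same address. This is a minimal sketch in Python's re module, assuming the toy@boat.com test string used in the earlier examples.

```python
import re

email = "toy@boat.com"

# \W matches anything that is not a word character,
# so only the @ and the dot come back.
non_word = re.findall(r"\W", email)

# \D matches anything that is not a digit; the address has no
# digits, so every character is its own match.
non_digit = re.findall(r"\D", email)

print(non_word)        # ['@', '.']
print(len(non_digit))  # 12
```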
Check the teacher's notes for additional practice.