Excluding Characters (3:53) with Alena Holligan
Matching any character EXCEPT a specific character can be very useful at times. Learn how.
Copy both the Match and the Exclude set of test strings from each exercise below into regex101. Using what you've learned so far, create a regular expression that will match all of the strings in the Match set and exclude the ones in the Exclude set.
foxes jumping dogs
1234k 5784k 5784k
Match *only* the commas:
pears, apples, cherries, oranges,
Match only the text character strings from each line ('html', 'head', etc):
<html> <head> </head> <body> <div> </div> </body> </html>
You can exclude any character or characters from a match.
This can be useful for matching text between delimiters.
For comma-separated values, for example,
we could match any character that is not a comma, to pull the delimited values out.
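As a rough sketch of the comma-separated case, here is how a negated character set could pull the values out using Python's standard re module. The sample line and the variable names are assumptions chosen for illustration; they do not come from the video.

```python
import re

# Hypothetical comma-separated line, used only for illustration.
line = "pears,apples,cherries,oranges"

# [^,] matches any single character that is not a comma.
# The + repeats it, so each run of characters between commas is one match.
values = re.findall(r"[^,]+", line)
print(values)  # ['pears', 'apples', 'cherries', 'oranges']
```

Note that this simple pattern keeps any spaces that follow a comma, so real data may need trimming afterwards.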
To match any character except a target character,
we use a character set that starts with a caret.
For example, to match any character except an @ symbol,
you can put an @ symbol next to a caret in a character set.
Add a dot to that to exclude dots from what is matched.
Let's try this out.
I cleared both windows so that we can start with a new example.
Let's put an email address in as a test, say, toy@boat.com.
Now I'll match any character that is not an @ symbol, or "not at".
I'll put square brackets with a caret and an @ symbol.
This striping means that each character is a separate, complete match for
the expression. Every character except for the @ symbol, that is.
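The one-character-per-match behavior can be reproduced outside regex101. This is a minimal sketch in Python, assuming the test string is toy@boat.com, which is what the matches described later in the walkthrough ("toy" and "boat.com") imply.

```python
import re

test = "toy@boat.com"  # assumed test string

# Without a quantifier, [^@] consumes exactly one character per match,
# so every character other than the @ is reported separately.
single_matches = re.findall(r"[^@]", test)
print(single_matches)
print(len(single_matches))  # 11: the 12 characters minus the single @
```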
If we want to match more than just one character with our expression,
what should we do?
Do you remember?
One way would be to put a plus after it.
Now we have two complete matches.
The parser found three characters that were not @ symbols,
then it found an @ symbol.
So toy is the first match for this expression.
After the @ symbol it found boat.com was a repeated string of
characters that weren't @ symbols.
If I put a dot in the set, you see the string has three complete matches.
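The two steps above, adding a plus and then adding a dot to the set, can be sketched in Python as follows. The test string toy@boat.com is assumed from the "toy" and "boat.com" matches described in the walkthrough.

```python
import re

test = "toy@boat.com"  # assumed test string

# One or more characters that are not @: the @ splits the string in two.
not_at = re.findall(r"[^@]+", test)
print(not_at)  # ['toy', 'boat.com']

# Excluding the dot as well splits the second piece again.
not_at_or_dot = re.findall(r"[^@.]+", test)
print(not_at_or_dot)  # ['toy', 'boat', 'com']
```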
There are a couple of gotchas I want to point out here.
Notice that a dot inside a set behaves differently than a dot outside a set.
Inside it's a literal dot,
while outside it's a special character that matches any character (by default, anything except a line break).
This can take some getting used to, but just remember, context matters.
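To make the context difference concrete, here is a small Python sketch comparing a dot outside a set with a dot inside one. The test string is again the assumed toy@boat.com.

```python
import re

test = "toy@boat.com"  # assumed test string

# Outside a set, the dot is a wildcard: every character here matches it.
wildcard = re.findall(r".", test)
print(wildcard == list(test))  # True

# Inside a set, the dot is just a literal dot.
literal = re.findall(r"[.]", test)
print(literal)  # ['.']
```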
Next think about what would happen if I removed the + and
put an m before the character set.
Do you think it will match the m at the end of the string?
After all, that m isn't followed by an @ or a dot.
Turns out, the m is not matched.
This is because the m isn't followed by any character at all.
The character set is telling the parser that m must be followed by a character.
It just can't be an @ symbol or a dot.
So if I put a character here, say a %, we have a match.
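This gotcha is easy to check in code. The sketch below, in Python, assumes the same toy@boat.com test string and appends a % as described above.

```python
import re

pattern = re.compile(r"m[^@.]")

# The final m has nothing after it, so the set has no character to consume
# and the overall match fails.
no_match = pattern.search("toy@boat.com")
print(no_match)  # None

# With any character other than @ or . after the m, the pair matches.
with_percent = pattern.search("toy@boat.com%")
print(with_percent.group())  # m%
```

A negated set still has to match one character; it never means "nothing follows".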
Regular expressions include character classes that are the opposite of the digit,
word, and whitespace classes that we learned earlier.
If you capitalize any of them, you get the inverse.
So, \D, for example,
matches any character that is not a digit.
For example, if I erase the regex and
put a \W, the @ symbol and the dot are matched,
because they're not word characters.
If I put a capital D instead,
each character is matched because none of them are numerals.
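A short Python sketch of the capitalized classes, run against the assumed test string toy@boat.com (without the % added earlier):

```python
import re

test = "toy@boat.com"  # assumed test string

# \W matches anything that is not a word character (letter, digit, underscore).
non_word = re.findall(r"\W", test)
print(non_word)  # ['@', '.']

# \D matches anything that is not a digit; this string has no digits,
# so every character matches.
non_digit = re.findall(r"\D", test)
print(non_digit == list(test))  # True

# \S matches anything that is not whitespace; there is none here either.
non_space = re.findall(r"\S", test)
print(non_space == list(test))  # True
```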
Check the teacher's notes for additional practice.