characters from the character class is in the data stream, it matches the pattern.
To define a character class, you use square brackets. The brackets contain every character you want to
include in the class. You then use the entire class within a pattern, just as you would any other
wildcard character. This takes a little getting used to, but once you catch on, you can use it to
build very precise patterns.
Here’s an example of creating a character class:
>>> re.search('[ch]at', 'The cat is sleeping')
<_sre.SRE_Match object at 0x015F9918>
>>> re.search('[ch]at', 'That is a very nice hat')
<_sre.SRE_Match object at 0x015F99F8>
>>> re.search('[ch]at', 'He is at the store')
>>>
This time, the regular expression pattern matches only strings that have a c or an h immediately in
front of the at pattern. In the second string, the first match is the hat inside the word That.
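To see which text the character class actually matched, you can inspect the match object that re.search() returns. The following is a minimal sketch using the same three sample strings; the variable names are chosen only for illustration.

```python
import re

# The three sample strings from the session above
samples = ['The cat is sleeping',
           'That is a very nice hat',
           'He is at the store']

matched = []
for text in samples:
    result = re.search('[ch]at', text)
    # group() returns the matched text; re.search() returns None on no match
    matched.append(result.group() if result else None)

print(matched)   # ['cat', 'hat', None]
```

The third string contains at, but no c or h directly in front of it, so there is no match.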
You can use more than one character class in a single expression, as in these examples:
>>> re.search('[Yy][Ee][Ss]', 'Yes')
<_sre.SRE_Match object at 0x015F9988>
>>> re.search('[Yy][Ee][Ss]', 'yEs')
<_sre.SRE_Match object at 0x015F99F8>
>>> re.search('[Yy][Ee][Ss]', 'yeS')
<_sre.SRE_Match object at 0x015F9988>
>>>
The regular expression uses three character classes to cover both lowercase and uppercase for all
three character positions.
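One way to put this pattern to work is to wrap it in a small helper function. The sketch below assumes a hypothetical function named is_yes(), which is not part of the re module; it simply reports whether the three character classes found a match.

```python
import re

def is_yes(answer):
    """Return True if answer contains 'yes' in any mix of upper- and lowercase."""
    # One character class per position covers both cases of each letter
    return re.search('[Yy][Ee][Ss]', answer) is not None

results = [is_yes(word) for word in ['Yes', 'yEs', 'YES', 'no']]
print(results)   # [True, True, True, False]
```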
Character classes don’t have to be just letters. You can use numbers in them as well, as shown here:
>>> re.search('[012]', 'This has 1 number')
<_sre.SRE_Match object at 0x015F99F8>
>>> re.search('[012]', 'This has the number 2')
<_sre.SRE_Match object at 0x015F9988>
>>> re.search('[012]', 'This has the number 4')
>>>
The regular expression pattern matches any line that contains the digit 0, 1, or 2. Any other
digits are ignored, as are lines without digits in them.
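The same pattern can filter a list of lines, keeping only those that contain one of the digits in the class. This is a small sketch; the list contents and the names lines and kept are made up for the example.

```python
import re

lines = ['This has 1 number',
         'This has the number 2',
         'This has the number 4',
         'No numbers here']

# Keep only the lines where the class [012] matches somewhere in the text
kept = [line for line in lines if re.search('[012]', line)]

print(kept)   # ['This has 1 number', 'This has the number 2']
```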
This is a great feature for checking for properly formatted numbers, such as phone numbers and zip
codes. However, remember that the regular expression pattern can be found anywhere in the text of
the data stream. There might be additional characters besides the matching pattern characters.
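To illustrate that caveat, the sketch below searches for five digit character classes in a row with nothing else in the pattern. The sample strings are invented for the example. Notice that the pattern is found inside longer strings, too.

```python
import re

# Five digit character classes in a row, with nothing around them
five_digits = '[0123456789]' * 5

found = []
for text in ['12345', '1234567', 'Order 98765 shipped', '1234']:
    result = re.search(five_digits, text)
    found.append(result.group() if result else None)

print(found)   # ['12345', '12345', '98765', None]
```

The seven-digit string still matches, because its first five digits satisfy the pattern.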
For example, if you want to match against a five-digit zip code, you can ensure that you match
only strings of exactly five digits by using the start-of-line (^) and end-of-line ($) anchor characters:
>>> re.search('^[0123456789][0123456789][0123456789][0123456789][0123456789]$', '12345')
<_sre.SRE_Match object at 0x0154FC28>