Python Programming for Raspberry Pi, Sams Teach Yourself in 24 Hours

>>> re.search('^[0123456789][0123456789][0123456789][0123456789][0123456789]$', '123456')
>>> re.search('^[0123456789][0123456789][0123456789][0123456789][0123456789]$', '1234')
>>>

If a zip code contains fewer or more than five digits, the regular expression pattern doesn’t match,
and re.search() returns None (which is why the interactive prompt shows no output).
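In a script, the result of re.search() can be tested directly, because it is a match object on success and None otherwise. The following is a minimal sketch of that idea; the names ZIP_PATTERN and is_five_digit_zip are hypothetical and used only for illustration.

```python
import re

# Pattern from the example above: five digit classes, anchored at both ends.
ZIP_PATTERN = '^[0123456789][0123456789][0123456789][0123456789][0123456789]$'

def is_five_digit_zip(text):
    """Return True if the pattern matches text (hypothetical helper).

    Note that $ also tolerates a single trailing newline.
    """
    # re.search() returns a match object or None, never True/False.
    return re.search(ZIP_PATTERN, text) is not None

for candidate in ['12345', '1234', '123456']:
    print(candidate, is_five_digit_zip(candidate))
```

Comparing against None keeps the test explicit about what re.search() actually returns.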


Negating Character Classes


In regular expression patterns, you can reverse the effect of a character class. Instead of looking for a
character contained in a class, you can look for any character that’s not in the class. To do this, you
place a caret character immediately after the opening bracket of the character class, as shown here:




>>> re.search('[^ch]at', 'The cat is sleeping')
>>> re.search('[^ch]at', 'He is at home')
<_sre.SRE_Match object at 0x015F9988>
>>> re.search('[^ch]at', 'at the top of the hour')
>>>

By negating the character class, the regular expression pattern matches any character other than c
or h, followed by the text at. Because the space character fits this category, the second line passes
the pattern match. However, even with the negation, the character class must still match a character,
so the line that starts with at still doesn’t match the pattern: nothing comes before the at.
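To see which character the negated class actually consumed, you can look at the matched text itself. This is a small sketch using the match object’s group() method; the helper name matched_text is hypothetical.

```python
import re

PATTERN = '[^ch]at'

def matched_text(line):
    """Return the text the pattern matched, or None (hypothetical helper)."""
    match = re.search(PATTERN, line)
    return match.group() if match else None

# repr() makes the leading space in the match visible.
print(repr(matched_text('He is at home')))           # ' at'
print(repr(matched_text('The cat is sleeping')))     # None
print(repr(matched_text('at the top of the hour')))  # None
```

The first result is three characters long: the space matched by the negated class, then at.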


Using Ranges


You may have noticed in the zip code example that it is rather awkward having to list all the possible
digits in each character class. Fortunately, you can use a shortcut to avoid having to do that.


You can specify a range of characters within a character class by using the dash symbol. You just
specify the first character in the range, a dash, and then the last character in the range. The character
class then matches any character that falls within the specified range. In Python, the order of
characters in a range follows their Unicode code points, so ranges such as 0-9 and a-z behave as
you would expect.
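As a rough illustration of what a range covers, the sketch below expands a range into the explicit list of characters it stands for. It assumes a str pattern, where Python orders range members by code point. The helper name expand_range is hypothetical.

```python
import re

def expand_range(first, last):
    """List every character from first to last by code point (hypothetical helper)."""
    return ''.join(chr(code) for code in range(ord(first), ord(last) + 1))

digits = expand_range('0', '9')
print(digits)  # 0123456789

# A class written as a range and one listing the same characters by hand
# should agree on every input character.
range_class = '^[0-9]$'
explicit_class = '^[' + digits + ']$'

for char in ['0', '5', '9', 'a', ' ']:
    by_range = re.search(range_class, char) is not None
    by_list = re.search(explicit_class, char) is not None
    print(repr(char), by_range, by_list)
```

Both columns of output are the same on every line, so the range is only a shorter way of writing the same class.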


Now you can simplify the zip code example by specifying a range of digits:




>>> re.search('^[0-9][0-9][0-9][0-9][0-9]$', '12345')
<_sre.SRE_Match object at 0x01570C98>
>>> re.search('^[0-9][0-9][0-9][0-9][0-9]$', '1234')
>>> re.search('^[0-9][0-9][0-9][0-9][0-9]$', '123456')
>>>

This saves a lot of typing! Each character class matches any digit from 0 to 9. The same technique
also works with letters:




>>> re.search('[c-h]at', 'The cat is sleeping')
<_sre.SRE_Match object at 0x0154FC28>
>>> re.search('[c-h]at', "I'm getting too fat")
<_sre.SRE_Match object at 0x01570C98>
>>> re.search('[c-h]at', 'He hit the ball with the bat')
>>>