Python Programming for Raspberry Pi, Sams Teach Yourself in 24 Hours


Working with Regular Expressions in Your Python Scripts


It helps to actually see regular expressions in use to get a feel for how to use them in your own Python
scripts. Just looking at the quirky formats doesn’t help much; seeing some examples of how regular
expressions can match real data can help clear things up!


Try It Yourself: Use a Regular Expression
Follow these steps to implement a simple phone number validator script by using
regular expressions:


  1. Determine what regular expression pattern would match the data you’re trying to
    look for. For phone numbers in the United States, there are four common ways to
    display a phone number:
    (123)456-7890
    (123) 456-7890
    123-456-7890
    123.456.7890
    This leaves four possibilities for how a customer can enter a phone number in a
    form. The regular expression must be robust enough to handle all four.
    When building a regular expression, it’s best to start on the left side and build the
    pattern to match the characters you might run into. In this example, there may or may
    not be a left parenthesis in the phone number. You can match this by using the
    following pattern:
    ^\(?
    The caret indicates the beginning of the data. Since the left parenthesis is a special
    character, you must escape it to search for it as the character itself. The question
    mark indicates that the left parenthesis may or may not appear in the data to match.
    Next comes the three-digit area code. In the United States, area codes start with a
    digit from 2 through 9. (No area codes start with the digits 0 or 1.) To match the area
    code, you use this pattern:
    [2-9][0-9]{2}
    This requires that the first character be a digit between 2 and 9, followed by any two
    digits. After the area code, the ending right parenthesis may or may not be there:
    \)?
    After the area code there can be a space, no space, a dash, or a dot. You can group
    these by using a character group along with the pipe symbol:
    (| |-|\.)
    The very first pipe symbol appears immediately after the left parenthesis to match the
    no-space condition. You must use the escape character for the dot; otherwise, it will
    take on its special meaning and match any character.
    Next comes the three-digit phone exchange number, which doesn’t require anything
    special:
    [0-9]{3}
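
The pieces built up to this point can be tried out in Python before the pattern is finished. The following is a minimal sketch, not the finished validator: it joins only the fragments described above, so it checks just the start of a phone number. The names PARTIAL_PATTERN, UNESCAPED_DOT_PATTERN, and matches_so_far are made up for this illustration.

```python
import re

# Fragments described so far, joined left to right.
# These names are hypothetical; they are not part of the original script.
PARTIAL_PATTERN = (
    r"^\(?"            # optional, escaped left parenthesis at the start
    r"[2-9][0-9]{2}"   # area code: first digit 2-9, then any two digits
    r"\)?"             # optional, escaped right parenthesis
    r"(| |-|\.)"       # nothing, a space, a dash, or a literal dot
    r"[0-9]{3}"        # three-digit exchange number
)

# Same pattern with the dot left unescaped, to show why the escape matters.
UNESCAPED_DOT_PATTERN = r"^\(?[2-9][0-9]{2}\)?(| |-|.)[0-9]{3}"


def matches_so_far(text):
    """Return True if the start of text matches the partial pattern."""
    return re.search(PARTIAL_PATTERN, text) is not None


if __name__ == "__main__":
    samples = [
        "(555)456-7890",
        "(555) 456-7890",
        "555-456-7890",
        "555.456.7890",
        "155-456-7890",   # area code starts with 1
        "555x456-7890",   # 'x' is not an allowed separator
    ]
    for sample in samples:
        print(sample, matches_so_far(sample))
    # With the unescaped dot, the 'x' separator slips through.
    print(bool(re.search(UNESCAPED_DOT_PATTERN, "555x456-7890")))
```

Because the area-code fragment requires a first digit from 2 through 9, the sample numbers that begin with 123 are rejected by this partial pattern; the sketch uses 555 instead.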
