Programming Python, 4th Edition, Mark Lutz

Example 19-6. PP4E\Lang\cheader.py


"Scan C header files to extract parts of #define and #include lines"


import sys, re
pattDefine = re.compile(                          # compile to pattobj
    r'^#[\t ]*define[\t ]+(\w+)[\t ]*(.*)')       # "# define xxx yyy..."
                                                  # \w like [a-zA-Z0-9_]
pattInclude = re.compile(
    r'^#[\t ]*include[\t ]+[<"]([\w./]+)')        # "# include <xxx>..."


def scan(fileobj):
    count = 0
    for line in fileobj:                          # scan by lines: iterator
        count += 1
        matchobj = pattDefine.match(line)         # None if match fails
        if matchobj:
            name = matchobj.group(1)              # substrings for (...) parts
            body = matchobj.group(2)
            print(count, 'defined', name, '=', body.strip())
            continue
        matchobj = pattInclude.match(line)
        if matchobj:
            start, stop = matchobj.span(1)        # start/stop indexes of (...)
            filename = line[start:stop]           # slice out of line
            print(count, 'include', filename)     # same as matchobj.group(1)


if len(sys.argv) == 1:
    scan(sys.stdin)                               # no args: read stdin
else:
    scan(open(sys.argv[1], 'r'))                  # arg: input filename
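
Before running the script on a file, it may help to try the two patterns on a few strings in isolation. The following is a minimal sketch, not part of the book's example: the helper name classify and the variable names patt_define and patt_include are hypothetical, and the patterns are assumed to be the same ones compiled in cheader.py. It illustrates that optional whitespace after the # is tolerated, and that slicing the line with span(1) yields the same text as group(1).

```python
import re

# Assumed to mirror the patterns in cheader.py (names here are illustrative only)
patt_define = re.compile(r'^#[\t ]*define[\t ]+(\w+)[\t ]*(.*)')
patt_include = re.compile(r'^#[\t ]*include[\t ]+[<"]([\w./]+)')

def classify(line):
    """Hypothetical helper: describe one header line, or return None."""
    m = patt_define.match(line)
    if m:
        # group(1) is the macro name, group(2) the rest of the line
        return ('define', m.group(1), m.group(2).strip())
    m = patt_include.match(line)
    if m:
        start, stop = m.span(1)                  # offsets of the (...) group
        assert line[start:stop] == m.group(1)    # slice and group agree
        return ('include', m.group(1))
    return None                                  # neither pattern matched

samples = [
    '#define DEBUG\n',
    '#  define SPAM    1234\n',
    '#define ADDER(arg) 123 + arg\n',
    '#include <lib/spam.h>\n',
    '#  include   "Python.h"\n',
    '#ifndef TEST_H\n',
]

for text in samples:
    print(repr(text), '->', classify(text))
```

Because \w+ stops at the first character that is not a letter, digit, or underscore, a macro with arguments such as ADDER(arg) is split into the name ADDER and a body that begins with the parenthesized argument list.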


To test, let’s run this script on the text file in Example 19-7.


Example 19-7. PP4E\Lang\test.h


#ifndef TEST_H
#define TEST_H


#include <stdio.h>
#include <lib/spam.h>


#  include "Python.h"


#define DEBUG
#define HELLO 'hello regex world'


#  define SPAM 1234


#define EGGS sunny + side + up
#define ADDER(arg) 123 + arg
#endif


Notice the spaces after # in some of these lines; regular expressions are flexible enough
to account for such departures from the norm. Here is the script at work, picking out
#include and #define lines and their parts. For each matched line, it prints the line
number, the line type, and any matched substrings:

