Hacking Google Maps and Google Earth (ExtremeTech)


Chapter 5: Storing and Sharing Information


Listing 5-6: Reading Fixed-Width Files in Perl with Regular Expressions

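open(DATA, $ARGV[0]) or die "Cannot open file: $!";

# Each record is five fixed-width fields: 10 + 10 + 20 + 20 + 20 characters.
my $reclength = 10+10+20+20+20;
my $record;

# Read one complete record at a time.
while (read(DATA, $record, $reclength))
{
    # Capture each field by its width; the padding is still included.
    my ($id,$ref,$fname,$lname,$country)
        = ($record =~ m/(.{10})(.{10})(.{20})(.{20})(.{20})/);
    print "ID: $id\nRef: $ref\nFirst: $fname\nLast: $lname\nCountry: $country\n";
}

close(DATA);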


Both ways of reading fixed-width files are equally effective, although with very large files you
may find that the unpack() method is more efficient.
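To check the efficiency claim on your own data, you can time the two approaches against each other. The following is a minimal sketch, not part of the original listings: it assumes the core Benchmark module is available, and the sample record and the by_regex() and by_unpack() helper names are invented for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample record: two 10-character zero-padded numbers followed
# by three 20-character space-padded text fields (80 characters in total).
my $record = sprintf('%010d%010d%-20s%-20s%-20s', 42, 7, 'Jane', 'Doe', 'UK');

# Split the record with the regular expression used in Listing 5-6.
sub by_regex {
    my ($rec) = @_;
    return $rec =~ m/(.{10})(.{10})(.{20})(.{20})(.{20})/;
}

# Split the record with unpack(), one 'a' (raw string) field per column.
sub by_unpack {
    my ($rec) = @_;
    return unpack('a10a10a20a20a20', $rec);
}

# Run each parser for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex  => sub { my @fields = by_regex($record) },
    unpack => sub { my @fields = by_unpack($record) },
});
```

The figures depend on your Perl version and hardware, so treat the table as a guide for your own files rather than a general result.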


Note that in both examples the fields still contain all the data: I haven’t removed the padding
zeros or spaces. You can fix that by using the int() function to convert each numeric field into
an integer; the function automatically drops leading zeros because they do not add
anything to the number. For the text fields, you can use a simple regular expression (embedded
into a function for ease of use). Listing 5-7 shows the full script for this example.
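Before the full script, the two clean-up steps can be tried in isolation. This is a minimal sketch under stated assumptions: trim_padding() is a hypothetical stand-in for a space-stripping function (it is not the despace() function that Listing 5-7 calls), and the sample values are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: remove leading and trailing whitespace from a field.
sub trim_padding {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

# int() treats the string as a decimal number, so the leading zeros vanish.
my $id = int('0000000042');

# A 20-character text field padded on the right with spaces.
my $fname = trim_padding('Jane                ');

# The brackets make any leftover padding visible in the output.
print "ID: $id\nFirst: [$fname]\n";
```

Run as written, this prints ID: 42 and First: [Jane], showing that both kinds of padding have gone.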


Listing 5-7: Removing Padding Data

open(DATA, $ARGV[0]) or die "Cannot open file: $!";

my $reclength = 10+10+20+20+20;
my $record;

while (read(DATA, $record, $reclength))
{
    my ($id,$ref,$fname,$lname,$country) = unpack('a10a10a20a20a20', $record);

    # int() drops the leading padding zeros from the numeric fields.
    $id = int($id);
    $ref = int($ref);

    # despace() removes the padding spaces from the text fields.
    $fname = despace($fname);
    $lname = despace($lname);
    $country = despace($country);

    print "ID: $id\nRef: $ref\nFirst: $fname\nLast: $lname\nCountry: $country\n";


Continued