Hacking Gmail


Chapter 13: Building an API from the HTML-Only Version


prints the results out to the screen. You will need to save the Inbox source as ‘gmailinboxsource.html’ in the same directory as this script. You’ll use these results in a more meaningful way later.
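
Listing 13-3 leans on two HTML::TokeParser calls, get_tag and get_trimmed_text, so it may help to see what they hand back before reading it. The following is a minimal sketch, not the book's code: the one-row table and the value thread-001 are hypothetical stand-ins, and the real Inbox markup is more deeply nested.

```perl
#!/usr/bin/perl

use warnings;
use strict;
use HTML::TokeParser;

# Hypothetical stand-in for a single Inbox row (assumption: the real
# Gmail markup differs and is far more deeply nested).
my $html = <<'END_HTML';
<table><tr>
<td><input type="checkbox" name="t" value="thread-001"></td>
<td>Example subject</td>
</tr></table>
END_HTML

# Passing a reference to a string parses that string instead of a file.
my $stream = HTML::TokeParser->new( \$html )
    or die "Could not create the parser";

# get_tag("input") skips ahead to the next <input> start tag and returns
# an array reference of the form [ $tag, \%attr, \@attrseq, $text ].
my $input_tag = $stream->get_tag("input");
my $tag_name  = $input_tag->[0];          # the tag name, in lowercase
my $threadid  = $input_tag->[1]{value};   # attributes live in a hash

# get_trimmed_text() returns the text up to the next tag. Here the next
# token is </td>, so it returns an empty string and the || fallback wins.
my $starred = $stream->get_trimmed_text() || "Not Starred";

# End tags are reported with a leading slash in the tag name.
my $closing = $stream->get_tag("/table")->[0];

print "tag: $tag_name\n";
print "thread: $threadid\n";
print "starred: $starred\n";
print "closing: $closing\n";
```

The `|| "Not Starred"` idiom works because an empty string is false in Perl, so an empty cell falls through to the default label.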


Listing 13-3: Walking Through the Inbox with HTML::TokeParser

#!/usr/bin/perl


use warnings;
use strict;
use HTML::TokeParser;


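open( FILEIN, "gmailinboxsource.html" )
    or die "Cannot open gmailinboxsource.html: $!";
undef $/;    # slurp mode: read the whole file in one go
my $filecontents = <FILEIN>;
close(FILEIN);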


my $stream = HTML::TokeParser->new( \$filecontents );


# Go to the right part of the page, skipping 8 tables (!!!)


$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");
$stream->get_tag("table");


# Now we loop through the table, getting the dates and
# locations. We need to stop at the bottom of the table, so we
# test for a closing /table tag.


PARSE: while ( my $tag = $stream->get_tag ) {

    my $nexttag = $stream->get_tag->[0];
    last PARSE if ( $nexttag eq 'table' );
    $stream->unget_token();

    # The value attribute of the row's input tag holds the thread ID.
    my $input_tag = $stream->get_tag("input");
    my $threadid  = $input_tag->[1]{value};

    # An empty cell means the thread has no star.
    my $starred = $stream->get_trimmed_text() || "Not Starred";


Continued