The most common use of awk is to manipulate files that consist of fields
separated by delimiters, such as a comma-separated values (CSV) file output
from a spreadsheet program or a configuration file that assigns default values
to program variables.
You define the delimiter that awk will look for, and it then assigns each item
on a line to an internal awk variable. For example, suppose you have a file of
parameters and values in which each line lists a parameter, then an equals
sign as the delimiter, then a value. When you name the equals sign as the
delimiter in your awk command, the parameter is assigned to $1 and the value
to $2.
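As a minimal sketch of the parameter-and-value case, the following creates a small sample file and splits each line at the equals sign. The filename settings.conf and its contents are made up for illustration and do not come from any real program.

```shell
# Create a small sample file (hypothetical name and contents).
cat > settings.conf <<'EOF'
color=blue
timeout=30
EOF

# -F'=' tells awk to split each line at the equals sign.
# $1 then holds the parameter name and $2 holds its value.
awk -F'=' '{print "parameter:", $1, "value:", $2}' settings.conf
```

Each comma in the print statement puts a single space between the items in the output.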
Most files contain more than lists of parameters and values, though. Suppose
you had a comma-delimited file containing the names of things on your desk, a
category for each, a color for each, and the date you last picked up each
item. That is four columns: name, category, color, and date. If you only
really cared about the names and dates, you could use awk to process the file
quickly and list just these, like this:
matthew@seymour:~$ awk -F',' '{print $1, "was last picked up on", $4}' deskstuff.txt
The output would be displayed on the screen (but could be redirected to a file)
and would contain a list of only the information you wanted. In the preceding
command, -F defines the delimiter, which is enclosed in single quotes (' '),
and the pair of braces { } within a second set of single quotes defines what
to output: first field 1 ($1), then the text was last picked up on, followed
by field 4 ($4). The text file to process is named at the end.
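To see the command at work, here is a sketch that runs it against sample data and redirects the result to a file. The contents of deskstuff.txt and the output filename pickups.txt are assumptions made for illustration.

```shell
# Hypothetical sample contents for deskstuff.txt: name,category,color,date
cat > deskstuff.txt <<'EOF'
stapler,office,red,2023-11-02
mug,kitchen,white,2024-01-15
EOF

# Print columns 1 and 4, sending the result to a file instead of the screen.
awk -F',' '{print $1, "was last picked up on", $4}' deskstuff.txt > pickups.txt

# Show what was written.
cat pickups.txt
```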
You can define multiple delimiters by placing them in square brackets, like
this: -F'[;,-]'. You can also control how the output text is formatted, output
placeholders when fields are blank, and place several awk statements in a file
and then run that file as if it were a shell script.
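The following sketch combines these ideas under a few assumptions: the filenames mixed.txt and desk.awk, the sample data, and the n/a placeholder are all invented for illustration. The awk statements are stored in a file and run with awk's -f option, which is one common way to run a stored awk program.

```shell
# Sample data (hypothetical) mixing three delimiters.
# The second line has an empty color field.
cat > mixed.txt <<'EOF'
stapler;office,red-2023
mug,kitchen;-2024
EOF

# Store several awk statements in a file (hypothetical name desk.awk).
cat > desk.awk <<'EOF'
# Split fields on a semicolon, comma, or hyphen.
BEGIN { FS = "[;,-]" }
{
    # Substitute a placeholder when the color field is blank.
    color = ($3 == "") ? "n/a" : $3
    # printf sets column widths: name padded to 10, color padded to 8.
    printf "%-10s %-8s %s\n", $1, color, $4
}
EOF

# Run the stored statements against the data file.
awk -f desk.awk mixed.txt
```

Setting FS in a BEGIN block has the same effect as passing -F'[;,-]' on the command line, which keeps the delimiter definition inside the script file.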
To close this introduction, say that you have a large text file containing
delimited lists of data, and that file contains far more information than you
need. After you extract the data you need from the file, you want to replace a
subset of that data with a different value. Give yourself a minute to think at a
high level about how you might be able to process the file through awk, then
pipe it into sed, and then redirect the output to a file. Think about how long
that would take you to perform by hand or even with most programming
languages, like Python or Perl. Consider how long those programs would be
and, in the case of Perl, how difficult it might be to read it later. Now you