AWK programming lesson 2

The full syntax used in an awk program is something like

  PATTERN {command}

What this means is,

"For each line of input, go look and see if the PATTERN is present. If it is present, run the stuff between {}"

[If there is no pattern specified, the command gets called for EVERY line]

A specific example:

  awk '/#/ {print "Got a comment in the line"}' /etc/hosts

will print out "Got a comment in the line" once for every line in /etc/hosts that contains at least one '#', ANYWHERE in the line.
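To try this without depending on what your own /etc/hosts contains, here is a minimal sketch that feeds a few made-up lines through the same pattern, and then through an action with no pattern at all. The sample input and the variable names (input, with_pattern, no_pattern) are illustrative assumptions, not part of the original lesson.

```shell
# Hypothetical sample input standing in for /etc/hosts.
input='127.0.0.1 localhost
# a full-line comment
10.0.0.5 myhost # trailing comment'

# With a pattern: the action runs only for lines containing a "#".
with_pattern=$(printf '%s\n' "$input" | awk '/#/ {print "Got a comment in the line"}')

# With no pattern: the action runs for every line of input.
no_pattern=$(printf '%s\n' "$input" | awk '{print "saw a line"}')

printf '%s\n' "$with_pattern"
printf '%s\n' "$no_pattern"
```

Two of the three sample lines contain a '#', so the first command prints its message twice, while the second prints once per input line.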

The '//' bit in the pattern is one way to specify matching. There are also other ways to specify whether a line matches. For example,

  $1 == "#" {print "got a lone, leading hash"}

will match lines where the first column is a single '#'. The '==' means an EXACT MATCH of the ENTIRE column 1.

On the other hand, if you want a partial match of a particular column, use the '~' operator

  $1 ~ /#/ {print "got a hash, SOMEWHERE in column 1"}


Input of "# comment" will get matched
Input of " # comment" will ALSO get matched, since awk skips leading whitespace when it splits a line into columns
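The difference between '==' and '~' is easiest to see side by side. The following is a rough sketch on made-up input; it prints NR (awk's built-in counter for the current line number, not introduced in this lesson) so you can see which lines each test picks up. The variable names are just for illustration.

```shell
# Made-up input: a lone hash, an indented lone hash,
# a hash glued to a word, and a hash in a later column.
input='# comment
 # comment
#comment
host # trailing'

# Exact match: column 1 must be the single character "#".
exact=$(printf '%s\n' "$input" | awk '$1 == "#" {print NR}')

# Partial match: column 1 only has to contain a "#" somewhere.
partial=$(printf '%s\n' "$input" | awk '$1 ~ /#/ {print NR}')

echo "exact match on lines:   $(echo $exact)"
echo "partial match on lines: $(echo $partial)"
```

Line 3 shows the difference: its first column is "#comment", which contains a hash but is not exactly "#". Line 4 matches neither test, because its hash is not in column 1.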

If you specifically want to match "a line that begins with exactly # and a space", you should use

  /^# /  {do something}
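As a quick check of what the '^' anchor buys you, here is a small sketch on invented input, with a print action standing in for the "do something" placeholder. NR is awk's built-in line number; the input and variable name are assumptions for the demo.

```shell
# Made-up input: only the first line starts with "#" followed by a space.
input='# real comment
 # indented comment
#no space after hash
host # trailing comment'

# ^ anchors the match to the very start of the line,
# so only line 1 qualifies.
anchored=$(printf '%s\n' "$input" | awk '/^# / {print "line " NR ": " $0}')

printf '%s\n' "$anchored"
```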

Multiple matching

Awk will process ALL PATTERNS that match the current line. So if the following example is used,

  awk '
     /#/ {print "Got a comment"}
     $1 == "#" {print "got comment in first column"}
     /^# /  {print "Found comment at beginning"}
   ' /etc/hosts

you will get THREE printouts for a line like
# This is a comment
TWO printouts for
  # This is an indented comment
and only one for
hostname # a final comment
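To see the counts for yourself without touching /etc/hosts, the sketch below runs the same three pattern/action pairs over just those three sample lines. The only changes are assumptions made for the demo: the input comes from a shell variable, and each message is prefixed with NR (awk's built-in line number) so you can tell which line produced it.

```shell
# The three sample lines from the text, held in a shell variable.
input='# This is a comment
  # This is an indented comment
hostname # a final comment'

# Every pattern that matches a line fires, in the order written.
out=$(printf '%s\n' "$input" | awk '
     /#/ {print NR ": Got a comment"}
     $1 == "#" {print NR ": got comment in first column"}
     /^# /  {print NR ": Found comment at beginning"}
   ')

printf '%s\n' "$out"
```

The output has three messages tagged with line 1, two tagged with line 2, and one tagged with line 3.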

Keeping track of context

Not all lines are created equal, even if they look the same. Sometimes you want to do something with a line, based on lines that came before it.

Here is a quick example that prints "ADDR" lines, if you are not in a "secret" section

   awk '

   /secretstart/  	{ secret=1}
   /ADDR/		{ if(secret==0) print $0 }   # $0 is the entire line
   /secretend/		{ secret=0} '

This prints lines that have "ADDR" in them, except while inside a section opened by a "secretstart" string and not yet closed by "secretend". ORDER MATTERS. For example, if the program was instead written as

   awk '

   /ADDR/		{ if(secret==0) print $0 }   # $0 is the entire line
   /secretstart/  	{ secret=1}

   /secretend/		{ secret=0} '

and given the following input

ADDR a normal addr
secretstart ADDR a secret addr
ADDR another secret addr
a third secret ADDR
secretend
ADDR normal too

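it would PRINT OUT the first "secret" addr, because the /ADDR/ test runs on that line before the secretstart rule has had a chance to set the flag. The original would keep all of the secrets quiet.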
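Here is a minimal sketch that runs both orderings on the same input, so the leak is visible directly. It assumes the secret section is closed by an explicit "secretend" line before the final ADDR line; the variable names good and leaky are just labels for this demo.

```shell
# Sample input modeled on the text, with an explicit "secretend" line
# (an assumption) so the last ADDR line falls outside the secret section.
input='ADDR a normal addr
secretstart ADDR a secret addr
ADDR another secret addr
a third secret ADDR
secretend
ADDR normal too'

# Flag is set before the ADDR test: nothing secret is printed.
good=$(printf '%s\n' "$input" | awk '
   /secretstart/ { secret=1 }
   /ADDR/        { if (secret==0) print $0 }   # $0 is the entire line
   /secretend/   { secret=0 }')

# ADDR test runs before the flag is set: the secretstart line leaks.
leaky=$(printf '%s\n' "$input" | awk '
   /ADDR/        { if (secret==0) print $0 }
   /secretstart/ { secret=1 }
   /secretend/   { secret=0 }')

printf 'flag first:\n%s\n' "$good"
printf 'ADDR first:\n%s\n' "$leaky"
```

The variable secret is never initialized; awk treats an unset variable as 0 in a numeric comparison, which is why the first ADDR line prints in both versions.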
