AWK programming

Awk is essentially a stream editor, like sed. You can pipe text to it, or have it read from a file, and it processes that text line by line. It is also a programming language, which means it can do anything sed can do, and a lot more. (But you might have to type more :-)

Unlike sed, it can remember context, do comparisons, and do most of the things a full programming language can do. For example, it isn't limited to single lines: it can JOIN multiple lines, if you do things right.
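As a rough sketch of the line-joining idea: the snippet below remembers each odd-numbered line in a variable, then prints it together with the even-numbered line that follows. The sample input and the variable name "held" are made up for illustration; NR is awk's built-in count of lines read so far.

```shell
# Join each pair of input lines into one output line.
# "held" is an illustrative variable name; NR is the built-in line counter.
joined=$(printf 'alpha\nbeta\ngamma\ndelta\n' |
  awk '{
    if (NR % 2 == 1) {
        held = $0          # odd line: remember it for later
    } else {
        print held, $0     # even line: print the remembered line plus this one
    }
  }')
echo "$joined"
```

With that input, the four lines come out as two: "alpha beta" and "gamma delta".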

The simplest form of awk is a one-liner:


  awk '{ do-something-here }'

The "do-something-here" can be a single print statement, or something much more complicated. It gets somewhat 'C'-like. Simple example:


  awk '{print $1,$3}'

will print out the first and third columns, where columns are defined as "things between whitespace" (whitespace being tabs or spaces). Complicated example:


awk '{ 	if ( $1 == "start") {		# "==" compares; "=" would assign
		start=1;
		print "started";
		if ( $2 != "" )	{
			print "Additional args:",$2,$3,$4,$5
		}
		next;				# skip to the next input line
	}
	if ( $1 == "end") {
		print "End of section";
		printf ("Summary: %d,%d,%d (first, second, equal)\n",
			firstcol, secondcol, tied);
		firstcol=0;
		secondcol=0;
		tied=0;
		start=0;
	}
	if ( start >0) {
		if ( $1 > $2 ) {
			firstcol= firstcol+1
		}else 
		if ( $2 > $1 ) {
			secondcol= secondcol+1
		}else
			tied= tied+1		# plain variable, so no "$"
	}
}'

Okay, I didn't test that. It's just for discussion purposes:-) [See below for what the example does]

Key points to remember about variables:

Variables that represent field positions, aka columns, are referenced with '$'.

You can actually alter a field value. Whether you are reading a field or setting it, you still keep the '$'.

Other variables do NOT use '$'.

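As in shell scripting, variables do not need to be declared. A variable that has never been set acts as 0 when used as a number, and as an empty string when used as text.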
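Here is a small sketch of those points, using made-up sample input. Fields are read and written with '$', while the ordinary variable "count" (a name chosen here purely for illustration) takes no '$' and is never explicitly initialized.

```shell
# Fields use "$"; ordinary variables do not.
result=$(printf 'one two three\nfour five six\n' |
  awk '{
    $2 = "X"               # overwrite the second field of this line
    count = count + 1      # plain variable: no "$", acts as 0 before first use
    print count, $1, $2, $3
  }')
echo "$result"
```

This should print "1 one X three" and then "2 four X six": the second column is replaced on every line, and the counter keeps its value from one line to the next.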

What does that long example do? Well, I believe it should look at input line by line, and watch for blocks of lines enclosed by these two markers:


start
end


Between those markers, it expects two columns of numbers. It keeps track of how many lines have the first number greater, how many have the second number greater, and how many are tied. Once it hits an 'end' marker, it prints out the tally and zeros the counters.

So here you get a good example of context.
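To make that concrete, here is a trimmed-down sketch of the same tally logic, run against a tiny invented input. It leaves out the "started" and "End of section" messages, uses "==" for the comparisons, and uses "next" to move on to the following input line. Treat it as an illustration of the idea rather than the definitive version.

```shell
# Sample input (invented): one start/end block containing three data lines.
summary=$(printf 'start\n5 3\n2 9\n4 4\nend\n' |
  awk '{
    if ($1 == "start") {
        start = 1          # begin counting
        next               # nothing more to do for this line
    }
    if ($1 == "end") {
        # report the tally, then reset everything for the next block
        printf("Summary: %d,%d,%d (first, second, equal)\n", firstcol, secondcol, tied)
        firstcol = 0; secondcol = 0; tied = 0; start = 0
        next
    }
    if (start > 0) {
        if ($1 > $2) {
            firstcol = firstcol + 1
        } else if ($2 > $1) {
            secondcol = secondcol + 1
        } else {
            tied = tied + 1
        }
    }
  }')
echo "$summary"
```

For this input, one line has the first number greater, one has the second greater, and one is tied, so the output is "Summary: 1,1,1 (first, second, equal)". The counters survive from line to line, which is exactly the "context" that a plain line-by-line edit cannot keep.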

I think that's enough for folks to digest. End of awk lesson 1.

Author: phil@bolthole.com