Using awk to investigate files on Unix

Today, I worked with the Unix awk utility. It is a very powerful tool for investigating text files on a Unix platform. It is invoked from the terminal command line, and the command must start with awk.

The keyword awk is followed by a script enclosed in single quotes. After the closing quote comes the name of the text file to process (say ww-ii-data.txt).

When some items need to be initialised, we use the BEGIN clause. Its actions follow the keyword BEGIN and are placed between braces {}.
After that, lines can be selected with a pattern between slashes. The actions for the selected lines are also placed between braces. Finally, after the keyword END, an end clause may be included; it runs once after all lines have been read. We then have:

awk 'BEGIN {} /selection/ {} END {}' file
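
To make this template concrete, here is a minimal sketch that counts the lines containing a given word. The sample data and the variable names (sample, result, n) are made up for illustration only.

```shell
# Hypothetical sample data, created only for this illustration
sample=$(mktemp)
printf 'alpha 1\nbeta 2\nalpha 3\n' > "$sample"

# BEGIN runs once before the input, the /alpha/ action runs for every
# line containing "alpha", and END runs once after the last line.
result=$(awk 'BEGIN { n = 0 } /alpha/ { n++ } END { print "matches:", n }' "$sample")
echo "$result"

rm -f "$sample"
```

Two of the three sample lines contain alpha, so this prints "matches: 2".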

As an example:

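awk '
# Runs once before any input is read
BEGIN {count=0; max=0}
# The empty pattern // matches every line
//{
   temp = substr($0,37,3) + 0;   # characters 37-39, forced to a number
   count++;
   if (max < temp)
      max = temp
}
END {print "lines:  ", count, " max in Celsius", (5/9)*(max-32);}
'   ww-ii-data.txt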
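
To see what substr($0,37,3) picks up, the sketch below runs the same logic on two made-up fixed-width lines in which columns 37 to 39 hold a temperature in Fahrenheit. The sample lines, the station labels and the shell variable names are assumptions for illustration; the real layout of ww-ii-data.txt may differ.

```shell
sample=$(mktemp)
# Pad a label to 36 characters so the 3-character value starts at column 37
printf '%-36s%s\n' 'station-a' '104' >  "$sample"
printf '%-36s%s\n' 'station-b' ' 50' >> "$sample"

result=$(awk '
BEGIN { count = 0; max = 0 }
{
   temp = substr($0, 37, 3) + 0   # adding 0 turns the text " 50" into the number 50
   count++
   if (max < temp)
      max = temp
}
END { printf "lines: %d max in Celsius: %.1f\n", count, (5/9) * (max - 32) }
' "$sample")
echo "$result"

rm -f "$sample"
```

The highest value is 104, and (5/9)*(104-32) is 40, so the output is "lines: 2 max in Celsius: 40.0".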

I noticed that variables can be used without any declaration: an unset variable acts as 0 in arithmetic and as an empty string in text. Nice.
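
A small sketch of this behaviour: the variables total and n (names chosen here for illustration) are never initialised, yet they start out as 0 when used in arithmetic.

```shell
# Sum the numbers 1, 2 and 3 without initialising total or n anywhere
result=$(printf '1\n2\n3\n' | awk '{ total += $1; n++ } END { print total, n }')
echo "$result"
```

This prints "6 3": the sum of the three values and the number of lines read.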

An alternative programme works on a file whose columns are separated by commas. In that case, the separator must be set in the BEGIN clause by assigning it to FS, for example FS="," for a comma. Once that is done, the columns are available as $1, $2, etc., so each column can be accessed directly by its number.

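awk '
# FS sets the field separator, so $1, $2, ... are the comma-separated columns
BEGIN {count=0; max=0; FS=","}
// {
   temp = $3 + 0;   # third column, forced to a number
   count++;
   if (max < temp)
      max = temp
}
END {print "lines:  ", count, " max ", max;}
'   /home/hadoop/a.csv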
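
As a side note, the separator can also be given on the command line with the -F option instead of setting FS in BEGIN. The sketch below compares both forms on a made-up three-row CSV; the data and the shell variable names are assumptions for illustration.

```shell
sample=$(mktemp)
# Hypothetical CSV rows: name, code, value
printf 'a,x,10\nb,y,42\nc,z,7\n' > "$sample"

# Separator set inside the script through FS
with_fs=$(awk 'BEGIN { FS = "," } { if (max < $3 + 0) max = $3 + 0 } END { print max }' "$sample")

# Same separator given with the -F option
with_flag=$(awk -F, '{ if (max < $3 + 0) max = $3 + 0 } END { print max }' "$sample")

echo "$with_fs $with_flag"
rm -f "$sample"
```

Both commands report 42, the largest value in the third column.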

Finally, a statement to remove end-of-line characters in a UNIX file:

 {Processed_File}
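
As a hedged sketch only, and not necessarily the statement meant above: assuming the goal is to join all lines of a file into one, awk can do this with printf, which, unlike print, does not add a newline after each line. The file names and contents here are placeholders.

```shell
input=$(mktemp)
output=$(mktemp)
printf 'one\ntwo\nthree\n' > "$input"

# printf "%s" writes each line without the newline that print would add
awk '{ printf "%s", $0 }' "$input" > "$output"

joined=$(cat "$output")
newlines=$(wc -l < "$output" | tr -d ' ')
echo "$joined ($newlines newlines)"

rm -f "$input" "$output"
```

The output file then holds the single string onetwothree with no newline characters left.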

By tom