AWK is a programming language designed for text processing. Its name comes from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK is available by default on most Linux distributions; we can check whether it is present with the “which” command.
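The availability check can be sketched as follows. This is a minimal example that uses `command -v`, the shell built-in counterpart of “which”, on the assumption that it is available on more minimal systems; the variable name awk_path is only illustrative.

```shell
# Look up awk on the PATH; command -v prints the location if it is found.
if awk_path=$(command -v awk); then
    echo "awk found at: $awk_path"
else
    echo "awk is not installed"
fi
```

Running `which awk` directly gives the same information on systems where “which” is installed.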
AWK is very simple to use. It can be used either directly from the command line or by executing a text file containing AWK commands.
To install AWK on a Debian-based system, use the apt package manager:
sudo apt-get update
sudo apt-get install gawk
On an RPM-based system, we can use the yum package manager to install AWK as shown below:
sudo yum install gawk
Program Structure
The program structure of AWK is as follows:
- BEGIN Block
- Body
- END Block
The following diagram depicts the program structure of AWK in a more precise manner:
BEGIN Block
This is the first section of the program to be executed, and it is executed only once, before any input is read. BEGIN is a keyword and must be written in uppercase letters. It is mainly used to initialize variables. This section is optional.
The syntax of BEGIN block is as follows:
BEGIN {awk-commands}
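As a small illustration of the BEGIN block, the sketch below initializes a variable and prints it. The variable name total is only an example. Because the program consists of a BEGIN block alone, awk does not need an input file.

```shell
# BEGIN runs once, before any input is read, so no input file is given.
out=$(awk 'BEGIN { total = 0; print "Starting with total =", total }')
printf '%s\n' "$out"
# prints: Starting with total = 0
```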
Body Block
This block performs three steps as follows:
Step | Purpose |
Read | Read a line from the input stream and store it in memory |
Execute | Execute the AWK commands on that line. To restrict the commands to certain lines, supply a pattern |
Repeat | Repeat the above two steps until the end of the input |
The syntax of Body block is as follows:
pattern {actions}
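To see how a pattern restricts the action, here is a minimal sketch. The three sample rows (serial number, item, quantity) are hypothetical data made up for this illustration, not the contents of any real file.

```shell
# Hypothetical sample data: serial number, item, quantity.
items='1 rice 5
2 sugar 3
3 salt 1'

# The pattern $3 > 2 selects only rows whose third column exceeds 2;
# the action then prints the second column of those rows.
out=$(printf '%s\n' "$items" | awk '$3 > 2 { print $2 }')
printf '%s\n' "$out"
# prints:
# rice
# sugar
```

If the pattern is omitted, the action runs on every line.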
END Block
As its name indicates, the END block is executed at the end of the program, once all input has been processed. END is a keyword and must be written in uppercase letters. This block is also optional.
The syntax of END block is as follows:
END {awk-commands}
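The three blocks can be combined in one program. The sketch below sums the third column of some hypothetical sample data; the data and the variable name total are assumptions made for illustration.

```shell
# Hypothetical sample data: serial number, item, quantity.
items='1 rice 5
2 sugar 3
3 salt 1'

out=$(printf '%s\n' "$items" | awk '
BEGIN { total = 0 }                        # runs once, before the first line
      { total += $3 }                      # runs for every input line
END   { print "Total quantity:", total }   # runs once, after the last line
')
printf '%s\n' "$out"
# prints: Total quantity: 9
```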
How to execute AWK commands?
As mentioned above, AWK commands can be given on the command line or read from a text file. To specify AWK commands on the command line, use the following pattern:
awk [options] 'commands' file
Here 'commands' is the AWK program, enclosed in single quotes, and “file” is the file on which the commands are to be executed. Consider the following example for more clarification.
Example 1: Consider a text file list.txt as follows:
Suppose we need to print the entire contents of this text file (list.txt), which contains a list of grocery items and their quantities. The AWK command to do so is as follows:
awk '{print}' list.txt
To execute AWK commands stored in a file, use the following pattern:
awk [options] -f command-file file
Example 2:
To run the previous example from a file, first create a text file (commands.awk) containing the command:
{print}
Now we can instruct AWK to read its commands from the text file and perform the required task. This is done with the -f option of the awk command.
awk -f commands.awk list.txt
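The whole workflow can be tried end to end with the sketch below. It works in a temporary directory so that no existing file is overwritten, and the two data rows are hypothetical stand-ins for the real list.txt.

```shell
# Work in a temporary directory so that existing files are left alone.
workdir=$(mktemp -d)

# Hypothetical sample data standing in for list.txt.
printf '1 rice 5\n2 sugar 3\n' > "$workdir/list.txt"

# The command file holds the AWK program itself, without shell quoting.
printf '{ print }\n' > "$workdir/commands.awk"

out=$(awk -f "$workdir/commands.awk" "$workdir/list.txt")
printf '%s\n' "$out"
# prints:
# 1 rice 5
# 2 sugar 3

rm -r "$workdir"
```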
Some other examples of awk are as follows:
Example 3:
Print the second column of list.txt
awk '{print $2}' list.txt
Here $2 represents the second column. $1, $2, $3, … represent the first, second, third, … columns of a row respectively, where columns are separated by whitespace by default. To print an entire row, use $0. The above example prints the second column of each row.
To print the 2nd and 3rd columns, use the command:
awk '{print $2, $3}' list.txt
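To make the effect visible, here is the same command applied to a few hypothetical rows (serial number, item, quantity) piped in on standard input; the data is invented for this illustration.

```shell
# Hypothetical sample data: serial number, item, quantity.
items='1 rice 5
2 sugar 3
3 salt 1'

# The comma in print separates the two columns with a single space.
out=$(printf '%s\n' "$items" | awk '{ print $2, $3 }')
printf '%s\n' "$out"
# prints:
# rice 5
# sugar 3
# salt 1
```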
Example 4:
Use the ‘if’ statement with awk
awk '{if ($1=="2") print $0;}' list.txt
Here it checks whether the 1st column equals “2”. If so, it prints the entire row.
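The same test can also be written as a pattern, which is often the more idiomatic form. The sketch below runs both versions on hypothetical sample data; the variable names with_if and with_pattern are only illustrative.

```shell
# Hypothetical sample data: serial number, item, quantity.
items='1 rice 5
2 sugar 3
3 salt 1'

# Explicit if statement inside the action.
with_if=$(printf '%s\n' "$items" | awk '{ if ($1 == "2") print $0 }')

# Equivalent pattern with no action; the default action prints the row.
with_pattern=$(printf '%s\n' "$items" | awk '$1 == "2"')

printf '%s\n' "$with_if" "$with_pattern"
# prints:
# 2 sugar 3
# 2 sugar 3
```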
Example 5:
Use a ‘for’ loop with awk
awk 'BEGIN { for(i=1;i<=5;i++) print "Cube of", i, "is",i*i*i; }'
The output will be as follows:
Cube of 1 is 1
Cube of 2 is 8
Cube of 3 is 27
Cube of 4 is 64
Cube of 5 is 125
Built-in Variables
The built-in variables used in awk are as follows:
Variables | Purpose |
$0, $1, $2, … | Entire row, first column, second column, … |
FS | Input field separator |
OFS | Output field separator |
NF | Number of fields in the current record |
NR | Number of records read so far |
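The following sketch shows these variables working together. It assumes hypothetical comma-separated input and picks “:” as the output separator purely for illustration.

```shell
# Hypothetical comma-separated sample data: item, quantity.
csv='rice,5
sugar,3'

# FS splits each record on commas; OFS joins the printed values with colons.
# NR is the current record number and NF the number of fields in that record.
out=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = ","; OFS = ":" } { print NR, NF, $1, $2 }')
printf '%s\n' "$out"
# prints:
# 1:2:rice:5
# 2:2:sugar:3
```

Setting FS and OFS in the BEGIN block ensures they are in effect before the first record is split.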
I believe you now have a basic idea of AWK and its uses. It is a very powerful filter. I have covered only the basic features here; AWK offers much more, and it becomes handy only with practice.