awk
awk linux command cheatsheet by Thamizhiniyan C S
Introduction
Awk is a scripting language used for manipulating data and generating reports. It requires no compiling, and lets the user work with variables, numeric functions, string functions, and logical operators.
Syntax
awk [flags] 'pattern { action }' [input file ...]
Program Structure
BEGIN Block
BEGIN {awk-commands}
BODY Block
/pattern/ {awk-commands}
END Block
END {awk-commands}
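The three blocks can be combined in one program. The following is a minimal sketch; the sample input (fruit names and counts) is made up for illustration and is not from the original cheatsheet.

```shell
# Hypothetical sample input: a name and a count per line.
result=$(printf 'apple 3\nbanana 5\napple 4\n' | awk '
BEGIN   { print "start" }                  # runs once, before any input is read
/apple/ { total += $2 }                    # runs for every record matching /apple/
END     { print "apple total =", total }   # runs once, after the last record
')
echo "$result"
```

This should print "start" followed by "apple total = 7", since only the two apple lines reach the body block.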
Important Flags
-F
With this flag you can specify the input FIELD SEPARATOR (FS) on the command line, so you don't need to set FS in a BEGIN block
-v
Assigns a value to a variable before the program starts, e.g. awk -v OFS=":" '{print $1,$2}' file.txt, so you don't need BEGIN{OFS=":"}
-D
gawk only: runs your .awk script under the interactive debugger (gawk -D -f script.awk)
-o
gawk only: writes a pretty-printed version of the program to the given file (if no name is given after the flag, the output defaults to awkprof.out)
-f
Reads the AWK program source from the file program-file, instead of from the first command line argument
--dump-variables
gawk only: prints a sorted list of global variables and their final values to a file (awkvars.out if no file name is given)
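As a small illustration of -F and -v working together, here is a sketch that uses made-up passwd-style input. The variable name min and the sample records are assumptions chosen for the example.

```shell
# -F: sets the input field separator to a colon.
# -v OFS / -v min: assign awk variables before the program starts.
result=$(printf 'alice:x:1000\nbob:x:1001\n' |
  awk -F: -v OFS=' -> ' -v min=1001 '$3 >= min { print $1, $3 }')
echo "$result"
```

Only the second record has a third field of at least 1001, so the output should be "bob -> 1001".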
Built in Variables
ARGC
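It holds the number of arguments provided at the command line, counting the awk command name itself but not the options or the program text (the example below prints 5)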
awk 'BEGIN {print "Arguments =", ARGC}' One Two Three Four
ARGV
It is an array that stores the command-line arguments. The array's valid index ranges from 0 to ARGC-1
awk 'BEGIN {
for (i = 0; i < ARGC; ++i) {
printf "ARGV[%d] = %s\n", i, ARGV[i]
}
}' one two three four
CONVFMT
It represents the conversion format for numbers. Its default value is %.6g
awk 'BEGIN { print "Conversion Format =", CONVFMT }'
ENVIRON
It is an associative array of environment variables
awk 'BEGIN { print ENVIRON["USER"] }'
FILENAME
It represents the current file name
awk 'END {print FILENAME}' marks.txt
FS
It represents the (input) field separator. Its default value is a single space, which makes awk split records on runs of spaces, tabs, and newlines
awk 'BEGIN {print "FS = " FS}' | cat -vte
NF
It represents the number of fields in the current record
echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NF > 2'
NR
It represents the number of the current record
echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NR < 3'
FNR
It is similar to NR, but relative to the current file
awk '{print FILENAME, FNR, NR}' marks.txt marks.txt
OFMT
It represents the output format for numbers and its default value is %.6g
awk 'BEGIN {print "OFMT = " OFMT}'
OFS
It represents the output field separator and its default value is space
awk 'BEGIN {print "OFS = " OFS}' | cat -vte
ORS
It represents the output record separator and its default value is newline
awk 'BEGIN {print "ORS = " ORS}' | cat -vte
RLENGTH
It represents the length of the string matched by match function
awk 'BEGIN { if (match("One Two Three", "re")) { print RLENGTH } }'
RS
It represents (input) record separator and its default value is newline
awk 'BEGIN {print "RS = " RS}' | cat -vte
RSTART
It represents the first position in the string matched by match function
awk 'BEGIN { if (match("One Two Three", "Thre")) { print RSTART } }'
SUBSEP
It represents the separator character for array subscripts and its default value is \034
awk 'BEGIN { print "SUBSEP = " SUBSEP }' | cat -vte
$0
It represents the entire input record
awk '{print $0}' marks.txt
$n
It represents the nth field in the current record where the fields are separated by FS
awk '{print $3 "\t" $4}' marks.txt
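RSTART and RLENGTH are commonly paired with substr to pull the matched text out of a record. The sketch below assumes a made-up log line; the field names in it are illustrative only.

```shell
# match() sets RSTART (1-based start of the match) and RLENGTH (its length).
result=$(echo 'error code=404 path=/index' | awk '{
    if (match($0, /code=[0-9]+/)) {
        # substr extracts exactly the matched span from the record
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    }
}')
echo "$result"
```

Here the match starts at character 7 and is 8 characters long, so the output should be "7 8 code=404".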
printf statement
awk's printf statement works like printf in C, except that some awk implementations do not support the * specifier for dynamic width and precision (gawk does).
SYNTAX: printf format, expr1, expr2, ..., exprn
awk printf conversion characters
%c
single character
%d
decimal integer
%e
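floating point in scientific notation: [-]d.dddddde[+-]dd (the precision sets the number of digits after the decimal point)
%f
floating point in fixed notation: [-]ddd.dddddd (the precision sets the number of digits after the decimal point)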
%g
e or f conversion, whichever is shorter, with nonsignificant zeros suppressed
%o
unsigned octal number
%s
string
%x
unsigned hexadecimal number
%%
print a %; no argument is converted
printf examples
printf "%d", 99/2
49
printf "%e", 99/2
4.950000e+01
printf "%f", 99/2
49.500000
printf "|%6.2f|", 99/2
| 49.50|
printf "%g", 99/2
49.5
printf "%o", 99/2
61
printf "%06o", 99/2
000061
printf "%x", 99/2
31
printf "|%s|", "January"
|January|
printf "|%10s|", "January"
|   January|
printf "|%-10s|", "January"
|January   |
printf "|%.3s|", "January"
|Jan|
printf "|%10.3s|", "January"
|       Jan|
printf "|%-10.3s|", "January"
|Jan       |
printf "%%"
%
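Width and precision are most useful when lining up columns in a report. A minimal sketch, assuming made-up names and scores as input:

```shell
# %-6s left-justifies the name in 6 columns;
# %7.2f right-justifies the score in 7 columns with 2 decimal places.
result=$(printf 'Ann 91.5\nBob 78\n' | awk '{ printf "|%-6s|%7.2f|\n", $1, $2 }')
echo "$result"
```

Unlike print, printf adds no newline on its own, which is why the format string ends with \n.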
Examples
awk '{print}' file.txt
To simply print a file
awk '/ctf/' file.txt
To search for a pattern inside a file
awk '{print $1,$3}' file.txt
To print the 1st and 3rd fields of each line
awk '{print NR,$0}' file.txt
To number the lines
awk 'BEGIN {FS="o"} {print $2}' file.txt
Split based on the character 'o'
awk 'BEGIN {FS="o"} {print $1,$3} END {print "Total Rows=",NR}' file.txt
Split on the character 'o', print the 1st and 3rd fields, and print the total number of rows at the end
awk 'BEGIN {RS="o"} {print $0}' file.txt
Use 'o' as the record separator, so the input is split into rows at every 'o'
awk 'BEGIN {OFS=":"} {print $1,$2,$3,$4}' file.txt
To specify the field delimiter used in the output
awk 'BEGIN {ORS=":"} {print $0}' file.txt
To specify the record delimiter used in the output
awk '!($2 && $3 && $4) {print "Not all scores are available for " $1}'
Checks whether any of fields 2, 3, and 4 is empty or zero. If so, it prints a message indicating that not all scores are available for the item identified in the first field. Note that a score of 0 also triggers the message
awk '{
if ( $2 == "" || $3 == "" || $4 == "" ) {
print "Not all scores are available for", $1;
}
}'
Checks whether any of fields 2, 3, and 4 is an empty string. If so, it prints a message indicating that not all scores are available for the item identified in the first field. Unlike the previous example, a score of 0 does not trigger the message
awk '{
grade="Pass"
if ( $2<50 || $3<50 || $4<50) grade="Fail"
print $1,":",grade
}'
Evaluates student scores and assigns a grade of "Pass" or "Fail" based on whether any of their scores are below 50. It then prints the student identifier along with their grade
awk '{
average = ( $2 + $3 + $4 ) / 3
if ( average >= 80 ) print $0, ":", "A"
else if ( average >= 60 ) print $0, ":", "B"
else if ( average >= 50 ) print $0, ":", "C"
else print $0, ":", "FAIL"
}'
To identify the performance grade for each student. If the average of the three scores is 80 or more, the grade is 'A'. If the average is 60 or above, but less than 80, the grade is 'B'. If the average is 50 or above, but less than 60, the grade is 'C'. Otherwise the grade is 'FAIL'.
awk '{
printf "%s", $0
if ( NR % 2 == 0 ) printf "\n"
else printf ";"
}'
Joins every pair of input lines into one output line: each odd-numbered line is followed by a semicolon and each even-numbered line by a newline
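The two missing-score checks above behave differently when a score is 0. The sketch below compares them on made-up data; the student names, the scores, and the shell variable names are all assumptions for illustration.

```shell
# Hypothetical marks data: Ravi has only two scores, Meena has a score of 0.
input='Asha 80 90 85
Ravi 45 70
Meena 60 0 75'

# Truthiness test: an empty field and a numeric 0 are both false.
truthy=$(printf '%s\n' "$input" | awk '!($2 && $3 && $4) { print $1 }')

# Explicit empty-string test: only a missing field matches.
empty=$(printf '%s\n' "$input" | awk '$2 == "" || $3 == "" || $4 == "" { print $1 }')

echo "truthiness check flags: $truthy"
echo "empty-string check flags: $empty"
```

The truthiness check should flag both Ravi and Meena, while the empty-string check should flag only Ravi, so the explicit comparison is the safer choice when 0 is a valid score.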