This AWK cheat sheet is a practical reference for both new and experienced users, collecting essential commands, functions, and examples in an easy-to-follow format. It covers basic syntax and built-in variables through to error handling, text processing, and array manipulation, with tips for using AWK in shell scripts and data parsing tasks. Useful links at the end point to further resources on AWK programming.
The Basics
Have a try
$ awk -F: '{print $1, $NF}' /etc/passwd
Part | Meaning |
---|---|
-F: | Use a colon as the field separator |
{...} | Awk program (the action) |
print | Prints its arguments (the current record when given none) |
$1 | First field |
$NF | Last field |
/etc/passwd | Input data file |
Awk program
BEGIN {<initializations>}
<pattern 1> {<program actions>}
<pattern 2> {<program actions>}
...
END {< final actions >}
Example
awk '
BEGIN { print "\n>>>Start" }
!/(login|shutdown)/ { print NR, $0 }
END { print "<<<END\n" }
' /etc/passwd
Variables
           $1      $2/$(NF-1)      $3/$NF
            ▼          ▼             ▼
        ┌──────┬──────────────────┬───────┐
$0/NR ▶ │  ID  │  WEBSITE         │  URI  │
        ├──────┼──────────────────┼───────┤
$0/NR ▶ │  1   │  cheatsheets.zip │  awk  │
        ├──────┼──────────────────┼───────┤
$0/NR ▶ │  2   │  google.com      │  25   │
        └──────┴──────────────────┴───────┘
# First and last field
awk -F: '{print $1,$NF}' /etc/passwd
# With line number
awk -F: '{print NR, $0}' /etc/passwd
# Second last field
awk -F: '{print $(NF-1)}' /etc/passwd
# Custom string
awk -F: '{print $1 "=" $6}' /etc/passwd
See: Variables
Awk program examples
awk 'BEGIN {print "hello world"}' # Prints "hello world"
awk -F: '{print $1}' /etc/passwd # -F: Specify field separator
# /pattern/ Execute actions only for matched pattern
awk -F: '/root/ {print $1}' /etc/passwd
# BEGIN block is executed once at the start
awk -F: 'BEGIN { print "uid"} { print $1 }' /etc/passwd
# END block is executed once at the end
awk -F: '{print $1} END { print "-done-"}' /etc/passwd
Arrays
awk 'BEGIN {
fruits["banana"] = "yellow";
fruits["raspberry"] = "red"
for(fruit in fruits) {
print "The color of " fruit " is " fruits[fruit]
}
}'
See: Arrays
Functions
# => 5
awk 'BEGIN{print length("hello")}'
# => HELLO
awk 'BEGIN{print toupper("hello")}'
# => hel
awk 'BEGIN{print substr("hello", 1, 3)}'
See: Functions
Awk Variables
Built-in variables
Variable | Meaning |
---|---|
$0 | Whole line (current record) |
$1, $2...$NF | First, second… last field |
NR | Number of Records read so far |
NF | Number of Fields in the current record |
OFS | Output Field Separator (default " ") |
FS | Input Field Separator (default " ") |
ORS | Output Record Separator (default "\n") |
RS | Input Record Separator (default "\n") |
FILENAME | Name of the current input file |
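As a rough illustration of how several of these variables interact, the sketch below feeds two invented colon-separated records to awk through printf. The sample data is an assumption made only for the demonstration.

```shell
# Two made-up records: the first has 3 fields, the second has 2
printf 'a:b:c\nd:e\n' |
awk 'BEGIN { FS = ":"; OFS = "-" }   # split input on ":", join output with "-"
     { print NR, NF, $1, $NF }'      # record number, field count, first and last field
```

This should print `1-3-a-c` followed by `2-2-d-e`: the commas in `print` are replaced by OFS, and `$NF` follows the field count of each record.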
Expressions
Expression | Meaning |
---|---|
$1 == "root" | First field equals root |
{print $(NF-1)} | Second last field |
NR!=1{print $0} | From 2nd record |
NR > 3 | From 4th record |
NR == 1 | First record |
END{print NR} | Total records |
BEGIN{print OFMT} | Output format |
{print NR, $0} | Line number |
{print NR "\t" $0} | Line number (tab) |
{$1 = NR; print} | Replace 1st field with line number |
$NF > 4 | Last field > 4 |
NR % 2 == 0 | Even records |
NR==10, NR==20 | Records 10 to 20 |
BEGIN{print ARGC} | Total arguments |
ORS=NR%5?",":"\n" | Concatenate records |
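Two of the less obvious entries, the range pattern and the ORS assignment used as a pattern, can be tried on generated input. This sketch assumes the `seq` utility is available to produce the numbers 1 to 10.

```shell
# Range pattern: print from the record where NR==3 through the record where NR==5
seq 10 | awk 'NR==3, NR==5'

# Assignment as pattern: ORS becomes "," except on every 5th record,
# so the records are joined five per output line
seq 10 | awk 'ORS = NR % 5 ? "," : "\n"'
```

The first command should print 3, 4 and 5 on separate lines; the second should print `1,2,3,4,5` and `6,7,8,9,10`. The assigned string is never empty, so the pattern is always true and the default action prints each record.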
Examples
Print sum and average
awk -F: '{sum += $3}
END { print sum, sum/NR }
' /etc/passwd
Printing parameters
awk 'BEGIN {
for (i = 1; i < ARGC; i++)
print ARGV[i] }' a b c
Output field separator as a comma
awk 'BEGIN { FS=":";OFS=","}
{print $1,$2,$3,$4}' /etc/passwd
Position of match
awk 'BEGIN {
if (match("One Two Three", "Tw"))
print RSTART }'
Length of match
awk 'BEGIN {
if (match("One Two Three", "re"))
print RLENGTH }'
Environment Variables
Variable | Meaning |
---|---|
ARGC | Number of arguments |
ARGV | Array of arguments |
FNR | Record number within the current File |
OFMT | Output format for numbers (default "%.6g") |
CONVFMT | Number-to-string conversion format (default "%.6g") |
RSTART | Start position of the last match() |
RLENGTH | Length of the last match() |
SUBSEP | Multi-dimensional array subscript separator (default "\034") |
ENVIRON | Array of environment variables |
GNU awk only
Variable | Meaning |
---|---|
ARGIND | Index in ARGV of the file being processed |
IGNORECASE | Ignore case in matching when non-zero |
ERRNO | System error message |
FIELDWIDTHS | Fixed width fields |
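The difference between NR and FNR only shows up with more than one input file. A minimal sketch, using two throwaway files whose names and contents are invented for the demonstration:

```shell
# Create two small temporary files (arbitrary names and contents)
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/one.txt"
printf 'c\n'    > "$tmp/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file
awk '{ print FILENAME ": NR=" NR " FNR=" FNR }' "$tmp/one.txt" "$tmp/two.txt"

rm -r "$tmp"
```

For the single line of the second file, NR should be 3 while FNR is back to 1.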
Defining variable
awk -v var1="Hello" -v var2="World" '
END {print var1, var2}
' </dev/null
Use shell variables
awk -v varName="$PWD" '
END {print varName}' </dev/null
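One caveat with `-v` is that awk interprets backslash escape sequences in the assigned value. When a shell value may contain backslashes, reading it through ENVIRON is a possible alternative. The variable name `winpath` and its value below are made up for this sketch.

```shell
# Sample value containing a backslash followed by "n"
winpath='C:\new'
export winpath

# -v processes escapes: "\n" becomes a newline
awk -v v="$winpath" 'BEGIN { print v }'

# ENVIRON delivers the value unchanged
awk 'BEGIN { print ENVIRON["winpath"] }'
```

The first command should split the value across two lines (`C:` and `ew`), while the second prints `C:\new` verbatim.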
AWK Operators
Operators
Expression | Meaning |
---|---|
{print $1} | First field |
$2 == "foo" | Equals |
$2 != "foo" | Not equals |
"foo" in array | In array |
Regular expression
Pattern | Meaning |
---|---|
/regex/ | Line matches |
!/regex/ | Line does not match |
$1 ~ /regex/ | Field matches |
$1 !~ /regex/ | Field does not match |
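The right-hand side of `~` does not have to be a regex literal; a string holding a pattern also works, which allows the pattern to come from the shell. A small sketch with invented input, where `pat` is a hypothetical variable name:

```shell
# Made-up "user:uid" records; the pattern is passed in as a string via -v
printf 'root:0\nalice:1000\nbob:1001\n' |
awk -F: -v pat='^(root|bob)$' '$1 ~ pat { print $1 }'
```

This should print `root` and `bob`. Because `-v` interprets backslash escapes, patterns that contain backslashes need extra care when passed this way.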
More conditions
- | - |
---|---|
($2 <= 4 || $3 < 20) |
Or |
($1 == 4 && $3 < 20) |
And |
Operations
Arithmetic operations
+
-
*
/
%
++
--
Shorthand assignments
+=
-=
*=
/=
%=
Comparison operators
==
!=
<
>
<=
>=
Examples
Match
awk 'BEGIN {
if ("foo" ~ "^fo+$")
print "Fooey!";
}'
Not match
awk 'BEGIN {
if ("boo" !~ "^fo+$")
print "Boo!";
}'
if in array
awk 'BEGIN {
assoc["foo"] = "bar";
assoc["bar"] = "baz";
if ("foo" in assoc)
print "Fooey!";
}'
Special Characters and Field Separators
Special Characters
Handling special characters in AWK involves escaping them within strings and regular expressions to ensure they are interpreted correctly.
# Escaping special characters in regular expressions
awk '/\$100/ {print $0}' file.txt # Finds lines containing $100
awk '/path\/to\/file/ {print $0}' file.txt # Finds lines containing path/to/file
Field Separators
AWK allows single characters, multi-character strings, or regular expressions as field separators, which can be specified using the FS variable or the -F option.
# Using multiple characters as field separators
awk -F",|:" '{print $1, $2}' data.txt # Fields delimited by comma or colon
awk 'BEGIN{ FS="[: ]+" } {print $1, $2}' data.txt # Fields delimited by colon or one/more spaces
- | - |
---|---|
-F",|:" |
Use comma or colon as field separator |
FS="[: ]+" |
Regex for colon or one/more spaces as separator |
'/\$100/' |
Escaping the dollar sign in regex |
'/path\/to\/file/' |
Escaping slashes in a file path |
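When the text to find is a fixed string rather than a pattern, one way to avoid escaping altogether is `index()`, which performs a plain substring search. The input below is invented for the sketch.

```shell
# index() returns the position of the substring (0 if absent), so no regex
# metacharacters need escaping; "$" is an ordinary character in an awk string
printf 'price $100\nprice 100\n' |
awk 'index($0, "$100") { print NR ": " $0 }'
```

Only the first line contains the literal text `$100`, so the expected output is `1: price $100`.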
Text Processing Examples
Log File Analysis
Analyze log files to extract specific information, for example, counting error messages.
# Count 'error' messages in a log file
awk '/error/ {count++} END {print count}' server.log
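A related sketch counts messages per severity with an associative array. The log layout (date, level, message) and the sample lines are assumptions made for illustration, not a real log format.

```shell
# Invented log lines: field 2 is taken to be the severity level
printf '%s\n' \
  '2024-01-01 INFO start' \
  '2024-01-01 ERROR disk full' \
  '2024-01-02 WARN slow' \
  '2024-01-02 ERROR timeout' |
awk '{ count[$2]++ }                                  # tally each level
     END { for (level in count) print level, count[level] }' |
sort    # for-in order is unspecified, so sort for stable output
```

With this input the result should be `ERROR 2`, `INFO 1`, `WARN 1`.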
CSV Manipulation
Manipulate CSV files, such as filtering data and changing the format of the output.
# Filter and print specific columns from a CSV
awk -F, '{print $1, $4}' data.csv
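To actually filter rows, a condition can be placed in front of the action. The sketch below skips a header line and keeps rows whose fourth column is at least 60; the column layout and values are invented. Note that splitting on a bare comma does not handle quoted fields that contain commas.

```shell
# Made-up CSV: name,dept,city,salary
printf 'name,dept,city,salary\nann,ops,Oslo,50\nbob,dev,Rome,70\ncy,dev,Lima,65\n' |
awk -F, -v OFS=, 'NR > 1 && $4 >= 60 { print $1, $4 }'   # skip header, filter, keep CSV output
```

This should print `bob,70` and `cy,65`.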
Multiline Record Processing
Handle records that span multiple lines, typically when records are separated by a blank line.
# Set record separator to an empty line and field separator to newline
awk 'BEGIN {RS=""; FS="\n"} {print $1, $2}' multi-line-records.txt
Element | Meaning |
---|---|
/error/ | Pattern to find 'error' messages in log files |
-F, | Field separator set to comma for CSV files |
{count++} | Action to count occurrences |
END {print count} | Action performed at the end of the input |
'{print $1, $4}' | Print the first and fourth fields from CSV |
RS="" | Record separator set to an empty line |
FS="\n" | Field separator set to newline |
'{print $1, $2}' | Print the first and second fields of each record |
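The multiline setup can be tried without a data file by piping in a couple of blank-line-separated records. The names and cities here are sample data only.

```shell
# Two records separated by an empty line; each line of a record becomes one field
printf 'Ann\nOslo\n\nBob\nRome\n' |
awk 'BEGIN { RS = ""; FS = "\n"; OFS = " | " }
     { print NR, $1, $2 }'
```

The expected output is `1 | Ann | Oslo` and `2 | Bob | Rome`, showing that NR now counts paragraphs rather than lines.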
AWK Functions
Common functions
Function | Description |
---|---|
index(s,t) | Position in string s where string t occurs, 0 if not found |
length(s) | Length of string s (or $0 if no arg) |
rand() | Random number between 0 and 1 |
substr(s,index,len) | Return len-char substring of s that begins at index (counted from 1) |
srand(x) | Set seed for rand() to x (time of day if omitted) and return previous seed |
int(x) | Truncate x to integer value |
split(s,a,fs) | Split string s into array a on separator fs (FS if omitted), returning the number of elements |
match(s,r) | Position in string s where regex r occurs, or 0 if not found; sets RSTART and RLENGTH |
sub(r,t,s) | Substitute t for first occurrence of regex r in string s (or $0 if s not given) |
gsub(r,t,s) | Substitute t for all occurrences of regex r in string s (or $0 if s not given) |
system(cmd) | Execute cmd and return exit status |
tolower(s) | String s to lowercase |
toupper(s) | String s to uppercase |
getline | Set $0 to next input record from current input file |
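A short sketch exercising a few of the string functions on literal sample strings. Note that `sub` and `gsub` modify their third argument in place and return the number of substitutions made.

```shell
awk 'BEGIN {
    s = "foo bar foo"
    print index(s, "bar")                 # position of "bar" in s

    n = split("a:b:c", parts, ":")        # n is the number of pieces
    print n, parts[3]

    t = s; sub(/foo/, "X", t); print t    # first match only

    t = s; n = gsub(/foo/, "X", t)        # all matches; n counts them
    print n, t

    if (match(s, /ba+r/))                 # sets RSTART and RLENGTH
        print RSTART, RLENGTH
}'
```

The five lines printed should be `5`, `3 c`, `X bar foo`, `2 X bar X`, and `5 3`.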
User defined function
awk '
# Returns minimum number
function find_min(num1, num2){
    if (num1 < num2)
        return num1
    return num2
}
# Returns maximum number
function find_max(num1, num2){
    if (num1 > num2)
        return num1
    return num2
}
# Main function
function main(num1, num2){
    result = find_min(num1, num2)
    print "Minimum =", result
    result = find_max(num1, num2)
    print "Maximum =", result
}
# Script execution starts here
BEGIN {
    main(10, 60)
}
'
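Variables assigned inside a function are global unless they appear in the parameter list. A common convention, sketched here with a hypothetical `sum_to` function, is to declare locals as extra parameters set off by additional spaces.

```shell
awk '
# n is the real parameter; i and total are locals by convention
function sum_to(n,    i, total) {
    for (i = 1; i <= n; i++)
        total += i
    return total
}
BEGIN {
    i = 99                 # global i is not touched by the function
    print sum_to(4), i
}'
```

This should print `10 99`: the function's `i` is separate from the global one.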
Error Handling
Checking for Missing Fields
Check and handle missing or empty fields to prevent script errors or incorrect outputs.
# Check if a specific field is missing and handle it
awk -F, '{if (NF < 5) print "Missing fields"; else print $5}' data.csv
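A field can also be present but empty, which `NF` alone does not catch. The sketch below distinguishes the two cases on invented input.

```shell
# Line 2 has too few fields; line 3 has five fields but the last one is empty
printf 'a,b,c,d,e\n1,2,3\n1,2,3,4,\n' |
awk -F, 'NF < 5   { print "line " NR ": missing fields"; next }
         $5 == "" { print "line " NR ": empty field 5";  next }
                  { print $5 }'
```

The expected output is `e`, then `line 2: missing fields`, then `line 3: empty field 5`.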
Validating Data Formats
Ensure that data matches expected formats, such as date formats or numerical ranges.
# Validate date format (YYYY-MM-DD)
awk '{if ($1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/) print $1 " is valid"; else print $1 " is invalid"}' dates.txt
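A format check can be combined with a numerical range check. This sketch splits the date on `-` and tests month and day bounds; it is deliberately simple and does not know month lengths, so a date such as February 30 still passes. The sample dates are invented.

```shell
printf '2024-02-30\n2024-13-01\n2024-06-15\nnot-a-date\n' |
awk -F- '{
    # shape check first, then month 1..12 and day 1..31
    ok = ($0 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/) &&
         $2 >= 1 && $2 <= 12 && $3 >= 1 && $3 <= 31
    print $0, (ok ? "ok" : "bad")
}'
```

The explicit `[0-9]` repetitions are used instead of `{4}` intervals so the pattern also works in awk versions without interval support.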
Error Logging
Log errors or unusual conditions to a separate file for debugging.
# Log errors to a file
awk 'BEGIN {OFS = FS = ","} $3 < 0 {print $0 >> "error_log.csv"}' sales.csv
Element | Meaning |
---|---|
if (NF < 5) | Check if the number of fields is less than 5 |
print "Missing fields" | Action to alert about missing fields |
/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ | Regex pattern for date format validation |
$3 < 0 | Condition to check for negative values in the third field |
print $0 >> "error_log.csv" | Redirect output to an error log file |
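Instead of a fixed log file, diagnostics can be sent to standard error so the caller decides where they go. This sketch assumes `/dev/stderr` is usable as an output file name, which holds for gawk and on most Linux systems; the sample sales rows are invented.

```shell
# Bad rows are reported on stderr; good rows pass through on stdout
printf 'east,widget,5\nwest,gadget,-2\n' |
awk -F, '$3 < 0 { print "line " NR ": negative quantity: " $0 > "/dev/stderr"; next }
         { print }'
```

Appending `2>errors.log` to the command would capture only the diagnostics, leaving the clean rows on standard output.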
Awk Arrays
Array with index
awk 'BEGIN {
arr[0] = "foo";
arr[1] = "bar";
print(arr[0]); # => foo
delete arr[0];
print(arr[0]); # => ""
}'
Array with key
awk 'BEGIN {
assoc["foo"] = "bar";
assoc["bar"] = "baz";
print("baz" in assoc); # => 0
print("foo" in assoc); # => 1
}'
Array with split
awk 'BEGIN {
    n = split("foo:bar:baz", arr, ":");
    # for (key in arr) visits keys in unspecified order, so loop by index
    for (i = 1; i <= n; i++)
        print arr[i];
}'
Array with asort (GNU awk only)
awk 'BEGIN {
    arr[0] = 3
    arr[1] = 2
    arr[2] = 4
    n = asort(arr)    # sorts values; arr is re-indexed from 1 to n
    for (i = 1; i <= n; i++)
        print(arr[i])
}'
Multi-dimensional
awk 'BEGIN {
multidim[0,0] = "foo";
multidim[0,1] = "bar";
multidim[1,0] = "baz";
multidim[1,1] = "boo";
}'
Multi-dimensional iteration
awk 'BEGIN {
array[1,2]=3;
array[2,3]=5;
for (comb in array) {
split(comb,sep,SUBSEP);
print sep[1], sep[2],
array[sep[1],sep[2]]
}
}'
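Membership in a multi-dimensional array can be tested with a parenthesized subscript list. A minimal sketch, where `grid` is just an illustrative array name:

```shell
awk 'BEGIN {
    grid[1,2] = 3
    if ((1,2) in grid)    print "has 1,2"
    if (!((2,1) in grid)) print "no 2,1"

    # the "in" tests above did not create any elements
    n = 0
    for (k in grid) n++
    print n
}'
```

This should print `has 1,2`, `no 2,1`, and `1`. Unlike referencing `grid[2,1]` directly, the `in` test leaves the array unchanged.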
Awk Conditions
if-else statement
awk -v count=2 'BEGIN {
if (count == 1)
print "Yes";
else
print "Huh?";
}'
Ternary operator
awk -v count=2 'BEGIN {
print (count==1) ? "Yes" : "Huh?";
}'
Exists
awk 'BEGIN {
assoc["foo"] = "bar";
assoc["bar"] = "baz";
if ("foo" in assoc)
print "Fooey!";
}'
Not exists
awk 'BEGIN {
    assoc["foo"] = "bar";
    assoc["bar"] = "baz";
    if (!("Huh" in assoc))
        print "Huh!";
}'
switch (GNU awk only)
awk -F: '{
    switch (NR * 2 + 1) {
    case 3:
    case "11":
        print NR - 1
        break
    case /2[[:digit:]]+/:
        print NR
        # no break: falls through to default
    default:
        print NR + 1
        # no break: falls through to case -1
    case -1:
        print NR * -1
    }
}' /etc/passwd
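Where `switch` is not available, an if / else if chain is a portable way to express the same kind of dispatch. This sketch uses invented command words as input.

```shell
printf 'start\nstop\nstatus\n' |
awk '{
    if ($1 == "start" || $1 == "stop")  print $1 ": state change"
    else if ($1 ~ /^stat/)              print $1 ": query"
    else                                print $1 ": unknown"
}'
```

Each branch ends the chain, so there is no fall-through to guard against. The expected output is `start: state change`, `stop: state change`, and `status: query`.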
Awk Loops
for...i
awk 'BEGIN {
for (i = 0; i < 10; i++)
print "i=" i;
}'
Powers of two between 1 and 100
awk 'BEGIN {
for (i = 1; i <= 100; i *= 2)
print i
}'
for...in
awk 'BEGIN {
assoc["key1"] = "val1"
assoc["key2"] = "val2"
for (key in assoc)
print assoc[key];
}'
Arguments
awk 'BEGIN {
for (argnum in ARGV)
print ARGV[argnum];
}' a b c
Examples
Reverse records
awk -F: '{ x[NR] = $0 }
END {
for (i = NR; i > 0; i--)
print x[i]
}
' /etc/passwd
Reverse fields
awk -F: '{
for (i = NF; i > 0; i--)
printf("%s ",$i);
print ""
}' /etc/passwd
Sum by record
awk -F: '{
s=0;
for (i = 1; i <= NF; i++)
s += $i;
print s
}' /etc/passwd
Sum whole file
awk -F: '
{for (i = 1; i <= NF; i++)
s += $i;
};
END{print s}
' /etc/passwd
while
awk 'BEGIN {
    a = 0    # without this, the first line would print an empty string for a
    while (a < 10) {
        print "- " " concatenation: " a
        a++;
    }
}'
do...while
awk '{
i = 1
do {
print $0
i++
} while (i <= 5)
}' /etc/passwd
Break
awk 'BEGIN {
break_num = 5
for (i = 0; i < 10; i++) {
print i
if (i == break_num)
break
}
}'
Continue
awk 'BEGIN {
for (x = 0; x <= 10; x++) {
if (x == 5 || x == 6)
continue
printf "%d ", x
}
print ""
}'
AWK Formatted Printing
Usage
Right align
awk 'BEGIN{printf "|%10s|\n", "hello"}'
|     hello|
Left align
awk 'BEGIN{printf "|%-10s|\n", "hello"}'
|hello     |
Common specifiers
Character | Description |
---|---|
c | ASCII character |
d | Decimal integer |
e, E, f | Floating-point format |
o | Unsigned octal value |
s | String |
% | Literal % |
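Several of these specifiers can be combined in one format string. The values below are arbitrary and chosen only to make the conversions visible.

```shell
awk 'BEGIN {
    # %d truncates, %5.2f pads to width 5 with 2 decimals,
    # %o prints octal, %c turns a number into the character with that code
    printf "%d|%5.2f|%o|%c|%s|%%\n", 42.9, 3.14159, 8, 65, "str"
}'
```

The output should be `42| 3.14|10|A|str|%`.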
Space
awk -F: '{
printf "%-10s %s\n", $1, $(NF-1)
}' /etc/passwd | head -n 3
Outputs
root       /root
bin        /bin
daemon     /sbin
Header
awk -F: 'BEGIN {
printf "%-10s %s\n", "User", "Home"
printf "%-10s %s\n", "----","----"}
{ printf "%-10s %s\n", $1, $(NF-1) }
' /etc/passwd | head -n 5
Outputs
User       Home
----       ----
root       /root
bin        /bin
daemon     /sbin
Miscellaneous
Regex Metacharacters
\
^
$
.
[
]
|
(
)
*
+
?
Escape Sequences
Sequence | Meaning |
---|---|
\b | Backspace |
\f | Form feed |
\n | Newline (line feed) |
\r | Carriage return |
\t | Horizontal tab |
\v | Vertical tab |
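Inside an awk string literal these sequences are written as shown, while a literal backslash or double quote needs a backslash of its own. A brief sketch with arbitrary strings:

```shell
awk 'BEGIN {
    print "name\tvalue"              # \t becomes a tab character
    print "a\\b", "\"quoted\""       # \\ is one backslash, \" is a double quote
}'
```

The second line of output should read `a\b "quoted"`.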
Run script
$ cat demo.awk
#!/usr/bin/awk -f
BEGIN { x = 23 }
{ x += 2 }
END { print x }
$ awk -f demo.awk /etc/passwd
69
Useful Links
- GNU Awk User's Guide - Official documentation and comprehensive user guide for GNU Awk.
- The AWK Programming Language - A digital copy of the book by Aho, Kernighan, and Weinberger, which is an authoritative resource on AWK.
- Awk - A Tutorial and Introduction - TutorialsPoint guide offering a quick and practical introduction to AWK.
- AWK Cheatsheet - Rafe's AWK Cheatsheet