AWK cheat sheet

Search in this cheat sheet:

This comprehensive AWK Cheat Sheet is a practical guide designed for both new and experienced users, providing essential commands, functions, and examples in an easy-to-follow format. It covers everything from basic syntax and built-in variables to more advanced topics such as error handling, text processing, and array manipulation. The cheat sheet also includes tips for using AWK in shell scripts and complex data parsing tasks. Useful links at the end provide additional resources for deepening your knowledge and skills in AWK programming. Whether you're performing simple text manipulation or complex data analysis, this cheat sheet is an invaluable resource for making the most of AWK's powerful features.

The Basics

Have a try

$ awk -F: '{print $1, $NF}' /etc/passwd

-	-	-
	`-F:`	Colon as a separator
	`{...}`	Awk program
	`print`	Prints the current record
	`$1`	First field
	`$NF`	Last field
	`/etc/passwd`	Input data file

Awk program

BEGIN          {<initializations>} 
   <pattern 1> {<program actions>} 
   <pattern 2> {<program actions>} 
   ...
END            {< final actions >}

Example

awk '
    BEGIN { print "\n>>>Start" }
    !/(login|shutdown)/ { print NR, $0 }
    END { print "<<<END\n" }
' /etc/passwd

Variables

          $1      $2/$(NF-1)    $3/$NF
           ▼          ▼           ▼ 
        ┌──────┬──────────────────┬───────┐
$0/NR ▶ │  ID  │  WEBSITE         │  URI  │
        ├──────┼──────────────────┼───────┤
$0/NR ▶ │  1   │  cheatsheets.zip │  awk  │
        ├──────┼──────────────────┼───────┤
$0/NR ▶ │  2   │  google.com      │  25   │
        └──────┴──────────────────┴───────┘

# First and last field
awk -F: '{print $1,$NF}' /etc/passwd

# With line number
awk -F: '{print NR, $0}' /etc/passwd

# Second last field
awk -F: '{print $(NF-1)}' /etc/passwd

# Custom string 
awk -F: '{print $1 "=" $6}' /etc/passwd

See: Variables

Awk program examples

awk 'BEGIN {print "hello world"}'        # Prints "hello world"
awk -F: '{print $1}' /etc/passwd         # -F: Specify field separator

# /pattern/ Execute actions only for matched pattern
awk -F: '/root/ {print $1}' /etc/passwd                     

# BEGIN block is executed once at the start
awk -F: 'BEGIN { print "uid"} { print $1 }' /etc/passwd     

# END block is executed once at the end
awk -F: '{print $1} END { print "-done-"}' /etc/passwd

Conditions

awk -F: '$3>30 {print $1}' /etc/passwd

See: Conditions

Generate 100 spaces

awk 'BEGIN{
    while (a++ < 100)
        s=s " ";
    print s
}'

See: Loops

Arrays

awk 'BEGIN {
   fruits["banana"] = "yellow";
   fruits["raspberry"] = "red"
   for(fruit in fruits) {
     print "The color of " fruit " is " fruits[fruit]
   }
}'

See: Arrays

Functions

# => 5
awk 'BEGIN{print length("hello")}'
# => HELLO
awk 'BEGIN{print toupper("hello")}'
# => hel
awk 'BEGIN{print substr("hello", 1, 3)}'

See: Functions

Awk Variables

Build-in variables

-	-
`$0`	Whole line
`$1, $2...$NF`	First, second… last field
`NR`	`N`umber of `R`ecords
`NF`	`N`umber of `F`ields
`OFS`	`O`utput `F`ield `S`eparator <br> (default " ")
`FS`	input `F`ield `S`eparator <br> (default " ")
`ORS`	`O`utput `R`ecord `S`eparator <br> (default "\n")
`RS`	input `R`ecord `S`eparator <br> (default "\n")
`FILENAME`	Name of the file

Expressions

-	-
`$1 == "root"`	First field equals root
`{print $(NF-1)}`	Second last field
`NR!=1{print $0}`	From 2nd record
`NR > 3`	From 4th record
`NR == 1`	First record
`END{print NR}`	Total records
`BEGIN{print OFMT}`	Output format
`{print NR, $0}`	Line number
`{print NR " " $0}`	Line number (tab)
`{$1 = NR; print}`	Replace 1st field with line number
`$NF > 4`	Last field > 4
`NR % 2 == 0`	Even records
`NR==10, NR==20`	Records 10 to 20
`BEGIN{print ARGC}`	Total arguments
`ORS=NR%5?",":"\n"`	Concatenate records

Examples

Print sum and average

awk -F: '{sum += $3}
     END { print sum, sum/NR }
' /etc/passwd

Printing parameters

awk 'BEGIN {
    for (i = 1; i < ARGC; i++)
        print ARGV[i] }' a b c

Output field separator as a comma

awk 'BEGIN { FS=":";OFS=","}
    {print $1,$2,$3,$4}' /etc/passwd

Position of match

awk 'BEGIN {
    if (match("One Two Three", "Tw"))
        print RSTART }'

Length of match

awk 'BEGIN {
    if (match("One Two Three", "re"))
        print RLENGTH }'

Environment Variables

-	-
`ARGC`	Number or arguments
`ARGV`	Array of arguments
`FNR`	`F`ile `N`umber of `R`ecords
`OFMT`	Format for numbers <br> (default "%.6g")
`RSTART`	Location in the string
`RLENGTH`	Length of match
`SUBSEP`	Multi-dimensional array separator <br> (default "\034")
`ARGIND`	Argument Index

GNU awk only

-	-
`ENVIRON`	Environment variables
`IGNORECASE`	Ignore case
`CONVFMT`	Conversion format
`ERRNO`	System errors
`FIELDWIDTHS`	Fixed width fields

Defining variable

awk -v var1="Hello" -v var2="Wold" '
    END {print var1, var2}
' </dev/null

Use shell variables

awk -v varName="$PWD" '
    END {print varName}' </dev/null

AWK Operators

Operators

-	-
`{print $1}`	First field
`$2 == "foo"`	Equals
`$2 != "foo"`	Not equals
`"foo" in array`	In array

Regular expression

-	-
`/regex/`	Line matches
`!/regex/`	Line not matches
`$1 ~ /regex/`	Field matches
`$1 !~ /regex/`	Field not matches

More conditions

-	-
`($2 <= 4 \|\| $3 < 20)`	Or
`($1 == 4 && $3 < 20)`	And

Operations

Arithmetic operations

+
-
*
/
%
++
--

Shorthand assignments

+=
-=
*=
/=
%=

Comparison operators

==
!=
<
>
<=
>=

Examples

awk 'BEGIN {
    if ("foo" ~ "^fo+$")
        print "Fooey!";
}'

Not match

awk 'BEGIN {
    if ("boo" !~ "^fo+$")
        print "Boo!";
}'

if in array

awk 'BEGIN {
    assoc["foo"] = "bar";
    assoc["bar"] = "baz";
    if ("foo" in assoc)
        print "Fooey!";
}'

Special Characters and Field Separators

Special Characters

Handling special characters in AWK involves escaping them within strings and regular expressions to ensure they are interpreted correctly.

# Escaping special characters in regular expressions
awk '/\$100/ {print $0}' file.txt    # Finds lines containing $100
awk '/path\/to\/file/ {print $0}' file.txt  # Finds lines containing path/to/file

Field Separators

AWK allows the use of simple or multiple characters as field separators, which can be specified using the FS variable or -F option.

# Using multiple characters as field separators
awk -F",|:" '{print $1, $2}' data.txt    # Fields delimited by comma or colon
awk 'BEGIN{ FS="[: ]+" } {print $1, $2}' data.txt  # Fields delimited by colon or one/more spaces

-	-
`-F",\|:"`	Use comma or colon as field separator
`FS="[: ]+"`	Regex for colon or one/more spaces as separator
`'/\$100/'`	Escaping the dollar sign in regex
`'path\/to\/file'`	Escaping slashes in a file path

Text Processing Examples

Log File Analysis

Analyze log files to extract specific information, for example, counting error messages.

# Count 'error' messages in a log file
awk '/error/ {count++} END {print count}' server.log

CSV Manipulation

Manipulate CSV files, such as filtering data and changing the format of the output.

# Filter and print specific columns from a CSV
awk -F, '{print $1, $4}' data.csv

Multiline Record Processing

Handle records that span multiple lines, typically when records are separated by a blank line.

# Set record separator to an empty line and field separator to newline
awk 'BEGIN {RS=""; FS="\n"} {print $1, $2}' multi-line-records.txt

-	-
`/error/`	Pattern to find 'error' messages in log files
`-F,`	Field separator set to comma for CSV files
`{count++}`	Action to count occurrences
`END {print count}`	Action performed at the end of the input
`'{print $1, $4}'`	Print the first and fourth fields from CSV
`RS=""`	Record separator set to an empty line
`FS="\n"`	Field separator set to newline
`'{print $1, $2}'`	Print the first and second fields of each record

AWK Functions

Common functions

Function	Description
`index(s,t)`	Position in string s where string t occurs, 0 if not found
`length(s)`	Length of string s (or $0 if no arg)
`rand`	Random number between 0 and 1
`substr(s,index,len)`	Return len-char substring of s that begins at index (counted from 1)
`srand`	Set seed for rand and return previous seed
`int(x)`	Truncate x to integer value
`split(s,a,fs)`	Split string s into array a split by fs, returning length of a
`match(s,r)`	Position in string s where regex r occurs, or 0 if not found
`sub(r,t,s)`	Substitute t for first occurrence of regex r in string s (or $0 if s not given)
`gsub(r,t,s)`	Substitute t for all occurrences of regex r in string s
`system(cmd)`	Execute cmd and return exit status
`tolower(s)`	String s to lowercase
`toupper(s)`	String s to uppercase
`getline`	Set $0 to next input record from current input file.

User defined function

awk '
    # Returns minimum number
    function find_min(num1, num2){
       if (num1 < num2)
       return num1
       return num2
    }
    # Returns maximum number
    function find_max(num1, num2){
       if (num1 > num2)
       return num1
       return num2
    }
    # Main function
    function main(num1, num2){
       result = find_min(num1, num2)
       print "Minimum =", result

       result = find_max(num1, num2)
       print "Maximum =", result
    }
    # Script execution starts here
    BEGIN {
       main(10, 60)
    }
'

Error Handling

Checking for Missing Fields

Check and handle missing or empty fields to prevent script errors or incorrect outputs.

# Check if a specific field is missing and handle it
awk -F, '{if (NF < 5) print "Missing fields"; else print $5}' data.csv

Validating Data Formats

Ensure that data matches expected formats, such as date formats or numerical ranges.

# Validate date format (YYYY-MM-DD)
awk '{if ($1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/) print $1 " is valid"; else print $1 " is invalid"}' dates.txt

Error Logging

Log errors or unusual conditions to a separate file for debugging.

# Log errors to a file
awk 'BEGIN {OFS = FS = ","} $3 < 0 {print $0 >> "error_log.csv"}' sales.csv

-	-
`if (NF < 5)`	Check if the number of fields is less than 5
`print "Missing fields"`	Action to alert about missing fields
`/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/`	Regex pattern for date format validation
`$3 < 0`	Condition to check for negative values in the third field
`print $0 >> "error_log.csv"`	Redirect output to an error log file

Awk Arrays

Array with index

awk 'BEGIN {
    arr[0] = "foo";
    arr[1] = "bar";
    print(arr[0]); # => foo
    delete arr[0];
    print(arr[0]); # => ""
}'

Array with key

awk 'BEGIN {
    assoc["foo"] = "bar";
    assoc["bar"] = "baz";
    print("baz" in assoc); # => 0
    print("foo" in assoc); # => 1
}'

Array with split

awk 'BEGIN {
    split("foo:bar:baz", arr, ":");
    for (key in arr)
        print arr[key];
}'

Array with asort

awk 'BEGIN {
    arr[0] = 3
    arr[1] = 2
    arr[2] = 4
    n = asort(arr)
    for (i = 1; i <= n ; i++)
        print(arr[i])
}'

Multi-dimensional

awk 'BEGIN {
    multidim[0,0] = "foo";
    multidim[0,1] = "bar";
    multidim[1,0] = "baz";
    multidim[1,1] = "boo";
}'

Multi-dimensional iteration

awk 'BEGIN {
    array[1,2]=3;
    array[2,3]=5;
    for (comb in array) {
        split(comb,sep,SUBSEP);
        print sep[1], sep[2], 
        array[sep[1],sep[2]]
    }
}'

Awk Conditions

if-else statement

awk -v count=2 'BEGIN {
    if (count == 1)
        print "Yes";
    else
        print "Huh?";
}'

Ternary operator

awk -v count=2 'BEGIN {
    print (count==1) ? "Yes" : "Huh?";
}'

Exists

awk 'BEGIN {
    assoc["foo"] = "bar";
    assoc["bar"] = "baz";
    if ("foo" in assoc)
        print "Fooey!";
}'

Not exists

awk 'BEGIN {
    assoc["foo"] = "bar";
    assoc["bar"] = "baz";
    if ("Huh" in assoc == 0 )
        print "Huh!";
}'

switch

awk -F: '{
    switch (NR * 2 + 1) {
        case 3:
        case "11":
            print NR - 1
            break

        case /2[[:digit:]]+/:
            print NR

        default:
            print NR + 1

        case -1:
            print NR * -1
    }
}' /etc/passwd

Awk Loops

for...i

awk 'BEGIN {
    for (i = 0; i < 10; i++)
        print "i=" i;
}'

Powers of two between 1 and 100

awk 'BEGIN {
    for (i = 1; i <= 100; i *= 2)
        print i
}'

for...in

awk 'BEGIN {
    assoc["key1"] = "val1"
    assoc["key2"] = "val2"
    for (key in assoc)
        print assoc[key];
}'

Arguments

awk 'BEGIN {
    for (argnum in ARGV)
        print ARGV[argnum];
}' a b c

Examples

Reverse records

awk -F: '{ x[NR] = $0 }
    END {
        for (i = NR; i > 0; i--)
        print x[i]
    }
' /etc/passwd

Reverse fields

awk -F: '{
    for (i = NF; i > 0; i--)
        printf("%s ",$i);
    print ""
}' /etc/passwd

Sum by record

awk -F: '{
    s=0;
    for (i = 1; i <= NF; i++)
        s += $i;
    print s
}' /etc/passwd

Sum whole file

awk -F: '
    {for (i = 1; i <= NF; i++)
        s += $i;
    };
    END{print s}
' /etc/passwd

while

awk 'BEGIN {
    while (a < 10) {
        print "- " " concatenation: " a
        a++;
    }
}'

do...while

awk '{
    i = 1
    do {
        print $0
        i++
    } while (i <= 5)
}' /etc/passwd

Break

awk 'BEGIN {
    break_num = 5
    for (i = 0; i < 10; i++) {
        print i
        if (i == break_num)
            break
    }
}'

Continue

awk 'BEGIN {
    for (x = 0; x <= 10; x++) {
        if (x == 5 || x == 6)
            continue
        printf "%d ", x
    }
    print ""
}'

AWK Formatted Printing

Usage

Right align

awk 'BEGIN{printf "|%10s|\n", "hello"}'

|     hello|

Left align

awk 'BEGIN{printf "|%-10s|\n", "hello"}'

|hello     |

Common specifiers

Character	Description
`c`	ASCII character
`d`	Decimal integer
`e`, `E`, `f`	Floating-point format
`o`	Unsigned octal value
`s`	String
`%`	Literal %

Space

awk -F: '{
    printf "%-10s %s\n", $1, $(NF-1)
}' /etc/passwd | head -n 3

Outputs

root       /root
bin        /bin
daemon     /sbin

awk -F: 'BEGIN {
    printf "%-10s %s\n", "User", "Home"
    printf "%-10s %s\n", "----","----"}
    { printf "%-10s %s\n", $1, $(NF-1) }
' /etc/passwd | head -n 5

Outputs

User       Home
----       ----
root       /root
bin        /bin
daemon     /sbin

Miscellaneous

Regex Metacharacters

Escape Sequences

-	-
`\b`	Backspace
`\f`	Form feed
`\n`	Newline (line feed)
`\r`	Carriage return
`\t`	Horizontal tab
`\v`	Vertical tab

Run script

$ cat demo.awk
#!/usr/bin/awk -f
BEGIN { x = 23 }
      { x += 2 }
END   { print x }
$ awk -f demo.awk /etc/passwd
69

Useful Links

GNU Awk User's Guide - Official documentation and comprehensive user guide for GNU Awk.
The AWK Programming Language - A digital copy of the book by Aho, Kernighan, and Weinberger, which is an authoritative resource on AWK.
Awk - A Tutorial and Introduction - TutorialsPoint guide offering a quick and practical introduction to AWK.
AWK Cheatsheet - Rafe's AWK Cheatsheet

Tech

The Basics

Have a try

Awk program

Example

Variables

Awk program examples

Conditions

Generate 100 spaces

Arrays

Functions

Awk Variables

Build-in variables

Expressions

Examples

Environment Variables

GNU awk only

Defining variable

Use shell variables

AWK Operators

Operators

Regular expression

More conditions

Operations

Arithmetic operations

Shorthand assignments

Comparison operators

Examples

Not match

if in array

Special Characters and Field Separators

Special Characters

Field Separators

Text Processing Examples

Log File Analysis

CSV Manipulation

Multiline Record Processing

AWK Functions

Common functions

User defined function

Error Handling

Checking for Missing Fields

Validating Data Formats

Error Logging

Awk Arrays

Array with index

Array with key

Array with split

Array with asort

Multi-dimensional

Multi-dimensional iteration

Awk Conditions

if-else statement

Ternary operator

Exists

Not exists

switch

Awk Loops

for...i

Powers of two between 1 and 100

for...in

Arguments

Examples

Reverse records

Reverse fields

Sum by record

Sum whole file

while

do...while

Break

Continue

AWK Formatted Printing

Usage

Right align

Left align

Common specifiers

Space

Header

Miscellaneous