Writing user-friendly bash scripts

Bash scripts are an easy way to automate functionality or turn a complex sequence of tasks into a single command. Command-line users have come to expect a number of conveniences from scripts running in the terminal, and script authors occasionally overlook them.

Ensuring dependencies

Almost all bash scripts require external tools to run properly, and many simply assume the commands are available. When a needed command is not present, the script runs up to the point where the program is called and fails there, leaving the process in a broken or unknown state. To prevent this, bash scripts should use the command builtin (command -v) to check that a needed program is available before doing any work:

#!/bin/bash

if ! command -v "awk" &> /dev/null; then
   echo "Error: 'awk' is not installed or not in the PATH." >&2
   exit 1
fi

This can also be written in a more condensed form:

command -v "awk" &> /dev/null || { echo "Error: 'awk' is not installed or not in the PATH." >&2; exit 1; }

Since repeating the line for every required program is tedious, you can turn it into a function:

require(){
  command -v "$1" &> /dev/null || { echo "Error: '$1' is not installed or not in the PATH." >&2; exit 1; }
}

require "awk"
require "cal"
require "grep"

Keeping this at the top of the file not only aids readability (as it looks similar to import or require statements from other programming languages), but also ensures users are not caught off-guard by missing programs in the middle of running the script.
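
The require function stops at the first missing program. As a variation, a script can check everything up front and report all missing programs in one go. The following is a minimal sketch of that idea; the function name require_all is made up for this example, and it returns an error status instead of exiting so the caller decides what to do:

```shell
#!/bin/bash

# Hypothetical helper (not part of bash): checks every given command and
# lists all missing ones instead of stopping at the first.
require_all() {
  local cmd missing=()
  for cmd in "$@"; do
    command -v "$cmd" &> /dev/null || missing+=("$cmd")
  done
  if (( ${#missing[@]} > 0 )); then
    echo "Error: not installed or not in the PATH: ${missing[*]}" >&2
    return 1
  fi
}
```

A script would call it once near the top, for example: require_all awk cal grep || exit 1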

Handling flags

Bash scripts often need data or provide options that can be set through flags from the terminal. For programs offering flags, Linux users expect some conventions:

There are short flags like -x and long variants like --expand. All options should have a long variant; short flags serve as shortcuts for commonly used options.
Flags may or may not take values. Flags without values, like -v, serve as boolean switches (present = option on, absent = off). Flag values can be passed either as the next argument, separated by a space, like --name bob, or as a single argument with an = sign, like --name=bob. Short flags may omit the whitespace, like -n4 instead of -n 4.
Short flags can be combined, for example -xeu instead of -x -e -u.
Positional arguments are allowed alongside flags, which may cause confusion. Given -n bob -l, it may not be clear whether bob is the value of the flag -n or a positional argument. For clarity, the Linux community has adopted the approach of separating flags from positional arguments with --, for example -n -l -- bob, making it much clearer which belong together.

Programs may of course choose to honor all, some or none of these conventions, but most Linux software honors them, so users will expect them in many cases.

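Most of these conventions can be adopted cheaply by using the getopt program (not bash's builtin getopts command!) to parse arguments. The examples here rely on the enhanced getopt from util-linux, which ships with most Linux distributions; the getopt found on some other systems, such as macOS, does not support long flags.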

ARGS=$(getopt -n "$0" -o n:x -l name:,expand -- "$@")
if [[ $? -ne 0 ]]; then
   echo "Error: Invalid options." >&2
   exit 1
fi
eval set -- "$ARGS"

The syntax looks complex at first, but is quite simple: -n sets the program name used in getopt's error messages to the current script's name, -o is followed by all short flags written back to back as a single string, and -l takes all long flag names, separated by commas.

Flag names can end with no colon (e.g. "v", meaning flag -v takes no value), one colon (e.g. "n:", meaning flag -n requires a value) or two colons (e.g. "n::", meaning flag -n takes an optional value). Avoid the last form if you can, as it has special requirements: the value must be attached directly to the flag, as in --name=bob or -nbob, and cannot be passed as a separate argument.

Since getopt takes care of argument parsing and validation, the if condition exits with an error when unknown flags are provided or a required value is missing. This helps users catch typos, because getopt prints an error for every unknown flag. It can be frustrating to run a command that looks correct, only to find out that one flag was quietly ignored because of a spelling error. Pointing out invalid options to users prevents these issues.
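
To see what getopt actually does with the conventions listed earlier, the following sketch prints the normalized arguments one per line. It assumes the util-linux version of getopt; the function name show_normalized and the sample arguments are made up for illustration:

```shell
#!/bin/bash

# Hypothetical helper: runs the given arguments through getopt and prints
# the normalized result, one argument per line.
show_normalized() {
  local args
  args=$(getopt -n "demo" -o n:x -l name:,expand -- "$@") || return 1
  eval set -- "$args"
  printf '%s\n' "$@"
}

# Combined short flags, an attached short value and the = syntax in one call
show_normalized -xn4 --name=bob report.txt
```

Each flag and each value ends up as a separate argument, followed by -- and then the positional arguments, which is what makes the parsing loop so simple.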

With the code above parsing the command-line arguments, you can then loop through the newly set script arguments:

while true; do
   case "$1" in
       -n|--name)
           name="$2"
           shift 2
           ;;
       -x|--expand)
           export opt_expand=1
           shift
           ;;
       --)
           shift
           break
           ;;
   esac
done

The first two cases simply handle the flags we defined above, with -n/--name shifting the arguments by two places because it consumes both the flag and the next argument as its value.

The third case, --, handles the separator between flags and positional arguments. It ends the flag-processing loop and leaves all remaining arguments in $@ unchanged.
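
Putting the pieces together, here is a minimal end-to-end sketch of the pattern, wrapped in a function so it can be tried out in isolation. The function name greet, the default value "world" and the sample arguments are assumptions made for this example; it also assumes the util-linux version of getopt:

```shell
#!/bin/bash

# Hypothetical example: parse flags with getopt, then use what is left
# in "$@" as positional arguments.
greet() {
  local args arg name="world" opt_expand=0
  args=$(getopt -n "greet" -o n:x -l name:,expand -- "$@") || return 1
  eval set -- "$args"

  while true; do
    case "$1" in
      -n|--name)   name="$2"; shift 2 ;;
      -x|--expand) opt_expand=1; shift ;;
      --)          shift; break ;;
      *)           echo "Unexpected argument: $1" >&2; return 1 ;;
    esac
  done

  # Only positional arguments remain at this point
  echo "name=$name expand=$opt_expand positional=$#"
  for arg in "$@"; do
    echo "arg: $arg"
  done
}

greet --name bob -x -- notes.txt "my file.txt"
```

Because getopt quotes every argument and the loop keeps "$@" quoted, positional arguments containing spaces survive intact.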

Multiple values for flags

The above approach to flag parsing expects every flag to have at most one value. Some programs may need users to provide an unknown number of values for a flag, which getopt itself doesn't handle.

There are two common solutions to this issue. The first lets users repeat the flag, once for each value. This behavior is compatible with getopt's parsing and is used by many programs. Assuming a greeter program takes any number of names to greet, they could be specified as --name bob --name john --name jane etc.

The only change needed to make this work is an adjustment of the parsing case, for example to add the names to an array internally:

names=()

while true; do
   case "$1" in
       -n|--name)
           names+=("$2")
           shift 2
           ;;  
       --)
           shift
           break
           ;;
       -*)
           echo "Unknown option: $1" >&2
           exit 1
           ;;
   esac
done

The only changes to the previous version are the globally defined array names, which holds all provided names, and the handling of the -n/--name flag, which now appends each value to this array instead of replacing the prior value. The additional -* case is a safety net that reports any flag the loop does not handle.

Specifying flags multiple times is common throughout the Linux ecosystem. For example, many programs let users repeat the -v flag to increase verbosity, like -vvv for triple verbose output.
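
Counting repeated flags works the same way as collecting values. Since getopt splits -vvv into three separate -v arguments, the loop only needs to increment a counter. This is a minimal sketch assuming the util-linux version of getopt; the function name verbosity_level is made up for this example:

```shell
#!/bin/bash

# Hypothetical helper: prints how many times -v/--verbose was given.
verbosity_level() {
  local args level=0
  args=$(getopt -n "demo" -o v -l verbose -- "$@") || return 1
  eval set -- "$args"

  while true; do
    case "$1" in
      -v|--verbose) level=$((level + 1)); shift ;;
      --)           shift; break ;;
      *)            return 1 ;;
    esac
  done

  echo "$level"
}

verbosity_level -vvv          # prints 3
verbosity_level -v --verbose  # prints 2
```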

The alternative approach to the multiple value flag problem is to bypass getopt's parsing and instead have users provide the values as a single value with some kind of delimiter, for example --names bob,john,jane. While this requires more manual processing to split the values into an array during parsing, it can be more versatile in some situations.
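
Splitting such a delimited value can be done with bash's read builtin and a temporary IFS. The following sketch assumes a comma as the delimiter and that names never contain commas themselves; the function name add_names is made up for this example:

```shell
#!/bin/bash

names=()

# Hypothetical helper: splits a comma-separated value and appends the
# parts to the global names array.
add_names() {
  local parts
  # IFS is only changed for this single read command
  IFS=',' read -r -a parts <<< "$1"
  names+=("${parts[@]}")
}

add_names "bob,john,jane"
add_names "alice"

echo "${#names[@]} names: ${names[*]}"   # prints: 4 names: bob john jane alice
```

Inside the parsing loop, the -n|--names case would then call add_names "$2" instead of assigning the value directly, which also allows mixing both styles.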

Help output

Almost any tool running in the Linux terminal is expected to provide help output describing itself, its intended usage and its options. This output is expected to be available by passing the -h or --help flag. Both see frequent use, so user-friendly scripts should provide both for completeness.

Help messages in bash are usually just a function printing lines of text using echo, with manual formatting within the strings.

print_help() {
   echo "Usage: $0 [-v] [-n NAME]" 
   echo 
   echo "Greets named users or the world"
   echo
   echo "Options:"
   echo " -h, --help        Show this help message and exit."
   echo " -n, --name NAME   Specify a name."
   echo " -v, --verbose     Enable verbose mode."
   echo " --version         Show script version."
   echo
   echo "Examples:"
   echo " $0 --name Alice   Run the script with the name Alice."
   echo " $0 -v             Run the script in verbose mode."
}

There is no predefined format for help output, but most will start with a usage definition, followed by a brief description of what the program does, supported flags and options, and some ending with example usage.

Note the use of $0 instead of a hard-coded program name. $0 holds the name the script was invoked with, so the help text names the program correctly even when the user has renamed the script (avoiding confusion).
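
As a variation on the echo-based function, the help text can also be written as a here-document, which keeps the formatting visible in the source. The sketch below additionally strips the directory part from $0 and shows one simple way to react to -h/--help; the variable name prog is an assumption made for this example:

```shell
#!/bin/bash

# Strip the directory part so the help shows "greet.sh" instead of "./bin/greet.sh"
prog="${0##*/}"

print_help() {
  cat <<EOF
Usage: $prog [-v] [-n NAME]

Greets named users or the world

Options:
 -h, --help        Show this help message and exit.
 -n, --name NAME   Specify a name.
 -v, --verbose     Enable verbose mode.
EOF
}

# Help was explicitly requested, so print it to stdout and exit successfully
case "${1:-}" in
  -h|--help)
    print_help
    exit 0
    ;;
esac
```

In a script that parses its flags with getopt, the -h|--help branch would instead go into the parsing loop, with h and help added to the -o and -l lists.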

Error cleanup

Programs sometimes encounter errors, and bash scripts are no exception. When a command in a bash script fails, bash shows the command's stderr output but, by default, continues to execute the remaining commands regardless.

Keeping a program running after an error occurred will often result in broken states, where the output of the program is not what was expected, or tasks were only partially executed.

Let's look at this problem with a sample script:

#!/bin/bash

tempdir=$(mktemp -d)
find . -maxdepth 1 -type f -name "*.log" -exec cp {} "$tempdir" \;
zip -j "$tempdir/logs.zip" "$tempdir"/*.log
mv "$tempdir/logs.zip" "$HOME/logs/"
rm -r "$tempdir"

The script is very simple: It creates a temporary directory, copies all files ending in ".log" into it, then zips them up, moves the zipped log archive to the logs directory in the user's home directory, and finally deletes the temporary directory.

Assuming there are no .log files in the current directory, the zip command will fail and not produce an output file, which in turn makes the mv command fail as there is nothing to move.

In order to control this error situation, we can set the -e shell option to make the program stop as soon as a command fails:

#!/bin/bash
set -e

tempdir=$(mktemp -d)
find . -maxdepth 1 -type f -name "*.log" -exec cp {} "$tempdir" \;
zip -j "$tempdir/logs.zip" "$tempdir"/*.log
mv "$tempdir/logs.zip" "$HOME/logs/"
rm -r "$tempdir"

Now the zip command fails and the program stops right then and there, introducing a new problem: the temporary directory is no longer removed when an error occurs.

Deferring cleanup tasks like this can be done with the trap command. Instead of a process signal, it is given the special EXIT condition, so the cleanup command runs whenever the script ends (successfully or due to errors):

#!/bin/bash
set -e

tempdir=$(mktemp -d)
trap 'rm -r "$tempdir"' EXIT

find . -maxdepth 1 -type f -name "*.log" -exec cp {} "$tempdir" \;
zip -j "$tempdir/logs.zip" "$tempdir"/*.log
mv "$tempdir/logs.zip" "$HOME/logs/"

Now the temporary directory is always removed when the script exits, no matter in which state it stopped. Deferring cleanup like this has the added benefit of moving the cleanup command right beneath the line that created the resource (the mktemp call in our example), and is important for any script that creates temporary resources, like files, directories or locks.
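
The effect of the EXIT trap can be verified with a small experiment. In the sketch below, the failing steps run in a subshell so that the surrounding script survives and can check afterwards whether the directory is gone. The false command stands in for a failing zip call, and the variable name pathfile is made up for this example:

```shell
#!/bin/bash

# Hypothetical demo: remember where the temporary directory was created,
# so the outer script can inspect it after the subshell has failed.
pathfile=$(mktemp)

(
  set -e
  tempdir=$(mktemp -d)
  trap 'rm -r "$tempdir"' EXIT
  echo "$tempdir" > "$pathfile"

  false                              # stand-in for a failing command such as zip
  echo "this line is never reached"
)
status=$?

tempdir=$(cat "$pathfile")
rm -f "$pathfile"

if [[ -d "$tempdir" ]]; then
  echo "exit status $status, temporary directory left behind"
else
  echo "exit status $status, temporary directory removed"
fi
```

Running it should report a non-zero exit status together with the removed directory, showing that the trap fired even though the subshell stopped early.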