Bash scripts are an easy way to automate tasks or turn a complex sequence of commands into a single command. Command-line users have come to expect a number of conveniences from scripts running in the terminal, and these occasionally escape the attention of script authors.
Ensuring dependencies
Almost all bash scripts require external tools to run properly, and many simply assume the commands are available. When a needed command is not present, the script runs up to the point where the program is called and then fails, leaving the process in a broken or unknown state. To prevent this, bash scripts should use the command builtin to check that a needed program is available before doing any work:
#!/bin/bash
if ! command -v "awk" &> /dev/null; then
    echo "Error: 'awk' is not installed or not in the PATH."
    exit 1
fi
This can also be written in a more condensed form:
command -v "awk" &> /dev/null || { echo "Error: 'awk' is not installed or not in the PATH."; exit 1; }
Since repeating the line for every required program is tedious, you can turn it into a function:
require() {
    command -v "$1" &> /dev/null || { echo "Error: '$1' is not installed or not in the PATH."; exit 1; }
}
require "awk"
require "cal"
require "grep"
Keeping this at the top of the file not only aids readability (as it looks similar to import or require statements from other programming languages), but also ensures users are not caught off-guard by missing programs in the middle of running the script.
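One possible refinement, offered as a sketch rather than as part of the approach above: a hypothetical require_all helper that checks every dependency first and reports all missing commands in a single message, so users do not have to discover them one run at a time. It returns a non-zero status instead of exiting, leaving the decision to the caller.

```bash
#!/bin/bash
# Hypothetical helper (not from the article): collect all missing commands
# and report them together instead of stopping at the first one.
require_all() {
    local missing=()
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || missing+=("$cmd")
    done
    if (( ${#missing[@]} > 0 )); then
        echo "Error: missing required commands: ${missing[*]}" >&2
        return 1
    fi
}

# Typical use at the top of a script:
#   require_all awk cal grep || exit 1
```

Because the function only returns a status, it can also be used in conditions, for example to fall back to a different tool when an optional dependency is absent.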
Handling flags
Bash scripts often need data or provide options that can be set through flags from the terminal. For programs offering flags, Linux users expect some conventions:
- There are shorthand flags like -x and long variants like --expand. All options should have a long variant, with shorthand flags serving as shortcuts for commonly used options.
- Flags may or may not have values. Flags without values, like -v, serve as boolean switches (present = option on, absent = off). A flag's value can be passed either as the next argument, separated by a space, like --name bob, or as part of the same argument with an = sign, like --name=bob. Short flags may omit the whitespace, like -n4 instead of -n 4.
- Short flags can be combined, for example -xeu instead of -x -e -u.
- Positional arguments are allowed alongside flags, which may cause confusion. Given -n bob -l, it may not be clear whether "bob" is the value of the flag -n or a positional argument. For clarity, the Linux community has adopted the approach of separating positional arguments from flags with --, for example -n -l -- bob, making it much clearer which belong together.
Programs may of course choose to honor all, some or none of these conventions, but most Linux software does, so users will expect them in many cases.
Most of these conventions can be adopted cheaply by using the getopt program (not bash's builtin getopts command!) to parse arguments:
ARGS=$(getopt -n "$0" -o n:x -l name:,expand -- "$@")
if [[ $? -ne 0 ]]; then
    echo "Error: Invalid options." >&2
    exit 1
fi
eval set -- "$ARGS"
The syntax looks difficult at first, but is quite simple: -n sets the name getopt uses in its error messages to the current script's name, -o is followed by the list of all short flags, and -l takes all long flag names. Short flags are written one after another without a separator, while long flag names are separated by commas.
Flag names can end with no colon (e.g. "v", meaning flag -v has no value), one colon (e.g. "n:", meaning flag -n takes one value) or two colons (e.g. "n::", meaning flag -n may take a value). The two-colon form is best avoided, as it has special requirements: for example, values can only be given attached to the flag, as in --name=bob, not separated by a space.
Since getopt takes care of argument parsing and validation, the if condition exits with an error when unknown or malformed flags were provided. This helps users catch typos, because an error is printed for every unknown flag. It can be frustrating to run a command that looks correct, only to find out that one flag was quietly ignored because of a spelling error. Pointing invalid options out to users prevents these issues.
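To make the effect of getopt more tangible, the following sketch feeds it a hard-coded, hypothetical argument list (instead of "$@") that mixes the conventions described earlier, then prints the normalized result one argument per line. It assumes the enhanced getopt from util-linux, as shipped with most Linux distributions.

```bash
#!/bin/bash
# Hypothetical input: combined short flags with an attached value (-xn4),
# a positional argument in the middle, and a long flag using "=".
normalized=$(getopt -n demo -o n:x -l name:,expand -- -xn4 file.txt --name=bob)

# Replace the positional parameters with the normalized list.
eval set -- "$normalized"
parsed=("$@")

printf '%s\n' "${parsed[@]}"
```

Every flag and every value ends up as a separate argument, and the positional argument is moved behind the -- separator. This uniform shape is what allows the parsing loop to stay simple.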
With the code above parsing the command-line arguments, you can then loop through the newly set script arguments:
while true; do
    case "$1" in
        -n|--name)
            name="$2"
            shift 2
            ;;
        -x|--expand)
            export opt_expand=1
            shift
            ;;
        --)
            shift
            break
            ;;
    esac
done
The first two cases simply handle the flags we defined above, with -n/--name shifting the arguments by two places because it consumes both the flag and the next parameter as its value.
The third case, --, handles the separator between flags and positional arguments: it ends the flag-processing loop, leaving all remaining arguments in $@ unchanged.
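Putting the pieces together, a minimal sketch might wrap the getopt call and the loop in a function and keep the leftover positional arguments in an array. The function name parse_args and the positional array are assumptions made for this illustration, and the example call uses hard-coded arguments.

```bash
#!/bin/bash
parse_args() {
    local args
    # "local args" is declared separately so that getopt's exit status
    # is not masked by the "local" builtin.
    args=$(getopt -n "$0" -o n:x -l name:,expand -- "$@") || return 1
    eval set -- "$args"

    name=""
    opt_expand=0
    while true; do
        case "$1" in
            -n|--name)   name="$2"; shift 2 ;;
            -x|--expand) opt_expand=1; shift ;;
            --)          shift; break ;;
        esac
    done

    # Everything left after "--" is a positional argument.
    positional=("$@")
}

# Hard-coded example call; a real script would use: parse_args "$@" || exit 1
parse_args -x --name=bob first.txt second.txt
echo "name=$name expand=$opt_expand files=${positional[*]}"
```

Keeping the positional arguments in an array preserves file names that contain spaces, which a plain string variable would not.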
Multiple values for flags
The above approach to flag parsing expects every flag to have at most one value. Some programs may need users to provide an unknown number of values for a flag, which getopt itself doesn't handle.
There are two common solutions to this issue. The first lets users provide a flag multiple times, once for each value. This behavior is compatible with getopt's parsing and is used by many programs. Assuming a greeter program takes any number of names to greet, they could be specified as --name bob --name john --name jane, and so on.
The only change needed to make this work is an adjustment of the parsing case, for example to add the names to an array internally:
names=()
while true; do
    case "$1" in
        -n|--name)
            names+=("$2")
            shift 2
            ;;
        --)
            shift
            break
            ;;
        -*)
            echo "Unknown option: $1" >&2
            exit 1
            ;;
    esac
done
The main changes from the previous version are the globally defined array names, which holds all provided names, and the handling of the -n/--name flag, which appends each value to this array instead of replacing the prior value. The additional -*) case rejects any flag the loop does not handle.
Specifying flags multiple times is common throughout the Linux ecosystem. For example, many programs let users repeat the -v flag to increase the degree of verbosity, like -vvv for triple verbose output.
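Counting repeated flags works with the same kind of loop. The sketch below assumes a hypothetical -v/--verbose flag and a hard-coded argument list; getopt splits the combined -vv into two separate -v arguments, so the loop only needs to increment a counter.

```bash
#!/bin/bash
verbosity=0

# Hard-coded example arguments; a real script would pass "$@" instead.
args=$(getopt -n demo -o v -l verbose -- -vv --verbose)
eval set -- "$args"

while true; do
    case "$1" in
        -v|--verbose)
            # Each occurrence of the flag raises the level by one.
            verbosity=$((verbosity + 1))
            shift
            ;;
        --)
            shift
            break
            ;;
    esac
done

echo "verbosity level: $verbosity"
```

The assignment form verbosity=$((verbosity + 1)) is used instead of ((verbosity++)) because the latter returns a failing status when the old value is 0, which would abort scripts running with set -e.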
The alternative approach to the multiple-value flag problem is to work around getopt's one-value-per-flag parsing and instead have users provide all values as a single value joined by some kind of delimiter, for example --names bob,john,jane. While this requires more manual processing to split the value into an array during parsing, it can be more versatile in some situations.
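Splitting such a delimited value can be done with the read builtin. This is a minimal sketch with a hard-coded value standing in for whatever the parsing loop received for the hypothetical --names flag; it assumes the names themselves contain no commas or newlines.

```bash
#!/bin/bash
names_value="bob,john,jane"   # stand-in for the value of --names

# Setting IFS only for this single read command splits the value on commas
# and stores the pieces in the array "names".
IFS=',' read -r -a names <<< "$names_value"

for person in "${names[@]}"; do
    echo "Hello, $person!"
done
```

Since IFS is changed only for the read command itself, word splitting in the rest of the script is unaffected.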
Help output
Almost any tool running on the Linux terminal is expected to provide help output describing itself, its intended usage and its possible options. This output is expected to be available by passing the -h or --help flag. Both see frequent usage, so user-friendly scripts should provide both for completeness.
Help messages in bash are usually just a function printing lines of text using echo, with manual formatting within the strings.
print_help() {
    echo "Usage: $0 [-v] [-n NAME]"
    echo
    echo "Greets named users or the world"
    echo
    echo "Options:"
    echo "  -h, --help        Show this help message and exit."
    echo "  -n, --name NAME   Specify a name."
    echo "  -v, --verbose     Enable verbose mode."
    echo "      --version     Show script version."
    echo
    echo "Examples:"
    echo "  $0 --name Alice   Run the script with the name Alice."
    echo "  $0 -v             Run the script in verbose mode."
}
There is no predefined format for help output, but most start with a usage line, followed by a brief description of what the program does and the supported flags and options; some end with example usage.
Note the use of $0 instead of a hard-coded program name. Using a variable here ensures that the program is always named correctly, even when the user has renamed the script (avoiding confusion).
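For the help to be reachable, -h and --help have to be registered with getopt and handled in the parsing loop. The following is a sketch under assumptions: it uses a shortened print_help and a hypothetical main function, and it returns instead of exiting so that it can be tried out in an interactive shell.

```bash
#!/bin/bash
print_help() {
    echo "Usage: $0 [-h] [-n NAME]"
}

main() {
    local args name=""
    args=$(getopt -n "$0" -o hn: -l help,name: -- "$@") || return 1
    eval set -- "$args"

    while true; do
        case "$1" in
            -h|--help)
                # Help was requested explicitly: print it and report success.
                print_help
                return 0
                ;;
            -n|--name)
                name="$2"
                shift 2
                ;;
            --)
                shift
                break
                ;;
        esac
    done

    echo "Hello, ${name:-world}!"
}

main --help
main --name Alice
```

Handling the help flag inside the loop means it takes effect as soon as it is seen, before any of the script's actual work starts.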
Error cleanup
Programs sometimes encounter errors, and bash scripts are no exception to this rule. When a command in a bash script fails, the command's stderr output is shown, but by default the script continues to execute the remaining commands regardless.
Keeping a program running after an error occurred will often result in broken states, where the output of the program is not what was expected, or tasks were only partially executed.
Let's look at this problem with a sample script:
#!/bin/bash
tempdir=$(mktemp -d)
find . -maxdepth 1 -type f -name "*.log" -exec cp {} "$tempdir" \;
zip -j "$tempdir/logs.zip" "$tempdir"/*.log
mv "$tempdir/logs.zip" "$HOME/logs/"
rm -r "$tempdir"
The script is very simple: it creates a temporary directory, copies all files ending in ".log" into it, zips them up, moves the zipped log archive to the logs directory in the user's home directory, and finally deletes the temporary directory.
Assuming there are no .log files in the current directory, the zip command will fail and not produce an output file, which in turn makes the mv command fail as there is nothing to move.
In order to control this error situation, we can set the -e shell option to make the program stop as soon as a command fails:
#!/bin/bash
set -e
tempdir=$(mktemp -d)
find . -maxdepth 1 -type f -name "*.log" -exec cp {} "$tempdir" \;
zip -j "$tempdir/logs.zip" "$tempdir"/*.log
mv "$tempdir/logs.zip" "$HOME/logs/"
rm -r "$tempdir"
Now the zip command fails and the program stops right then and there, introducing a new problem: the temporary directory is no longer removed in case of errors.
Deferring cleanup tasks like this can be done with the trap command. Instead of a process signal, it is given the special EXIT condition, so the cleanup command runs whenever the script ends (successfully or due to errors):
#!/bin/bash
set -e
tempdir=$(mktemp -d)
trap 'rm -r "$tempdir"' EXIT
find . -maxdepth 1 -type f -name "*.log" -exec cp {} "$tempdir" \;
zip -j "$tempdir/logs.zip" "$tempdir"/*.log
mv "$tempdir/logs.zip" "$HOME/logs/"
Now the temporary directory is always removed when the script exits, no matter in which state it stopped. Deferring cleanup like this has the added benefit of moving the cleanup command right beneath the line that created the resource (the mktemp call in our example), and is important for any script that creates temporary resources, like files, directories or locks.
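The behavior can be checked with a small experiment. This sketch runs a deliberately failing script in a child bash process and then tests whether its temporary directory survived; the marker file is a hypothetical helper used only to pass the directory path back to the parent.

```bash
#!/bin/bash
marker=$(mktemp)   # hypothetical helper file for reporting the temp dir path
status=0

# The child script is read from the here-document and gets the marker path as $1.
bash -s "$marker" <<'EOF' || status=$?
set -e
tempdir=$(mktemp -d)
trap 'rm -r "$tempdir"' EXIT
echo "$tempdir" > "$1"
false                # simulated failure: set -e ends the script here
echo "not reached"
EOF

child_tempdir=$(cat "$marker")
rm -f "$marker"

if [[ -d "$child_tempdir" ]]; then
    echo "temporary directory was left behind"
else
    echo "temporary directory was cleaned up (exit status $status)"
fi
```

The child still reports a failing exit status, so callers can detect the error even though the cleanup ran.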