Exit Codes
An exit code is a number between 0 and 255, which is returned by any Unix command when it returns
control to its parent process.
Other numbers can be used, but these are treated modulo 256, so exit -10 is equivalent to
exit 246, and exit 257 is equivalent to exit 1.
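The wrap-around can be checked directly. The following is a minimal sketch, not part of the original tutorial; seen_status is a hypothetical helper name, and negative values are left out because some /bin/sh implementations (dash, for example) reject them:

```shell
#!/bin/sh
# seen_status N: run a child shell that calls "exit N", then print the
# status that the parent actually receives. (Hypothetical helper.)
seen_status() {
    sh -c "exit $1"
    echo "$?"
}

echo "exit 0   is seen as $(seen_status 0)"     # 0
echo "exit 255 is seen as $(seen_status 255)"   # 255
echo "exit 256 is seen as $(seen_status 256)"   # 0
echo "exit 257 is seen as $(seen_status 257)"   # 1
```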
These can be used within a shell script to change the flow of execution depending on the success
or failure of commands executed. This was briefly introduced in Variables - Part II. Here we
shall look in more detail at the available interpretations of exit codes.
Success is traditionally represented with exit 0; failure is normally indicated with a non-zero
exit code. This value can indicate different reasons for failure.
For example, GNU grep returns 0 on success, 1 if no matches were found, and 2 for other errors
(syntax errors, nonexistent input files, etc).
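These three grep outcomes can be seen with a short sketch. It assumes the mktemp utility is available (it is common, but not guaranteed everywhere), and grep_status is a hypothetical helper name used only for this illustration:

```shell
#!/bin/sh
# grep_status PATTERN FILE: run grep quietly and print only its exit code.
grep_status() {
    grep "$1" "$2" > /dev/null 2>&1
    echo "$?"
}

# Scratch file with known contents (assumes mktemp exists).
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

echo "match found:  $(grep_status alpha "$tmpfile")"          # 0
echo "no match:     $(grep_status gamma "$tmpfile")"          # 1
echo "missing file: $(grep_status alpha "$tmpfile.missing")"  # 2

rm -f "$tmpfile"
```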
We shall look at three different methods for checking error status, and discuss the pros and
cons of each approach.
Firstly, the simple approach:
#!/bin/sh
# First attempt at checking return codes
USERNAME=`grep "^${1}:" /etc/passwd|cut -d":" -f1`
if [ "$?" -ne "0" ]; then
echo "Sorry, cannot find user ${1} in /etc/passwd"
exit 1
fi
NAME=`grep "^${1}:" /etc/passwd|cut -d":" -f5`
HOMEDIR=`grep "^${1}:" /etc/passwd|cut -d":" -f6`
echo "USERNAME: $USERNAME"
echo "NAME: $NAME"
echo "HOMEDIR: $HOMEDIR"
This script works fine if you supply a username which exists in /etc/passwd.
However, if you enter an invalid username, it does not do what you might at first
expect - it keeps running, and just shows:
USERNAME:
NAME:
HOMEDIR:
Why is this? As mentioned, the $? variable is set to the return code of the last executed
command. In this case, that is cut, the final command in the pipeline. cut had no problems
which it feels like reporting: it exits non-zero only for problems of its own (such as invalid
options or an unreadable file), not because its input was empty. It was fed an empty string,
and did its job - returned the first field of its input, which just happened to be the empty
string.
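The effect is easy to reproduce without touching the real /etc/passwd. In this sketch a single made-up passwd-style line stands in for the file, and lookup_status is a hypothetical helper name; it assumes a plain /bin/sh with no pipefail-style option enabled:

```shell
#!/bin/sh
# lookup_status USER: repeat the grep|cut pipeline from the first attempt
# against one sample line, then print the status that $? reports.
lookup_status() {
    FOUND=$(printf 'root:x:0:0:root:/root:/bin/sh\n' | grep "^${1}:" | cut -d":" -f1)
    echo "$?"
}

# grep fails for the missing user, but $? only reflects cut.
echo "existing user: $(lookup_status root)"        # 0
echo "missing user:  $(lookup_status nosuchuser)"  # 0
```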
So what do we do? If we have an error here, grep will report it, not cut.
Therefore, we have to test grep's return code, not cut's.
#!/bin/sh
# Second attempt at checking return codes
grep "^${1}:" /etc/passwd > /dev/null 2>&1
if [ "$?" -ne "0" ]; then
echo "Sorry, cannot find user ${1} in /etc/passwd"
exit 1
fi
USERNAME=`grep "^${1}:" /etc/passwd|cut -d":" -f1`
NAME=`grep "^${1}:" /etc/passwd|cut -d":" -f5`
HOMEDIR=`grep "^${1}:" /etc/passwd|cut -d":" -f6`
echo "USERNAME: $USERNAME"
echo "NAME: $NAME"
echo "HOMEDIR: $HOMEDIR"
This fixes the problem for us, though at the expense of slightly longer code.
That is the basic way which textbooks might show you, but it is far from being
all there is to know about error-checking in shell scripts. This method may not
be the most suitable to your particular command-sequence, or may be unmaintainable. Below, we shall
investigate two alternative approaches.
As a second approach, we can tidy this somewhat by putting the test into a separate function,
instead of littering the code with lots of 4-line tests:
#!/bin/sh
# A Tidier approach
check_errs()
{
# Function. Parameter 1 is the return code
# Para. 2 is text to display on failure.
if [ "${1}" -ne "0" ]; then
echo "ERROR # ${1} : ${2}"
# as a bonus, make our script exit with the right error code.
exit ${1}
fi
}
### main script starts here ###
grep "^${1}:" /etc/passwd > /dev/null 2>&1
check_errs $? "User ${1} not found in /etc/passwd"
USERNAME=`grep "^${1}:" /etc/passwd|cut -d":" -f1`
check_errs $? "Cut returned an error"
echo "USERNAME: $USERNAME"
check_errs $? "echo returned an error - very strange!"
This allows us to test for errors 3 times, with customised error messages,
without having to write 3 individual tests. By writing the test routine once,
we can call it as many times as we wish, creating a more intelligent script, at
very little expense to the programmer. Perl programmers will recognise this
as being similar to the die command in Perl.
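To watch check_errs pass the failing code on as the exit status, it can be called inside a ( ) subshell, so that only the subshell exits. This is a small sketch reusing the function from above; the true and false commands simply stand in for real work:

```shell
#!/bin/sh
check_errs()
{
    # Parameter 1 is the return code, parameter 2 is the text to display.
    if [ "${1}" -ne "0" ]; then
        echo "ERROR # ${1} : ${2}"
        exit "${1}"
    fi
}

# "false" always fails with status 1, so check_errs prints and exits.
( false; check_errs $? "false reported failure" )
echo "subshell exited with: $?"    # 1

# "true" always succeeds, so check_errs does nothing.
( true; check_errs $? "this message is never printed" )
echo "subshell exited with: $?"    # 0
```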
As a third approach, we shall look at a simpler and cruder method. I tend to
use this for building Linux kernels - simple automations which, if they go well,
should just get on with it, but when things go wrong, tend to require the operator
to do something intelligent (i.e., that which a script cannot do!):
#!/bin/sh
cd /usr/src/linux && \
make dep && make bzImage && make modules && make modules_install && \
cp arch/i386/boot/bzImage /boot/my-new-kernel && cp System.map /boot && \
echo "Your new kernel awaits, m'lord."
This script runs through the various tasks involved in building a Linux
kernel (which can take quite a while), and uses the && operator to check for success.
To do this with if would involve:
#!/bin/sh
cd /usr/src/linux
if [ "$?" -eq "0" ]; then
  make dep
  if [ "$?" -eq "0" ]; then
    make bzImage
    if [ "$?" -eq "0" ]; then
      make modules
      if [ "$?" -eq "0" ]; then
        make modules_install
        if [ "$?" -eq "0" ]; then
          cp arch/i386/boot/bzImage /boot/my-new-kernel
          if [ "$?" -eq "0" ]; then
            cp System.map /boot/
            if [ "$?" -eq "0" ]; then
              echo "Your new kernel awaits, m'lord."
            fi
          fi
        fi
      fi
    fi
  fi
fi
... which I, personally, find pretty difficult to follow.
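The way a && chain stops at the first failure can be tried out without building a kernel. In this sketch the step function is a hypothetical stand-in for the real make and cp commands, rigged so that the second step fails:

```shell
#!/bin/sh
# step NAME: a stand-in for a real build command (hypothetical helper).
# It announces itself and fails only when NAME is "step2".
step() {
    echo "running $1"
    [ "$1" != "step2" ]
}

run_chain() {
    # step3 and the final echo are skipped once step2 fails.
    step step1 && step step2 && step step3 && echo "all done"
    echo "chain status: $?"
}

run_chain
# Prints:
#   running step1
#   running step2
#   chain status: 1
```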
The && and || operators are the shell's equivalent of AND and OR
tests. These can be thrown together in strings, as above, or:
#!/bin/sh
cp /foo /bar && ( echo Success ; echo Success part II ) || ( echo Failed ; echo Failed part II )
This code will either echo
Success
Success part II
or
Failed
Failed part II
depending on whether or not the cp command was successful. Look carefully at this;
the construct is
command && command-to-execute-on-success || command-to-execute-on-failure
Only one command can be in each part, though the ( ) brackets make a subshell, which
is treated as a single command by the top-level shell.
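One consequence of ( ) being a subshell is that variable changes made inside it do not reach the rest of the script. The sketch below contrasts it with { } grouping, which also counts as a single command but runs in the current shell; the COUNT variable is purely illustrative:

```shell
#!/bin/sh
COUNT=0

# The assignment happens in a subshell and is lost when it ends.
true && ( COUNT=1 )
echo "after ( ): COUNT=$COUNT"    # COUNT=0

# The assignment happens in the current shell and is kept.
true && { COUNT=2; }
echo "after { }: COUNT=$COUNT"    # COUNT=2
```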
This method is handy for simple success / fail scenarios, but it is not a true if / else:
the failure part also runs if the success part itself returns non-zero. If you want to
check on the status of the echo commands themselves, it is easy to quickly
become confused about which && and || applies to which
command. It is also very difficult to maintain. Therefore this construct is only
recommended for simple sequencing of commands.
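A short sketch of how the && / || form can differ from a real if / else. Here true plays the part of a command which succeeds, and the success branch is deliberately made to end with false; the function names are hypothetical:

```shell
#!/bin/sh
with_operators() {
    # The success branch returns non-zero, so the || part runs as well.
    true && ( echo "Success" ; false ) || echo "Failed"
}

with_if() {
    # With if / else, only one branch can ever run.
    if true; then
        ( echo "Success" ; false )
    else
        echo "Failed"
    fi
}

echo "--- && / || ---"
with_operators    # prints Success, then Failed
echo "--- if / else ---"
with_if           # prints Success only
```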
Source: http://steve-parker.org/sh/exitcodes.shtml