Bash error codes - Fixing errors and finding optimal solutions to problems

Automation and monitoring tools are convenient because a developer can take ready-made scripts, adapt them if necessary, and use them in a project. You may notice that some scripts use exit codes and others do not. Exit codes are easy to forget, but they are a very useful tool, and they matter most in command-line scripts.

What exit codes are

In Linux and other Unix-like operating systems, a program can pass a value to its parent process when it terminates. This value is called the exit code or exit status. By POSIX convention, a program returns 0 on success and 1 or a higher number on failure.

Why does this matter? In the context of command-line scripts the answer is obvious. Any useful Bash script will inevitably be called from other scripts or wrapped in a Bash one-liner. This is especially true with automation tools such as SaltStack or monitoring tools such as Nagios. These programs run a script and check its exit status to determine whether the run succeeded.

Moreover, your scripts have exit codes even if you never define them. Without correctly defined exit codes, however, you can run into problems: false reports of success that can affect how the script is used.

What happens when exit codes are not defined

In Linux, every command run on the command line has an exit code. If no exit code is defined, a Bash script exits with the exit code of the last command it ran. The following example illustrates this.

#!/bin/bash
touch /root/test
echo created file

This script runs the touch and echo commands. If you run it without superuser privileges, the touch command fails. At that point we would like the exit code to report the error. To check the exit code, print the special variable $?, which holds the return code of the last command that ran.

$ ./tmp.sh 
touch: cannot touch '/root/test': Permission denied
created file
$ echo $?
0

As you can see, running ./tmp.sh gives an exit code of 0. This code signals success, although the touch command actually failed. The script above runs two commands, touch and echo. Because no exit code is defined, we get the exit code of the last command that ran: echo, which succeeded.

Script:

#!/bin/bash
touch /root/test

If you remove the echo command from the script, you get the exit code of the touch command.

$ ./tmp.sh 
touch: cannot touch '/root/test': Permission denied
$ echo $?
1

Because touch is now the last command that ran, and it failed, we get a return code of 1.

How to use exit codes in Bash scripts

Removing the echo command from the script let us see the exit code. But what if we need to take different actions depending on whether touch succeeds or fails? Specifically, printing to stdout on success and to stderr on failure.

Checking exit codes

Above we used the special variable $? to get the exit code of the script. The same variable can tell us whether the touch command succeeded.

#!/bin/bash
touch /root/test 2> /dev/null
if [ $? -eq 0 ]
then
  echo "Successfully created file"
else
  echo "Could not create file" >&2
fi

After refactoring, the script behaves as follows:

If touch exits with code 0, the script uses echo to report that the file was created.
If touch exits with any other code, the script reports that it could not create the file.

Any exit code other than 0 means the attempt to create the file failed. The script uses echo to send the failure message to stderr.

Execution:

$ ./tmp.sh
Could not create file
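
The same check can also be written without $?, because if tests the exit status of whatever command follows it. The following is a minimal sketch rather than part of the original script: the helper name create_file is hypothetical, and /nonexistent-dir is assumed not to exist, so touch fails regardless of privileges.

```shell
#!/bin/bash
# Hypothetical helper: try to create a file and report the outcome.
create_file() {
  # "if" branches on the exit status of touch itself; no $? lookup is needed.
  if touch "$1" 2> /dev/null
  then
    echo "Successfully created file"
  else
    echo "Could not create file" >&2
  fi
}

# /nonexistent-dir is assumed not to exist, so this call takes the else branch.
create_file /nonexistent-dir/test
```

Like the script above, this version still finishes with status 0 after printing the error, because the last command it runs is a successful echo.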

Creating your own exit code

Our script already reports an error when the touch command fails. But even when the command fails, the script itself still exits with code 0.

$ ./tmp.sh
Could not create file
$ echo $?
0

Because the script failed, passing a success exit code to another program that uses this script would be a bad idea. To set your own exit code, use the exit command.

#!/bin/bash
touch /root/test 2> /dev/null
if [ $? -eq 0 ]
then
  echo "Successfully created file"
  exit 0
else
  echo "Could not create file" >&2
  exit 1
fi

Now, if touch succeeds, the script reports success with echo and exits with code 0. Otherwise, it prints an error message about the failed attempt to create the file and exits with code 1.

Execution:

$ ./tmp.sh
Could not create file
$ echo $?
1
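
Because every command overwrites $?, a script that wants to log something first and pass the original status on later has to save it right away. A possible sketch, assuming a hypothetical helper named try_create and a path under /nonexistent-dir that is assumed not to exist:

```shell
#!/bin/bash
# Hypothetical helper: save touch's exit status before anything overwrites it.
try_create() {
  status=0
  touch "$1" 2> /dev/null || status=$?           # capture the status immediately
  echo "touch exited with status $status" >&2   # this echo resets $? to 0
  return "$status"                              # hand the saved status to the caller
}

# /nonexistent-dir is assumed not to exist, so touch fails here.
try_create /nonexistent-dir/test || echo "try_create returned $?"
```

Inside a function, return plays the role that exit plays for the whole script: it sets the status the caller sees.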

How to use exit codes on the command line

The script can now tell users and programs whether it succeeded or failed. It can be used with other administration tools or in command-line one-liners.

Bash one-liner:

$ ./tmp.sh && echo "bam" || (sudo ./tmp.sh && echo "bam" || echo "fail")
Could not create file
Successfully created file
bam

In the example above, && stands for "and" and || stands for "or". The command runs the script ./tmp.sh and then runs echo "bam" if the exit code is 0. If the exit code is non-zero, the command in parentheses runs instead. As you can see, && and || are used again inside the parentheses, which group the commands.
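
One caveat with this idiom: in A && B || C, C runs when A fails and also when A succeeds but B fails, so it is not a strict if/else. A small sketch with two hypothetical helpers, chain and branch, makes the difference visible:

```shell
#!/bin/bash
# Hypothetical helpers: both run command $1, then $2, with a fallback message.
chain()  { "$1" && "$2" || echo "fallback"; }
branch() { if "$1"; then "$2"; else echo "fallback"; fi; }

# The first command (true) succeeds and the second (false) fails.
echo "chain:  '$(chain true false)'"    # chain:  'fallback'
echo "branch: '$(branch true false)'"   # branch: ''
```

When the fallback must run only if the first command fails, the explicit if form is the safer choice.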

The script uses exit codes to tell whether a command succeeded. If exit codes are used incorrectly, the user of the script may get unexpected results when a command fails.

Additional exit codes

The exit command accepts numbers from 0 to 255. In most cases codes 0 and 1 are enough. However, there are reserved codes that denote specific errors. The list of reserved codes can be found in the documentation.
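
When 0 and 1 are not enough, a script can assign its own meanings to other small codes and let the caller branch on them. The sketch below is only an illustration: the names check_file, describe, E_USAGE and E_MISSING and the values 3 and 4 are hypothetical choices, not a standard.

```shell
#!/bin/bash
# Hypothetical status codes chosen for this sketch; they are not a standard.
E_USAGE=3
E_MISSING=4

# Return 0 if exactly one argument is given and it names an existing path.
check_file() {
  [ "$#" -eq 1 ] || return "$E_USAGE"
  [ -e "$1" ]    || return "$E_MISSING"
  return 0
}

# Translate the status of check_file into a message.
describe() {
  rc=0
  check_file "$@" || rc=$?
  case $rc in
    0)            echo "ok" ;;
    "$E_USAGE")   echo "usage error" ;;
    "$E_MISSING") echo "file missing" ;;
    *)            echo "unexpected status $rc" ;;
  esac
}

describe /                       # ok
describe                         # usage error
describe /nonexistent-dir/file   # file missing
```

Keeping custom codes well below 126 avoids the values the shell uses for its own purposes.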

Adapted translation of the article Understanding Exit Codes and how to use them in bash scripts by Benjamin Cane. The opinion of the Hexlet administration may differ from that of the author of the original publication.

To a first approximation, 0 is success, non-zero is failure, with 1 being general failure, and anything larger than one being a specific failure. Aside from the trivial exceptions of false and test, which are both designed to give 1 for success, there are a few other exceptions I found.

More realistically, 0 means success or maybe failure, 1 means general failure or maybe success, 2 means general failure if 1 and 0 are both used for success, but maybe success as well.

The diff command gives 0 if files compared are identical, 1 if they differ, and 2 if binaries are different. 2 also means failure. The less command gives 1 for failure unless you fail to supply an argument, in which case, it exits 0 despite failing.

The more command and the spell command give 1 for failure, unless the failure is a result of permission denied, nonexistent file, or attempt to read a directory. In any of these cases, they exit 0 despite failing.

Then the expr command exits 0 when the expression evaluates to a non-empty, non-zero value and 1 when it evaluates to the empty string or zero, even though neither case is an error. 2 and 3 are failure.

Then there are cases where success or failure is ambiguous. When grep fails to find a pattern, it exits 1, but it exits 2 for a genuine failure (like permission denied). klist also exits 1 when it fails to find a ticket, although this isn’t really any more of a failure than when grep doesn’t find a pattern, or when you ls an empty directory.

So, unfortunately, the Unix powers that be don’t seem to enforce any logical set of rules, even on very commonly used executables.
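
For commands like grep, a script therefore has to distinguish three outcomes rather than two. A minimal sketch, assuming a hypothetical helper named search and a scratch file created with mktemp:

```shell
#!/bin/bash
# Hypothetical helper: classify grep's three-way exit status.
search() {
  rc=0
  grep -q -- "$1" "$2" 2> /dev/null || rc=$?
  case $rc in
    0) echo "found" ;;
    1) echo "not found" ;;            # not an error, just no match
    *) echo "error (status $rc)" ;;   # e.g. the file cannot be read
  esac
}

sample=$(mktemp)                      # scratch file for the demonstration
printf 'alpha\nbeta\n' > "$sample"
search alpha "$sample"                # found
search gamma "$sample"                # not found
search alpha /nonexistent-dir/file    # takes the error branch
rm -f "$sample"
```

Treating every non-zero status as a failure would report "no match" as an error, which is rarely what the caller wants.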

The exit status is a numeric value that is returned by a program to the calling program or shell. In C programs, this is represented by the return value of the main() function or the value you give to exit(3). The only part of the number that matters is the least significant 8 bits, which means the only possible values are 0 to 255.

In the shell, every operation generates an exit status (return status), even if no program is called. An example for such an operation is a redirection.
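
A failed redirection is easy to observe. The following sketch assumes a hypothetical helper named redirect_status and a path under /nonexistent-dir that is assumed not to exist. It performs the redirection in a subshell so that a shell that aborts on redirection errors does not take the script down with it.

```shell
#!/bin/bash
# Hypothetical helper: print the status of a bare redirection to path $1.
redirect_status() {
  rc=0
  # No external program runs here; ":" is a shell builtin that does nothing.
  ( : > "$1" ) 2> /dev/null || rc=$?
  echo "$rc"
}

scratch=$(mktemp)
echo "writable path:   $(redirect_status "$scratch")"              # 0
echo "unwritable path: $(redirect_status /nonexistent-dir/file)"   # non-zero
rm -f "$scratch"
```

The exact non-zero value for the failed redirection depends on the shell, so the sketch only distinguishes zero from non-zero.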

The parameter to the exit and return builtin commands serves the purpose of giving the exit status to the calling component.

This — and only this — makes it possible to determine the success or failure of an operation. For scripting, always set exit codes.

The code is a number between 0 and 255, where the part from 126 to 255 is reserved to be used by the Bash shell directly or for special purposes, like reporting a termination by a signal:

Code	Description
0	success
1-255	failure (in general)
126	the requested command (file) can’t be executed (but was found)
127	command (file) not found
128	according to ABS it’s used to report an invalid argument to the exit builtin, but I wasn’t able to verify that in the source code of Bash (see code 255)
128 + N	the shell was terminated by the signal N (also used like this by various other programs)
255	wrong argument to the exit builtin (see code 128)

The lower codes 0 to 125 are not reserved and may be used for whatever the program likes to report. A value of 0 means successful termination, a value not 0 means unsuccessful termination. This behavior (== 0, != 0) is also what Bash reacts on in some code flow control statements like if or while.
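
Three of the reserved codes from the table can be reproduced with a few lines of shell. This is a sketch under stated assumptions: status_of is a hypothetical helper, /nonexistent-dir is assumed not to exist, and the file created by mktemp is assumed to have no execute permission (the usual default).

```shell
#!/bin/bash
# Hypothetical helper: run a command in a subshell and print its exit status.
status_of() {
  rc=0
  ( "$@" ) > /dev/null 2>&1 || rc=$?
  echo "$rc"
}

script=$(mktemp)   # exists, but has no execute permission
echo "not found:      $(status_of /nonexistent-dir/no-such-command)"   # 127
echo "not executable: $(status_of "$script")"                          # 126
echo "killed by TERM: $(status_of sh -c 'kill -TERM $$')"              # 143 = 128 + 15
rm -f "$script"
```

The last line starts a child shell that sends SIGTERM (signal 15) to itself, which is why the parent sees 128 + 15.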

Tables of shell behavior involving non-portable side-effects or common bugs with exit statuses. Note heirloom doesn’t support pipeline negation (! pipeline).

test	bash 4.2.45	bash (POSIX)	zsh 5.0.2 (emulate ksh)	ksh93 93v- 2013-03-18	mksh R44 2013/02/24	posh 0.11	dash 0.5.7.3	busybox 1.2.1	heirloom 050706
:; : `false` `echo $? >&2`	1	1	1	1	0	0	0	0	1
false; eval; echo $?	0	0	0	0	0	1	0	1	0
x=`false` eval echo $?	1	1	1	1	0	0	0	0	1
eval echo $? <&0`false`	1	1	1	1	0	0	0	0	1
while :; do ! break; done; echo $?	1	1	1	1	0	0	1	1	—
false; : | echo $?	1	1	1	0	1	1	1	1	0
(exit 2); for x in "`exit 3`"; do echo $?; done	3	3	3	3	2	2	0	0	3

Measuring side-effects during the function call, during return, and transparency of the return builtin.

test	bash	bash (POSIX)	zsh (emulate ksh)	ksh93	mksh	posh	dash	busybox	heirloom
f() { echo $?; }; :; f `false`	1	1	1	1	0	0	0	0	1
f() { return; }; false; f; echo $?	1	1	1	0	1	1	1	1	1
f() { return $?; }; false; f; echo $?	1	1	1	1	1	1	1	1	1
f() { ! return; }; f; echo $?	0	0	1	0	0	0	1	1	—
f() { ! return; }; false; f; echo $?	1	1	0	0	1	1	0	0	—
f() { return; }; x=`false` f; echo $?	1	1	1	1	0	0	0	0	0
f() { return; }; f <&0`false`; echo $?	1	1	1	1	0	0	0	0	1
f() { x=`false` return; }; f; echo $?	1	1	1	0	0	0	0	0	1
f() { return <&0`false`; }; f; echo $?	1	1	1	0	0	0	0	0	1
f() { x=`false` return <&0`false`; }; f; echo $?	1	1	1	1	0	0	0	0	1

Statuses measured within the command and after, with matching and non-matching patterns.

test	bash	bash (POSIX)	zsh (emulate ksh)	ksh93	mksh	posh	heirloom
(exit 2); case x in x) echo $?;; esac	2	2	0	2	2	2	2
(exit 2); case `exit 3`x in x) echo $?;; esac	3	3	0	3	2	2	3
(exit 2); case x in `exit 4`x) echo $?;; esac	4	4	4	4	2	2	4
(exit 2); case `exit 3`x in `exit 4`x) echo $?;; esac	4	4	4	4	2	2	4
(exit 2); case x in x);; esac; echo $?	0	0	0	0	0	0	2
(exit 2); case x in "");; esac; echo $?	0	0	0	0	0	0	2
(exit 2); case `exit 3`x in x);; esac; echo $?	0	0	0	3	0	0	3
(exit 2); case `exit 3`x in "");; esac; echo $?	0	0	0	3	0	0	3
(exit 2); case x in `exit 4`x);; esac; echo $?	0	0	0	4	0	0	4
(exit 2); case x in `exit 4`);; esac; echo $?	0	0	4	4	0	0	4
(exit 2); case `exit 3`x in `exit 4`);; esac; echo $?	0	0	4	4	0	0	4
(exit 2); case `exit 3`x in `exit 4`x);; esac; echo $?	0	0	0	4	0	0	4

When you execute a command or run a script, you receive an exit code. An exit code is a system response that reports success, an error, or another condition that provides a clue about what caused an unexpected result from your command or script. Yet, you might never know about the code, because an exit code doesn’t reveal itself unless someone asks it to do so. Programmers use exit codes to help debug their code.

Note: You’ll often see exit code referred to as exit status or even as exit status codes. The terms are used interchangeably except in documentation. Hopefully my use of the two terms will be clear to you.

I’m not a programmer. It’s hard for me to admit that, but it’s true. I’ve studied BASIC, FORTRAN, and a few other languages both formally and informally, and I have to say that I am definitely not a programmer. Oh sure, I can script and program a little in PHP, Perl, Bash, and even PowerShell (yes, I’m also a Windows administrator), but I could never make a living at programming because I’m too slow at writing code and trial-and-error isn’t an efficient debugging strategy. It’s sad, really, but I’m competent enough at copying and adapting found code that I can accomplish my required tasks. And yet, I also use exit codes to figure out where my problems are and why things are going wrong.

Exit codes are useful to an extent, but they can also be vague. For example, an exit code of 1 is a generic bucket for miscellaneous errors and isn’t helpful at all. In this article, I explain the handful of reserved error codes, how they can occur, and how to use them to figure out what your problem is. A reserved error code is one that’s used by Bash and you shouldn’t create your own error codes that conflict with them.

Enough backstory. It’s time to look at examples of what generates error codes/statuses.

Extracting the elusive exit code

To display the exit code for the last command you ran on the command line, use the following command:

$ echo $?

The displayed response contains no pomp or circumstance. It’s simply a number. You might also receive a shell error message from Bash further describing the error, but the exit code and the shell error together should help you discover what went wrong.

Exit status 0

An exit status of 0 is the best possible scenario, generally speaking. It tells you that your latest command or script executed successfully. Success is relative because the exit code only informs you that the script or command executed fine, but the exit code doesn’t tell you whether the information from it has any value. Examples will better illustrate what I’m describing.

For one example, list files in your home directory. I have nothing in my home directory other than hidden files and directories, so nothing to see here, but the exit code doesn’t care about anything but the success of the command’s execution:

$ ls
$ echo $?
0

The exit code of 0 means that the ls command ran without issue. Although, again, the information from the exit code provides no real value to me.

Now, execute the ls command on the /etc directory and then display the exit code:

$ ls /etc
**Many files**
$ echo $?
0

You can see that any successful execution results in an exit code of 0, including something that’s totally wrong, such as issuing the cat command on a binary executable file like the ls command:

$ cat /usr/bin/ls
**A lot of screen gibberish and beeping**
$ echo $?
0

Exit status 1

Using the above example but adding in the long listing and recursive options (-lR), you receive a new exit code of 1:

$ ls -lR /etc
**A lengthy list of files**
$ echo $?
1

Although the command’s output looks as though everything went well, if you scroll up you will see several "Permission denied" errors in the listing. These errors result in an exit status of 1, which is described as "impermissible operations." (GNU ls itself documents exit status 1 as "minor problems", such as a subdirectory it cannot access.) Although you might expect that every "Permission denied" error leads to an exit status of 1, you’d be wrong, as you will see in the next section.

Dividing by zero, however, gives you an exit status of 1. You also receive an error from the shell to let you know that the operation you’re performing is "impermissible":

$ let a=1
$ let b=0
$ let c=a/b
-bash: let: c=a/b: division by 0 (error token is "b")
$ echo $?
1

Without a shell error, an exit status of 1 isn’t very helpful, as you can see from the first example. In the second example, you know why you received the error because Bash tells you with a shell error message. In general, when you receive an exit status of 1, look for the impermissible operations (Permission denied messages) mixed with your successes (such as listing all the files under /etc, as in the first example in this section).

Exit status 2

As stated above, a "Permission denied" error can result in an exit status of 2 rather than 1. To prove this to yourself, try listing files in /root:

$ ls /root
ls: cannot open directory '/root': Permission denied
$ echo $?
2

Exit status 2 appears when there’s a permissions problem or a missing keyword in a command or script. In the ls example above, GNU ls uses 2 for "serious trouble", such as a command-line argument it cannot access. A missing keyword example is forgetting to add a done in a script’s do loop. The best method for script debugging with this exit status is to issue your command in an interactive shell to see the errors you receive. This method generally reveals where the problem is.

Permissions problems are a little less difficult to decipher and debug than other types of problems. The only thing "Permission denied" means is that your command or script is attempting to violate a permission restriction.

Exit status 126

Exit status 126 is an interesting permissions error code. The easiest way to demonstrate when this code appears is to create a script file and forget to give that file execute permission. Here’s the result:

$ ./blah.sh
-bash: ./blah.sh: Permission denied
$ echo $?
126

This permission problem is not one of access, but one of setting, as in mode. To get rid of this error and receive an exit status of 0 instead, issue chmod +x blah.sh.

Note: You will receive an exit status of 0 even if the executable file has no contents. As stated earlier, "success" is open to interpretation.

I like receiving an exit status of 126. This code actually tells me what’s wrong, unlike the more vague codes.

Exit status 127

Exit status 127 tells you that one of two things has happened: Either the command doesn’t exist, or the command isn’t in your path ($PATH). This code also appears if you attempt to execute a command that is in your current working directory without giving its path, because the current directory is normally not in $PATH. For example, the script above that you gave execute permission is in your current directory, but you attempt to run the script without specifying where it is:

$ blah.sh
-bash: blah.sh: command not found
$ echo $?
127

If this result occurs in a script, try adding the explicit path to the problem executable or script. Spelling also counts when specifying an executable or a script.
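
A script can also check for a command up front instead of discovering the 127 halfway through. A minimal sketch, assuming a hypothetical helper named require and a command name, no-such-command-xyz, that is assumed not to be installed:

```shell
#!/bin/bash
# Hypothetical helper: fail early with 127 when a required command is absent.
require() {
  # "command -v" succeeds only if the shell can resolve the name.
  command -v "$1" > /dev/null 2>&1 && return 0
  echo "required command not found: $1" >&2
  return 127
}

require sh && echo "sh is available"
require no-such-command-xyz || echo "require returned $?"   # require returned 127
```

Returning 127 here is a design choice that mirrors what the shell itself reports for a missing command.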

Exit status 128

Exit status 128 is the response received when an out-of-range exit code is used in programming. From my experience, exit status 128 is not possible to produce. I have tried multiple actions for an example, and I can’t make it happen. However, I can produce an exit status 128-adjacent code. If your exit code exceeds 255, the exit status returned is your exit code modulo 256; for example, 256 is subtracted from codes between 256 and 511. This result sounds odd and can actually create an incorrect exit status. Check the examples to see for yourself.

Using an exit code of 261, create an exit status of 5:

$ bash
$ exit 261
exit
$ echo $?
5

To produce an errant exit status of 0:

$ bash
$ exit 256
exit
$ echo $?
0

If you use 257 as the exit code, your exit status is 1, and so on. If the exit code is a negative number, the resulting exit status is 256 plus that number. So, if your exit code is -20, then the exit status is 236.

Troubling, isn’t it? The solution, to me, is to avoid using exit codes that are reserved and out-of-range. The proper range is 0-255.
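
The wrap-around can be checked without opening a nested shell by hand. A short sketch, assuming a hypothetical helper named seen_status that runs exit in a subshell and prints the status the parent sees:

```shell
#!/bin/bash
# Hypothetical helper: run "exit N" in a subshell and print the observed status.
seen_status() {
  rc=0
  ( exit "$1" ) || rc=$?
  echo "$rc"
}

# Only the low 8 bits of the code survive, so 256 wraps to 0 and 261 to 5.
for code in 0 255 256 257 261
do
  echo "exit $code -> $(seen_status "$code")"
done
```

The loop prints 0, 255, 0, 1 and 5 as the observed statuses, which matches the manual examples above.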

Exit status 130

If you’re running a program or script and press Ctrl-C to stop it, your exit status is 130. This status is easy to demonstrate. Issue ls -lR / and then immediately press Ctrl-C:

$ ls -lR /
**Lots of files scrolling by**
^C
$ echo $?
130

There’s not much else to say about this one. It’s not a very useful exit status, but here it is for your reference.
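
The number 130 is 128 plus 2, and 2 is the number of SIGINT, the signal that Ctrl-C sends. The kill builtin can translate such a status back into a signal name. A minimal sketch, assuming a hypothetical helper named explain_status:

```shell
#!/bin/bash
# Hypothetical helper: explain an exit status, decoding 128 + N as a signal.
explain_status() {
  # "kill -l STATUS" prints the signal name that corresponds to STATUS.
  if [ "$1" -gt 128 ] && name=$(kill -l "$1" 2> /dev/null)
  then
    echo "status $1: 128 + $(( $1 - 128 )), conventionally SIG$name"
  else
    echo "status $1: ordinary exit status"
  fi
}

explain_status 130   # status 130: 128 + 2, conventionally SIGINT
explain_status 1     # status 1: ordinary exit status
```

The word "conventionally" matters: a program is free to call exit 130 itself, so the status alone does not prove that a signal was involved.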

Exit status 255

This final reserved exit status is easy to produce but difficult to interpret. The documentation that I’ve found states that you receive exit status 255 if you use an exit code that’s out of the range 0-255.

I’ve found that this status can also be produced in other ways. Here’s one example:

$ ip
**Usage info for the ip command**
  
$ echo $?
255

Some independent authorities say that 255 is a general failure error code. I can neither confirm nor deny that statement. I only know that I can produce exit status 255 by running particular commands with no options.

Wrapping up

There you have it: An overview of the reserved exit status numbers, their meanings, and how to generate them. My personal advice is to always check permissions and paths for anything you run, especially in a script, instead of relying on exit status codes. My debug method is that when a command doesn’t work correctly in a script, I run the command individually in an interactive shell. This method works much better than trying fancy tactics with breaks and exits. I go this route because (most of the time) my errors are permissions related, so I’ve been trained to start there.

Have fun checking your statuses, and now it’s time for me to exit.
