Ansible error handling - Fixing errors and finding optimal solutions to problems

Ansible Error handling: In this lesson, you will learn the different ways to handle failures in Ansible tasks by using the ignore_errors, force_handlers, block, rescue, and always directives in a playbook.

Contents

How Can You Handle Error In Ansible
Specifying Task Failure Conditions
Managing Changed Status
Using Ansible Blocks
Using Ansible Blocks With Rescue and Always Statement

How Can Error Handling Be Done In Ansible

Ansible plays and tasks are executed in the order in which they are defined in a playbook. By default, if a task fails on a host, the remaining tasks are not executed on that host. This behavior can be changed with the keyword “ignore_errors: true”.

This keyword can be added to a play or to a task. If it is added to a play, errors in all the tasks of that play are ignored. If it is added to a task, only errors in that task are ignored.

We learnt about handlers in one of our previous lessons, so what about handlers? Failures affect them too, and that behavior is controlled with the keyword “force_handlers: yes”.

By default, if a task fails on a host, handlers that were already notified will not be executed on that host. With “force_handlers: yes”, notified handlers run even if a later task in the play fails.

As usual, let’s understand better with examples.

If we were to install the httpd package and the autofs package using a playbook, the name arguments would be “httpd” and “autofs”.

Now, we are going to write a playbook and intentionally introduce an error by setting the name argument for autofs to “autos”.

1. Create a playbook

[lisa@drdev1 ~]$ vim playbook7.yml

- name: Install basic package
  hosts: hqdev1.tekneed.com
  tasks:
     - name: install autofs
       yum:
         name: autos
         state: present
       ignore_errors: true

     - name: install httpd
       yum:
         name: httpd
         state: present

2. Run the playbook

[lisa@drdev1 ~]$ ansible-playbook playbook7.yml

PLAY [Install basic package] ***********************************************************
........

Running this playbook would normally end in a task failure, and the second task would not run because the first one failed. Because we used the “ignore_errors” keyword, the failure of the first task was ignored and the play continued.

Also, remember that you can choose to use the “ignore_errors” keyword at a play level or at a task level. In our case, it was used at the task level.
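For comparison, here is a minimal sketch of the same keyword set at the play level, so that a failure in any task of the play is ignored. The host name and the misspelled package name are reused from the example above; the play name is only illustrative.

```yaml
- name: Install basic packages, ignoring task failures
  hosts: hqdev1.tekneed.com
  ignore_errors: true        # applies to every task in this play
  tasks:
    - name: install autofs (package name is intentionally wrong)
      yum:
        name: autos
        state: present

    - name: install httpd
      yum:
        name: httpd
        state: present
```

Setting the keyword on the play hides every task failure in that play, so the task-level form is usually the safer choice.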

Next, let’s see an example of how the force_handlers keyword is used in a play with handlers.

1. create a playbook

- name: restarting httpd using handlers
  hosts: hqdev1.tekneed.com
  force_handlers: yes
  tasks:
    - name: restart httpd
      service:
        name: httpd
        state: restarted
      notify: restart httpd

  handlers:
    - name: restart httpd
      service:
        name: httpd
        state: restarted

The force_handlers directive makes notified handlers run even if a later task in the play fails. It does not run handlers that were never notified. Note that inside a playbook the keyword is set at the play level.

2. Run the playbook

[lisa@drdev1 ~]$ ansible-playbook playbook3.yml

PLAY [restarting httpd using handlers] ********************************
......

Because the force_handlers directive is set to yes, the notified handler runs even if another task in the play fails.
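The effect of force_handlers is easiest to see when a task fails after a handler has been notified. The sketch below is an assumption-laden illustration, not part of the lesson's playbooks: /bin/true and /bin/false are placeholder commands standing in for a task that changes something and a task that fails.

```yaml
- name: force_handlers sketch
  hosts: hqdev1.tekneed.com
  force_handlers: true
  tasks:
    # The command module reports "changed", so the handler is notified.
    - name: a task that reports changed and notifies the handler
      command: /bin/true
      notify: restart httpd

    # This task fails after the notification has been recorded.
    - name: a later task that fails
      command: /bin/false

  handlers:
    - name: restart httpd
      service:
        name: httpd
        state: restarted
```

Without force_handlers, the failure of the second task would prevent the notified handler from running on that host.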

Specifying Task Failure Conditions

Ansible may run a task or command successfully even though the result is not what the user wants. In such cases you can specify the condition under which a task fails. In other words, you are at liberty to define what a failure is.

Let’s see how this can be done with the “failed_when” directive. As its name suggests, the task is marked as failed when the given condition is true.

Let’s use the playbook below as an example,

[lisa@drdev1 ~]$ vim playbook5.yml

- name: Web page fetcher
  hosts: hqdev1.tekneed.com
  tasks:
    - name: connect to website
      uri:
        url: https://tekneed.com
        return_content: true
      register: output

    - name: verify content
      debug:
        msg: "verifying content"
      failed_when:
        - '"this content" not in output.content'
        - '"other content" not in output.content'

This is what this playbook will do. The uri module will interact with the web server and fetch the page https://tekneed.com.
With the return_content argument set to true, the body of the response is returned as content, and the result is captured in the variable output with the register directive.

For the second task, the debug module prints the message “verifying content”. Because the conditions under “failed_when” are given as a list, they are joined with a logical and: the task fails only when both “this content” and “other content” are missing from the captured output.

Run the playbook

[lisa@drdev1 ~]$ ansible-playbook playbook5.yml

PLAY [Web page fetcher] ************************************************************
.......
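A list under failed_when is combined with a logical and, so the task above fails only when both strings are missing. If the intent is to fail when either string is missing, the conditions can be written as a single expression with or. The following is a sketch of that variant, reusing the host, URL, and strings from the example above.

```yaml
- name: Web page fetcher
  hosts: hqdev1.tekneed.com
  tasks:
    - name: connect to website
      uri:
        url: https://tekneed.com
        return_content: true
      register: output

    - name: verify content (fail if either string is missing)
      debug:
        msg: "verifying content"
      # A single expression with "or" instead of a list (which means "and")
      failed_when: >
        ("this content" not in output.content) or
        ("other content" not in output.content)
```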

Alternatively, the “fail” module can be used to fail a play explicitly. This module is most useful together with the “when” keyword, which specifies the exact failure condition.

Let’s use the same example as above, but this time with the “fail” module and the “when” keyword.

[lisa@drdev1 ~]$ vim playbook5.yml

- name: Web page fetcher
  hosts: hqdev1.tekneed.com
  tasks:
    - name: connect to website
      uri:
        url: https://tekneed.com
        return_content: true
      register: output

    - name: verify content
      fail:
        msg: "verifying content"
      when:
        - '"this content" not in output.content'
        - '"other content" not in output.content'

Run the playbook

[lisa@drdev1 ~]$ ansible-playbook playbook5.yml

PLAY [Web page fetcher] ************************************************************
.......

Managing Changed Status

Managing the changed status helps avoid unexpected results while running a playbook. Some tasks report a changed status even though nothing has really changed, for example when a command only retrieves information.

If you do not want such a task to report a changed status, use the “changed_when: false” keyword. The task will then only report an “ok” or “failed” status, never “changed”.

An example of such a task can be seen below.

- name: copy in nginx conf
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf

- name: validate nginx conf
  command: nginx -t
  changed_when: false

One can also specify a condition under which a task reports a changed status. An example of such a task is seen in the playbook below.

- name: Web page fetcher
  hosts: hqdev1.tekneed.com
  tasks:
    - name: connect to website
      uri:
        url: https://tekneed.com
        return_content: true
      register: output

    - name: verify content
      debug:
        msg: "verifying content"
      # uri returns the page body in output.content, not output.stdout
      changed_when: "'success' in output.content"

Using Ansible Blocks

Blocks are used to group related tasks and are very useful with a conditional statement.

If tasks are grouped in a block with a condition and the condition is true, all the tasks in the block are executed. Note that block is a directive in Ansible and not a module, so the block directive and the when directive sit at the same indentation level.

Let’s see how blocks can be used with examples.

create a playbook

[lisa@drdev1 ~]$ vim playbook8.yml

- name: setting up httpd
  hosts: localhost
  tasks:
    - name: Install start and enable httpd
      block:
      - name: install httpd
        yum:
          name: httpd
          state: present
      - name: start and enable httpd
        service:
          name: httpd
          state: started
          enabled: true
      when: ansible_distribution == "RedHat"   # RHEL reports this fact as RedHat, without a space

This playbook groups two tasks in a block named “Install start and enable httpd”. The first task is “install httpd” and the second is “start and enable httpd”. These tasks only execute if the condition is true, that is, if the managed host runs Red Hat Enterprise Linux.
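The string compared in the when condition has to match the gathered fact exactly. On Red Hat Enterprise Linux the fact is normally reported as RedHat, with no space. A small sketch for checking the value on a given host before relying on it (the play and task names are only illustrative):

```yaml
- name: show the distribution fact
  hosts: localhost
  tasks:
    # Facts are gathered by default, so ansible_distribution is available here.
    - name: print ansible_distribution
      debug:
        var: ansible_distribution
```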

Run the playbook

[lisa@drdev1 ~]$ ansible-playbook playbook8.yml

PLAY [setting up httpd] ********************************************************
.......

With this kind of playbook, if a task in the block fails, the remaining tasks are not executed. Let’s see how block can be combined with rescue and always to handle that case.

Using Ansible block With rescue and always Statement

Apart from grouping tasks, blocks can also be used for error handling with the rescue keyword.

If a task defined in the block section fails, the tasks defined in the rescue section are executed. This is similar to the ignore_errors keyword in that the play can continue after the failure.

There is also an always section. Tasks in this section run whether or not the tasks in the block section fail.

Let’s understand better with examples.

create a playbook.

[lisa@drdev1 ~]$ vim playbook9.yml

- name: setting up httpd
  hosts: localhost
  tasks:
    - name: Install the latest httpd and restart
      block:
        - name: install httpd
          yum:
            name: htt
            state: latest
      rescue:
        - name: restart httpd
          service:
            name: httpd
            state: started
        - name: Install autofs
          yum:
            name: autofs
      always:
        - name: restart autofs
          service:
            name: autofs
            state: started

This playbook defines four tasks in a block named “Install the latest httpd and restart”: one in the block section, two in the rescue section, and one in the always section.
The task in the block section will fail because the name of the package is incorrect. The second and third tasks will then run because they are in the rescue section, and the fourth task will run because it is in the always section.

Note that you can have as many tasks as you want in the block, rescue, and always sections.

Run the playbook

[lisa@drdev1 ~]$ ansible-playbook playbook9.yml

PLAY [setting up httpd] *********************************************************
......
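Inside a rescue section, Ansible exposes the variables ansible_failed_task and ansible_failed_result, which describe the task that triggered the rescue. The sketch below reuses the intentionally wrong package name from the example above; the play name, task names, and messages are illustrative assumptions.

```yaml
- name: report what failed inside a block
  hosts: localhost
  tasks:
    - name: try to install a package
      block:
        - name: install httpd (package name is intentionally wrong)
          yum:
            name: htt
            state: latest
      rescue:
        # Runs only because the task in the block section failed.
        - name: show the failed task and its error
          debug:
            msg: "Task '{{ ansible_failed_task.name }}' failed: {{ ansible_failed_result.msg | default('no message') }}"
      always:
        - name: runs whether or not the block failed
          debug:
            msg: "block finished"
```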

Class Activity

Create a playbook that contains one play with tasks using the block, rescue, and always statements.


Topics

Error Handling In Playbooks
- Ignoring Failed Commands
- Resetting Unreachable Hosts
- Handlers and Failure
- Controlling What Defines Failure
- Overriding The Changed Result
- Aborting the play
- Using blocks

Ansible normally has defaults that make sure to check the return codes of commands and modules and
it fails fast – forcing an error to be dealt with unless you decide otherwise.

Sometimes a command that returns a non-zero code isn’t an error. Sometimes a command might not always
need to report that it ‘changed’ the remote system. This section describes how to change
the default behavior of Ansible for certain tasks so that output and error handling behavior is
as desired.

Ignoring Failed Commands

Generally playbooks will stop executing any more steps on a host that has a task fail.
Sometimes, though, you want to continue on. To do so, write a task that looks like this:

- name: this will not be counted as a failure
  command: /bin/false
  ignore_errors: yes

Note that this only governs the failure status returned by the particular task,
so an undefined variable or a syntax error will still raise an error that users will need to address.
It also does not prevent failures caused by connection or execution issues.
The feature only works when the task is able to run and returns a value of ‘failed’.
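An ignored failure is still recorded in the task result, so a later task can react to it. A minimal sketch, assuming a registered variable named cmd_result (a name chosen here for illustration):

```yaml
- name: run a command that may fail
  command: /bin/false
  register: cmd_result
  ignore_errors: yes

# The "failed" test is true for the registered result even though the error was ignored.
- name: react to the ignored failure
  debug:
    msg: "the command failed with return code {{ cmd_result.rc }}"
  when: cmd_result is failed
```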

Resetting Unreachable Hosts

New in version 2.2.

Connection failures set hosts as ‘UNREACHABLE’, which will remove them from the list of active hosts for the run.
To recover from these issues you can use meta: clear_host_errors to have all currently flagged hosts reactivated,
so subsequent tasks can try to use them again.

Handlers and Failure

When a task fails on a host, handlers which were previously notified
will not be run on that host. This can lead to cases where an unrelated failure
can leave a host in an unexpected state. For example, a task could update
a configuration file and notify a handler to restart some service. If a
task later on in the same play fails, the service will not be restarted despite
the configuration change.

You can change this behavior with the --force-handlers command-line option,
or by including force_handlers: True in a play, or force_handlers = True
in ansible.cfg. When handlers are forced, they will run when notified even
if a task fails on that host. (Note that certain errors could still prevent
the handler from running, such as a host becoming unreachable.)

Controlling What Defines Failure

Ansible lets you define what “failure” means in each task using the failed_when conditional. As with all conditionals in Ansible, lists of multiple failed_when conditions are joined with an implicit and, meaning the task only fails when all conditions are met. If you want to trigger a failure when any of the conditions is met, you must define the conditions in a string with an explicit or operator.

You may check for failure by searching for a word or phrase in the output of a command:

- name: Fail task when the command error output prints FAILED
  command: /usr/bin/example-command -x -y -z
  register: command_result
  failed_when: "'FAILED' in command_result.stderr"

or based on the return code:

- name: Fail task when both files are identical
  raw: diff foo/file1 bar/file2
  register: diff_cmd
  failed_when: diff_cmd.rc == 0 or diff_cmd.rc >= 2

In previous versions of Ansible, this can still be accomplished as follows:

- name: this command prints FAILED when it fails
  command: /usr/bin/example-command -x -y -z
  register: command_result
  ignore_errors: True

- name: fail the play if the previous command did not succeed
  fail:
    msg: "the command failed"
  when: "'FAILED' in command_result.stderr"

You can also combine multiple conditions for failure. This task will fail if both conditions are true:

- name: Check if a file exists in temp and fail task if it does
  command: ls /tmp/this_should_not_be_here
  register: result
  failed_when:
    - result.rc == 0
    - '"No such" not in result.stdout'

If you want the task to fail when only one condition is satisfied, change the failed_when definition to:

failed_when: result.rc == 0 or "No such" not in result.stdout

If you have too many conditions to fit neatly into one line, you can split it into a multi-line yaml value with >:

- name: example of many failed_when conditions with OR
  shell: "./myBinary"
  register: ret
  failed_when: >
    ("No such file or directory" in ret.stdout) or
    (ret.stderr != '') or
    (ret.rc == 10)

Overriding The Changed Result

When a shell/command or other module runs it will typically report
“changed” status based on whether it thinks it affected machine state.

Sometimes you will know, based on the return code
or output that it did not make any changes, and wish to override
the “changed” result such that it does not appear in report output or
does not cause handlers to fire:

tasks:

  - shell: /usr/bin/billybass --mode="take me to the river"
    register: bass_result
    changed_when: "bass_result.rc != 2"

  # this will never report 'changed' status
  - shell: wall 'beep'
    changed_when: False

You can also combine multiple conditions to override “changed” result:

- command: /bin/fake_command
  register: result
  ignore_errors: True
  changed_when:
    - '"ERROR" in result.stderr'
    - result.rc == 2

Aborting the play

Sometimes it’s desirable to abort the entire play on failure, not just skip remaining tasks for a host.

The any_errors_fatal option will end the play and prevent any subsequent plays from running. When an error is encountered, all hosts in the current batch are given the opportunity to finish the fatal task and then the execution of the play stops. any_errors_fatal can be set at the play or block level:

- hosts: somehosts
  any_errors_fatal: true
  roles:
    - myrole

- hosts: somehosts
  tasks:
    - block:
        - include_tasks: mytasks.yml
      any_errors_fatal: true

For finer-grained control, max_fail_percentage can be used to abort the run after a given percentage of hosts has failed.
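As a sketch of how max_fail_percentage might be set, the play below processes hosts in batches and aborts when too many hosts in a batch fail. The batch size and the percentage are arbitrary values chosen for illustration; the host group and role name are reused from the examples above.

```yaml
- hosts: somehosts
  serial: 10                 # run the play on 10 hosts at a time
  max_fail_percentage: 30    # abort if more than 30% of a batch fails
  roles:
    - myrole
```

The percentage has to be exceeded, not just reached, so with these values the play is aborted when more than three hosts in a batch of ten fail.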

Using blocks

Most of what you can apply to a single task (with the exception of loops) can be applied at the Blocks level, which also makes it much easier to set data or directives common to the tasks.
Blocks also introduce the ability to handle errors in a way similar to exceptions in most programming languages.
Blocks only deal with ‘failed’ status of a task. A bad task definition or an unreachable host are not ‘rescuable’ errors:

tasks:
- name: Handle the error
  block:
    - debug:
        msg: 'I execute normally'
    - name: i force a failure
      command: /bin/false
    - debug:
        msg: 'I never execute, due to the above task failing, :-('
  rescue:
    - debug:
        msg: 'I caught an error, can do stuff here to fix it, :-)'

This will ‘revert’ the failed status of the outer block task for the run and the play will continue as if it had succeeded.
See Blocks error handling for more examples.


Hello readers, in this blog we will look at how to handle errors in Ansible playbooks. There are multiple ways of doing this, and we will look at each of them and how to use them in a playbook.

By default, Ansible will check the return codes of commands and modules and it fails fast. This means that we will be forced to deal with these failures by default until we decide otherwise.

Let us start by looking at how to change the default behaviour of Ansible for certain tasks so that error handling behaves as we require.

How to Ignore Failed Commands?

Ansible playbooks stop the execution of any more tasks on a host which has encountered any failures. But in some cases, even after a failure, we might want to continue executing tasks on that host. So, we will have to write tasks that look like the one below:

- name: some task
  command: /bin/false
  ignore_errors: yes

Above all, we will have to keep in mind that this feature will only work when the task is able to run and return a value associated with failure. So, if we have any undefined variables or syntax errors, we will still get an error which we will have to address. Also, it will not prevent connection or execution issues.

How to Reset Unreachable Hosts?

Whenever an Ansible playbook encounters a connection failure with a host, it marks the host as ‘UNREACHABLE’ and removes it from the list of active hosts for the run. To reset this list, we can use meta: clear_host_errors to reactivate all the hosts associated with the play, so that subsequent tasks can try to use them again. We can use it as shown below:

- hosts: all
  tasks:
    - set_fact:
        was_accessible: "up"

    - meta: clear_host_errors

    - debug:
        msg: "Hello"

    - when:
        - was_accessible is defined
      debug:
        msg: "Hello again, I am up."

Running Handlers Despite Failures

When a task fails on a host, handlers that were previously notified will not run on that host. As a result, an unrelated failure can leave the host in an unexpected state.

To tackle this problem, we can use the following options:
1. Using the --force-handlers command line option
2. Including force_handlers: True in a play
3. Setting force_handlers=True in ansible.cfg configuration file.

- hosts: all
  force_handlers: true

When we force handlers to run, the handlers will run when notified even if a task has failed on the host.

How to Define Failures?

Ansible provides the failed_when conditional to let us define what “failure” means. Multiple failed_when conditions given as a list are joined with an implicit and, so the task is marked as failed only when all of the conditions are met. To register a failure when any one of the conditions is met, we can join them with the or operator.

- name: Web page fetcher
  hosts: all

  tasks:
    - name: Fetch webpage
      uri:
        url: https://somewebsite.com
        return_content: true
      register: output

    - name: Check Content
      debug:
        msg: "Checking content..."
      failed_when:
        - '"Some Content" not in output.content'
        - '"Some other content" not in output.content'

To fail when either string is missing, join the conditions with or instead:

failed_when: '"Some Content" not in output.content or "Some other content" not in output.content'

How to Abort a Play?

When there are failures in a play, it is sometimes essential to abort the entire play instead of just skipping the remaining tasks for a host. In this scenario, we can use the any_errors_fatal option. This option ends the play and prevents any subsequent plays from running. When a failure occurs, the hosts in the current batch are given the opportunity to finish the fatal task, and after that the execution of the play is stopped.

We can use this option in the way given below:

- hosts: somehosts
  any_errors_fatal: true

Conclusion

We have seen throughout this blog that there are multiple ways to handle errors in Ansible playbooks. We also saw that we can define what “failure” means in our playbooks and which actions we can take when we encounter one.

References

https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html
