Bug 1995661

Summary: Remote execution status is successful for any ansible based jobs even if the actual job execution has failed on the host in Satellite 6.10
Product: Red Hat Satellite Reporter: Sayan Das <saydas>
Component: Ansible - Configuration Management Assignee: Adam Ruzicka <aruzicka>
Status: CLOSED ERRATA QA Contact: Danny Synk <dsynk>
Severity: high Docs Contact:
Priority: medium    
Version: 6.10.0 CC: aruzicka, lstejska, lvrtelov, oezr, osousa, pcreech, zhunting
Target Milestone: 6.10.0 Keywords: Triaged
Target Release: Unused   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: tfm-rubygem-foreman_ansible_core-4.1.3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-16 14:13:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sayan Das 2021-08-19 15:14:21 UTC
Description of problem:

Remote execution status is successful for ansible based jobs even if the actual ansible-playbook execution has failed on the host 


Version-Release number of selected component (if applicable):

Satellite 6.10


How reproducible:

Always


Steps to Reproduce:
1. Build a Satellite 6.10 and register a host with the Satellite

2. Set up REX keys with that host

3. Run any "Ansible Command", "Ansible Playbook", or "Ansible roles" based job on the host


Actual results:


Job Result shows 100% success

But when I click on each host entry to see what happened, I can see failures such as the following:


PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
fatal: [host.example.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname host.example.com: Name or service not known", "unreachable": true}
PLAY RECAP *********************************************************************
host.example.com : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
Exit status: 0


Expected results:


Result should reflect the correct status
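
To make the expectation concrete, here is a minimal Python sketch of the check I would expect the result to agree with: a host whose PLAY RECAP line shows unreachable or failed counts above zero should not be reported as successful. This is only an illustration and is not how Satellite or foreman_ansible_core determines job status; the names `parse_recap` and `host_failed` are hypothetical, and the sample text is copied from the output above.

```python
import re

# Matches a recap line such as:
#   host.example.com : ok=0    changed=0    unreachable=1    failed=0 ...
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Return {host: {counter_name: int}} for the lines after 'PLAY RECAP'."""
    results = {}
    in_recap = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap or not line:
            continue
        match = RECAP_LINE.match(line)
        if match:
            pairs = (pair.split("=") for pair in match.group("counters").split())
            results[match.group("host")] = {name: int(value) for name, value in pairs}
    return results


def host_failed(counters):
    """Hypothetical rule: any unreachable or failed count means the host failed."""
    return counters.get("unreachable", 0) > 0 or counters.get("failed", 0) > 0


SAMPLE = """\
PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
PLAY RECAP *********************************************************************
host.example.com : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
Exit status: 0
"""

if __name__ == "__main__":
    for host, counters in parse_recap(SAMPLE).items():
        print(host, "failed" if host_failed(counters) else "ok")
```

With the output from this report, the recap shows unreachable=1 for the host, so under this rule the host counts as failed even though the reported exit status is 0.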


Additional info:


The behavior is not observed when using non-ansible type jobs like the "Commands"  category.

Comment 4 Sayan Das 2021-08-19 15:47:39 UTC
I tested again on Sat 6.7 and cannot reproduce this behavior there.

Comment 6 Adam Ruzicka 2021-08-20 06:40:16 UTC
Created redmine issue https://projects.theforeman.org/issues/33313 from this bug

Comment 7 Bryan Kearney 2021-09-06 16:05:27 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/33313 has been resolved.

Comment 10 Danny Synk 2021-10-14 14:22:32 UTC
Verified on Satellite 6.10, snap 23 (tfm-rubygem-foreman_ansible_core-4.2.0-1.el7sat.noarch).

Steps to Test:
1. Register a host to Satellite 6.10. Do not copy the foreman-proxy public key to the host.
2. Attempt to execute an Ansible remote job against the host.

Expected Results:
The host is reported as unreachable, the exit status for the job is not zero, and the job status is reported as failed on the Monitor > Jobs page of the webUI.

Actual Results:
The host is reported as unreachable and the exit status for the job is not zero:

~~~
proxy_output:
  result:
  - output_type: stdout
    output: "[WARNING]: Callback disabled by environment. Disabling the Foreman callback\r\nplugin.\n"
    timestamp: 1634220424.3178253
  - output_type: stdout
    output: "\n"
    timestamp: 1634220424.317921
  - output_type: stdout
    output: "\r\nPLAY [all] *********************************************************************\n"
    timestamp: 1634220424.3179832
  - output_type: stdout
    output: "\r\nTASK [Gathering Facts] *********************************************************\n"
    timestamp: 1634220424.3180554
  - output_type: stdout
    output: "\n"
    timestamp: 1634220424.3181298
  - output_type: stdout
    output: 'fatal: [host.example.com]: UNREACHABLE! => {"changed":
      false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added
      ''host.example.com,0.0.0.0'' (ECDSA) to the list of known
      hosts.\r\nPermission denied (publickey,gssapi-keyex,gssapi-with-mic,password).",
      "unreachable": true}

'
    timestamp: 1634220424.3182552
  - output_type: stdout
    output: |-
      PLAY RECAP *********************************************************************
      host.example.com : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
    timestamp: 1634220424.3184118
  exit_status: 1
~~~

The job status is also reported as failed on the Monitor > Jobs page of the webUI.

Comment 13 errata-xmlrpc 2021-11-16 14:13:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.10 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4702