Bug 2087738 - ovirt-engine is not able to kill hanged ansible-runner process after execution timeout passed
Summary: ovirt-engine is not able to kill hanged ansible-runner process after executio...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: ovirt-host-deploy-ansible
Version: 4.5.0.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.5.1
Target Release: ---
Assignee: Dana
QA Contact: Pavol Brilla
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-05-18 11:53 UTC by Dana
Modified: 2022-06-27 07:10 UTC (History)

Fixed In Version: ovirt-engine-4.5.1.1
Doc Type: Release Note
Doc Text:
The ansible-runner stop command is executed to kill the ansible-runner process after the execution timeout. If an error occurs during this operation, the error is only logged.
Clone Of:
Environment:
Last Closed: 2022-06-27 07:10:40 UTC
oVirt Team: Infra
Embargoed:
mperina: ovirt-4.5+




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | oVirt ovirt-engine pull 410 | 0 | None | Merged | don't throw an exception when canceling playbook if play execution was already completed | 2022-06-08 09:17:45 UTC
Red Hat Issue Tracker | RHV-46092 | 0 | None | None | None | 2022-05-18 11:58:38 UTC

Description Dana 2022-05-18 11:53:12 UTC
Description of problem:
ansible-runner stop <UUID> fails with the following message in the audit log:

Host stream2 installation failed. Failed to execute Ansible host-deploy role: Cannot run program "ansible-runner stop /home/delfassy/ovirt-engine-master-git6/var/lib/ovirt-engine/ansible-runner/8e6d8743-b770-414a-9ef6-e24d88c563b9": error=2, No such file or directory. Please check logs for more details: /home/delfassy/ovirt-engine-master-git6/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20220518143210-192.168.100.194-787ec059-ff73-4121-aa7d-cd4ec80eb473.log.

* The host deploy log doesn't contain any failure info.
* The log file for ansible-runner stop failures is set here: https://github.com/oVirt/ovirt-engine/pull/261/files#diff-b3c46eeff4f2d9d10c5c1c21d106107aad63afa7c2f8c11d365f76dc35e267ce
  That file doesn't contain any info either.


Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
1. Set the timeout to 2 minutes.
2. When the timeout is reached, the host deploy process executes ansible-runner stop <UUID>.

Actual results:
ansible-runner stop <UUID> fails


Expected results:
ansible-runner stop <UUID> process ends successfully, host deploy fails due to timeout.
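
The expected behavior (try to stop the runner once the timeout has passed, and only log an error if the stop fails) can be sketched roughly as follows. This is an illustrative Python sketch, not the ovirt-engine implementation, which is written in Java. The function name `stop_runner` and the parameters `command_prefix` and `timeout_s` are hypothetical; the only detail taken from this report is the command form `ansible-runner stop <private data dir>`.

```python
import logging
import subprocess

log = logging.getLogger("runner_stop_sketch")


def stop_runner(private_data_dir, command_prefix=("ansible-runner",), timeout_s=30):
    """Try to stop a runner; log and return False on any failure instead of raising.

    command_prefix and timeout_s are hypothetical knobs added for this sketch.
    """
    # Build the argument vector as a list so that the program name and each
    # argument are separate elements.
    argv = [*command_prefix, "stop", str(private_data_dir)]
    try:
        subprocess.run(argv, check=True, capture_output=True, timeout=timeout_s)
    except (OSError, subprocess.SubprocessError) as exc:
        # OSError covers a missing executable; SubprocessError covers a
        # non-zero exit status and a timeout of the stop command itself.
        log.error("Failed to stop ansible-runner for %s: %s", private_data_dir, exc)
        return False
    return True
```

With this shape, the caller can still report the deploy as failed because of the timeout, regardless of whether the stop attempt itself succeeded.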


Additional info:

Comment 1 Dana 2022-05-18 12:02:42 UTC
I checked the artifacts:
all artifacts exist (the last one includes the recap) and the stdout file is complete,
so the host deploy process has already ended and there is indeed nothing to stop.
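
A check like the one described in this comment could be sketched as below. This is a hedged Python illustration, not the code from the linked pull request. It assumes the usual ansible-runner artifact layout, where each event is written as a JSON file under a `job_events` directory and the recap corresponds to an event of type `playbook_on_stats`. The function name `play_has_recap` is hypothetical.

```python
import json
from pathlib import Path


def play_has_recap(artifacts_dir):
    """Return True if any job event in artifacts_dir is the play recap.

    Assumes events are stored as <artifacts_dir>/job_events/*.json and that
    the recap is recorded as a "playbook_on_stats" event.
    """
    events_dir = Path(artifacts_dir) / "job_events"
    if not events_dir.is_dir():
        return False
    for event_file in events_dir.glob("*.json"):
        try:
            event = json.loads(event_file.read_text())
        except (OSError, ValueError):
            # Skip unreadable or partially written event files.
            continue
        if isinstance(event, dict) and event.get("event") == "playbook_on_stats":
            return True
    return False
```

If the recap is present, the play has already finished, so a failing stop attempt can be treated as harmless rather than as a deploy error.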

Comment 2 Pavol Brilla 2022-06-23 10:09:51 UTC
After shortening the timeout to just 2 minutes, the deploy task reaches the timeout and the message now reflects it.

Software Version: 4.5.1.2-0.11.el8ev
Host test installation failed. Failed to execute Ansible host-deploy: Play execution has reached timeout. Please check logs for more details: /path/to/ansible/playbook.log.


Opening a new bug to improve the message further, as the current log doesn't contain more info.

