This bug was introduced when we fixed https://bugzilla.redhat.com/show_bug.cgi?id=1583851. Before that change there was a default timeout of 300 seconds (5 minutes). By default, when you don't set the execution_ttl, it is stored as an empty string. We convert that empty string to an integer to get the timeout, which results in a timeout of 0 seconds. When the Scheduler wakes up, it looks for stuck jobs and calls timeout! on them. If the timeout is 0 seconds, the job gets terminated right away. As a workaround, you can always set the execution_ttl to 600 when adding a new method or updating an existing method. We are working on a fix for this issue so that an empty string is treated as the default_timeout, which is 600 seconds.
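To illustrate the failure mode described above, here is a minimal Ruby sketch. It is not the ManageIQ code: the constant and the three method names are hypothetical, and it assumes execution_ttl is expressed in the same unit as the default. It only shows why converting an empty string with to_i makes a job look stuck immediately, and how falling back to a default avoids that.

```ruby
# Hypothetical names throughout; this is not the ManageIQ implementation.
DEFAULT_TIMEOUT = 600 # the default_timeout value mentioned in the comment above

# Naive conversion: an empty (or nil) execution_ttl becomes 0.
def naive_timeout(execution_ttl)
  execution_ttl.to_i
end

# Guarded conversion: a blank execution_ttl falls back to the default.
def guarded_timeout(execution_ttl)
  ttl = execution_ttl.to_s.strip
  ttl.empty? ? DEFAULT_TIMEOUT : ttl.to_i
end

# Assumed rule: a job counts as stuck once it has run for at least its timeout.
def stuck?(elapsed, timeout)
  elapsed >= timeout
end

puts naive_timeout("")               # prints 0
puts stuck?(1, naive_timeout(""))    # prints true: terminated right away
puts stuck?(1, guarded_timeout(""))  # prints false: default applies
```

With the guarded version, an explicit value such as "900" is still honoured; only blank input is replaced by the default.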
https://github.com/ManageIQ/manageiq/pull/17715
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/40abd08a42b1d4aacf4ba027df9d2ec997708e4b

commit 40abd08a42b1d4aacf4ba027df9d2ec997708e4b
Author:     Madhu Kanoor <mkanoor>
AuthorDate: Mon Jul 16 16:42:59 2018 -0400
Commit:     Madhu Kanoor <mkanoor>
CommitDate: Mon Jul 16 16:42:59 2018 -0400

    Allow for empty strings in the execution_ttl field

    https://bugzilla.redhat.com/show_bug.cgi?id=1601538

    An empty string yields a 0 timeout value causing jobs to be terminated
    right away.

 app/models/manageiq/providers/embedded_ansible/automation_manager/playbook_runner.rb | 2 +-
 spec/models/manageiq/providers/embedded_ansible/automation_manager/playbook_runner_spec.rb | 8 +
 2 files changed, 9 insertions(+), 1 deletion(-)
You would need an Ansible playbook that can sleep for a set amount of time. There is a sample playbook here: https://github.com/mkanoor/playbook/blob/master/pkg_info.yaml It takes 3 parameters (user, sleep, and pkg), and you can set sleep to different values in seconds to see the timeout behaviour. This problem manifests itself when the scheduler looks for stuck jobs that are not responding and tries to terminate them.
Verification Steps:
1. Added repository --> https://github.com/mkanoor/playbook/blob/master/pkg_info.yaml
2. Created a Generic service for Ansible.
3. Tried different sleep times, i.e. 200, 450 and 750 (values in seconds).
4. The service finished with an error for the value 750.
5. The service did not fail for values within 600 seconds.

Verified Version: 5.10.0.23.20181106165157_92dd189