Bug 1750370
| Summary: | Ordered Ansible Service is getting timeout and not restarting after 3600 seconds. | | |
|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Satyajit Bulage <sbulage> |
| Component: | Automate | Assignee: | Lucy Fu <lufu> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Satyajit Bulage <sbulage> |
| Severity: | high | Docs Contact: | Red Hat CloudForms Documentation <cloudforms-docs> |
| Priority: | high | | |
| Version: | 5.11.0 | CC: | bmidwood, dmetzger, gmccullo, lavenel, mkanoor, mshriver, obarenbo, simaishi, tfitzger |
| Target Milestone: | GA | Keywords: | Regression |
| Target Release: | 5.11.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 5.11.0.24 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-12-13 14:54:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | Bug |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | CFME Core | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1660803 | | |
Description
Satyajit Bulage
2019-09-09 13:20:03 UTC
Billy is going to look into this.

We looked at the reproducer environment and max_ttl is not specified. max_ttl (maximum time to live) should be specified when playbooks are expected to take some time. It is used to calculate a retry interval for the playbook service state machine. The retry interval defaults to 1 minute when this value is not specified.

Worked on latest 5.10 but failing on 5.11.0.23. The job ran for about an hour, and then after 58 retries this error was encountered: "job aborting, ansible playbook has been running longer than timeout". The error comes from ansible_runner_workflow.rb line 61.

Lucy is looking into this now.

New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/861cc9990fac39fd0edf3841c9accde4710fcf35

commit 861cc9990fac39fd0edf3841c9accde4710fcf35
Author: Lucy Fu <lufu>
AuthorDate: Tue Sep 10 10:00:17 2019 -0400
Commit: Lucy Fu <lufu>
CommitDate: Tue Sep 10 10:00:17 2019 -0400

Set default timeout to 100 minutes for playbook service and playbook method.

https://bugzilla.redhat.com/show_bug.cgi?id=1750370

app/models/manageiq/providers/embedded_ansible/automation_manager/configuration_script.rb | 5 +-
app/models/manageiq/providers/embedded_ansible/automation_manager/playbook_runner.rb | 2 +-
spec/models/manageiq/providers/embedded_ansible/automation_manager/configuration_script_spec.rb | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)

New commit detected on ManageIQ/manageiq/ivanchuk:
https://github.com/ManageIQ/manageiq/commit/55bc9be7150764c2fa69037d781308bedc7c8327

commit 55bc9be7150764c2fa69037d781308bedc7c8327
Author: Jason Frey <jfrey>
AuthorDate: Wed Sep 11 12:25:56 2019 -0400
Commit: Jason Frey <jfrey>
CommitDate: Wed Sep 11 12:25:56 2019 -0400

Merge pull request #19279 from lfu/ansible_runner_timeout_1750370

Set default playbook service timeout to 100 minutes

(cherry picked from commit e1e730fb136f2702a02b04f815eb93f94934a426)

https://bugzilla.redhat.com/show_bug.cgi?id=1750370

app/models/manageiq/providers/embedded_ansible/automation_manager/configuration_script.rb | 5 +-
app/models/manageiq/providers/embedded_ansible/automation_manager/playbook_runner.rb | 2 +-
spec/models/manageiq/providers/embedded_ansible/automation_manager/configuration_script_spec.rb | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)

A playbook with a 60-minute sleep executed without any failure. No errors occurred.

Verified Version: 5.11.0.24.20190911182429_55bc9be
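
For readers following the thread, the timeout behavior discussed above can be illustrated with a minimal Ruby sketch. This is not the code from ansible_runner_workflow.rb or from the commits; the method names `timed_out?` and `check_job` and the constant `DEFAULT_TIMEOUT_MINUTES` are hypothetical, and the 60-minute limit is only inferred from the "3600 seconds" in the bug summary. The abort message and the 100-minute default are taken from the comments and the commit message.

```ruby
# Hypothetical sketch of an elapsed-time timeout check.
# Not the ManageIQ implementation; all names here are illustrative.

# Default taken from the fix's commit message (100 minutes).
DEFAULT_TIMEOUT_MINUTES = 100

# Returns true when the job has been running longer than its timeout.
# started_at and now are Time objects; their difference is in seconds.
def timed_out?(started_at, now, timeout_minutes = DEFAULT_TIMEOUT_MINUTES)
  (now - started_at) > timeout_minutes * 60
end

# Returns the abort message quoted in this bug when the limit is exceeded,
# otherwise a placeholder "running" status (an assumption for illustration).
def check_job(started_at, now, timeout_minutes = DEFAULT_TIMEOUT_MINUTES)
  if timed_out?(started_at, now, timeout_minutes)
    "job aborting, ansible playbook has been running longer than timeout"
  else
    "running"
  end
end

started = Time.utc(2019, 9, 9, 12, 0, 0)
after_61_minutes = started + 61 * 60

# With an assumed 60-minute (3600-second) limit, a playbook that has run
# for 61 minutes is aborted.
puts check_job(started, after_61_minutes, 60)

# With the 100-minute default, the same playbook is still allowed to run.
puts check_job(started, after_61_minutes)
```

Under these assumptions, a playbook that sleeps for 60 minutes exceeds a 3600-second limit once overhead is included but stays well inside a 100-minute one, which is consistent with the verification result reported above.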