Bug 2068527
| Summary: | Unable to plan ansible tasks with large inventories and concurrency control - undefined method `wait' for nil:NilClass (NoMethodError) |
|---|---|
| Product: | Red Hat Satellite |
| Component: | Ansible - Configuration Management |
| Status: | CLOSED ERRATA |
| Severity: | urgent |
| Priority: | urgent |
| Version: | 6.9.7 |
| Target Milestone: | 6.15.0 |
| Target Release: | Unused |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | rubygem-dynflow-1.8.0, rubygem-foreman-tasks-8.2.0, rubygem-foreman_remote_execution-11.0.0 |
| Reporter: | Jan Jansky <jjansky> |
| Assignee: | Adam Ruzicka <aruzicka> |
| QA Contact: | Satellite QE Team <sat-qe-bz-list> |
| CC: | ahumbe, aruzicka, bshahu, dhjoshi, ekohlvan, hakon.gislason, hasingh, ikaur, jbhatia, jkastnin, jsenkyri, juwatts, lstejska, lvrtelov, mkalyat, nmohite, oezr, osousa, patalber, pbadguja, pcreech, pdwyer, peter.vreman, pmendezh, saydas, shwsingh, sigbjorn.lie, smeyer, torkil, wpinheir |
| Keywords: | Performance, PrioBumpGSS, Triaged |
| Doc Type: | If docs needed, set a value |
| : | 2250346 |
| Last Closed: | 2024-04-23 17:11:05 UTC |
| Type: | Bug |
Description
Jan Jansky
2022-03-25 15:11:07 UTC
The JSON parse error should be fixed by [1]. I would expect 6.11 to be free from this bug, although I can't say for certain as I have not managed to reproduce it myself.

[1] - https://github.com/ansible/ansible-runner/pull/638

*** Bug 2135790 has been marked as a duplicate of this bug. ***

In one Satellite 6.11 case, the user has 3 sidekiq workers, each with the exact same default configuration, which causes the same issue. After following the solution, manual changes made inside the /etc/foreman/dynflow/worker-*.yml files get overwritten with every installer run (which is expected). To work around this, running the installer with the option --foreman-dynflow-manage-services false unmanages the dynflow configuration and allows manual modifications of the dynflow workers to remain intact. However, this is not an acceptable solution for the end user. The user would rather be able to control the remote_execution queue assignment for individual workers by some other method, instead of manually updating the worker configurations.

Copying from BZ https://bugzilla.redhat.com/show_bug.cgi?id=2135790, which was closed as a duplicate.

The issue on Satellite 6.11 is not only the undefined method / NilClass error. Even worse, tasks of the job are left in the Pending state and cannot be cleaned up by a regular user in the UI.

Job execution with concurrency set breaks with the NilClass error at 100 servers out of 116: if the batch size is >= 100, all servers beyond the first 100 are left in a pending task state and the job hangs/fails.

How reproducible: Always.
Steps to Reproduce:

Job details for reproducer:
- Type: SSH Command
- Query: '*' (make sure there are more than 100 servers)
- Command: 'date'
- Concurrency: 30

Actual results:

Job on ~120 servers with concurrency 30 and the following Capsule batch sizes:
- 50: ok
- 99: ok
- 100: FAIL - Pending tasks without progress
- 101: FAIL - Pending tasks without progress
- 150: FAIL - Pending tasks without progress

Expected results: It should not fail.

Additional info:

The issue is related to 'Concurrency level limited to: 30 tasks at a time'. If the concurrency level is not set, the job successfully processes all ~116 servers. If the concurrency is set to 30 tasks, it again hangs/fails at ~100.

Propagating information from Jira back here.

Bulk setting Target Milestone = 6.15.0 where sat-6.15.0+ is set.

*** Bug 2250346 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.15.0 release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:2010