Description of problem:
When a RHOSP overcloud deploy or update runs with Ceph nodes present, the ceph-ansible part takes a very long time and appears to be paused for much of it.

Business impact:
Overcloud deploy takes longer to complete (hours slower).

Version-Release number of selected component (if applicable):
3.3

How reproducible:
overcloud deploy

Actual results:
Sample run:
--------------------------------------------
2020-03-27 01:19:21,595 p=31193 u=mistral | INSTALLER STATUS ***************************************************************
2020-03-27 01:19:21,598 p=31193 u=mistral | Install Ceph Monitor : Complete (0:03:25)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph Manager : Complete (0:02:03)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph OSD : Complete (0:02:58)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph RGW : Complete (0:01:52)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph Client : Complete (1:14:17)
2020-03-27 01:19:21,600 p=31193 u=mistral | Friday 27 March 2020 01:19:21 +0100 (0:00:00.176) 2:29:00.288 **********
2020-03-27 01:19:21,600 p=31193 u=mistral | ===============================================================================
--------------------------------------------

Expected results:
Understand whether this depends on other components. Please analyze the provided data and suggest linked upstream/GitHub bugs, whether the problem is in ceph-ansible, whether it is already fixed, and whether we can ask for a hotfix.

Some tunings were applied (e.g. increased FORK to 100), but they did not fix the problem.
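
To make the timing breakdown in the sample run easier to compare, here is a minimal Python sketch that parses the INSTALLER STATUS lines and reports which stage dominates. The regular expression, the function name `parse_installer_status`, and the `SAMPLE_LOG` constant are illustrative assumptions written for this report, not part of ceph-ansible or TripleO; the log text is copied from the sample run above.

```python
import re
from datetime import timedelta

# Stage lines copied from the sample run in this report.
SAMPLE_LOG = """\
2020-03-27 01:19:21,598 p=31193 u=mistral | Install Ceph Monitor : Complete (0:03:25)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph Manager : Complete (0:02:03)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph OSD : Complete (0:02:58)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph RGW : Complete (0:01:52)
2020-03-27 01:19:21,599 p=31193 u=mistral | Install Ceph Client : Complete (1:14:17)
"""

# Assumed line shape: "... | <stage name> : Complete (H:MM:SS)".
STATUS_RE = re.compile(
    r"\|\s*(Install [^:]+?)\s*:\s*Complete \((\d+):(\d{2}):(\d{2})\)"
)


def parse_installer_status(text):
    """Return a dict mapping each stage name to its duration."""
    durations = {}
    for line in text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            name, hours, minutes, seconds = match.groups()
            durations[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return durations


durations = parse_installer_status(SAMPLE_LOG)
slowest = max(durations, key=durations.get)
total = sum(durations.values(), timedelta())

# Print stages from slowest to fastest with their share of the summed time.
for name, spent in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22} {spent}  ({spent / total:.0%} of summed stage time)")
print(f"Slowest stage: {slowest}")
```

On the sample run, the five stages sum to 1:24:35, of which Install Ceph Client alone accounts for 1:14:17. The cumulative timer on the final log line reads 2:29:00.288, so the remaining time appears to be spent outside these five stages.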
Is https://github.com/ceph/ceph-ansible/pull/5213 the relevant fix? If so, can this bug move forward?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2488