Bug 1952482 - [RHOSP 13 to 16.1 Upgrades] nova_hybrid_state task should check for the running container and image instead of checking file docker-container-hybrid_nova_compute.json
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z6
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Lukas Bezdicka
QA Contact: Jose Luis Franco
URL:
Whiteboard:
Duplicates: 1953234
Depends On:
Blocks:
Reported: 2021-04-22 11:09 UTC by Shravan Kumar Tiwari
Modified: 2022-08-02 13:53 UTC (History)
CC List: 8 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20210408163452.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-26 13:52:43 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 787698 0 None MERGED [ffwd] Rework checks for hybrid state containers 2021-04-30 09:24:15 UTC
Red Hat Issue Tracker OSP-3244 0 None None None 2022-08-02 13:53:09 UTC
Red Hat Product Errata RHBA-2021:2097 0 None None None 2021-05-26 13:53:08 UTC

Description Shravan Kumar Tiwari 2021-04-22 11:09:41 UTC
Description of problem:

When the first nova_hybrid_state upgrade run fails for some reason after /var/lib/tripleo-config/docker-container-hybrid_nova_compute.json has already been created, paunch apply is not executed on all the compute nodes of that role.

Re-running the hybrid state step then skips the paunch task, because the docker-container-hybrid_nova_compute.json file left behind by the failed run is already present.


2021-04-20 10:04:14,879 p=33118 u=mistral n=ansible | TASK [Apply paunch config for nova_compute] ************************************
2021-04-20 10:04:14,880 p=33118 u=mistral n=ansible | Tuesday 20 April 2021  10:04:14 +0200 (0:00:00.685)       0:02:13.952 *********
2021-04-20 10:04:14,931 p=33118 u=mistral n=ansible | skipping: [com018] => {"changed": false, "skip_reason": "Conditional result was False"}
2021-04-20 10:04:14,932 p=33118 u=mistral n=ansible | skipping: [com019] => {"changed": false, "skip_reason": "Conditional result was False"}
2021-04-20 10:04:14,953 p=33118 u=mistral n=ansible | skipping: [com026] => {"changed": false, "skip_reason": "Conditional result was False"}
2021-04-20 10:04:14,973 p=33118 u=mistral n=ansible | skipping: [com027] => {"changed": false, "skip_reason": "Conditional result was False"}


This caused problems later: the hybrid state steps completed without any error while skipping the paunch task, and tenant VMs then began failing volume detach/attach operations. VM creation and boot were affected as well.
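
The failure mode described above can be reproduced in miniature. The following is only a sketch, assuming the task was gated on the JSON file being absent (the function names are hypothetical and this is not the tripleo-heat-templates code): the first run writes the file and then fails, so the second run sees the file and skips the apply step.

```python
import os
import tempfile


def should_apply_file_gate(config_path):
    """Assumed file-based gate: apply only when the config file is absent."""
    return not os.path.exists(config_path)


def simulate_two_runs(config_path):
    """Return the gate decision for a failed first run and for the re-run."""
    decisions = []

    # First run: the file does not exist yet, so the gate is open.
    decisions.append(should_apply_file_gate(config_path))

    # The run writes the config file, then fails before the containers
    # are actually (re)created. The file stays on disk.
    with open(config_path, "w") as handle:
        handle.write("{}")

    # Re-run: the leftover file closes the gate, so the apply step is
    # skipped even though nothing was applied.
    decisions.append(should_apply_file_gate(config_path))
    return decisions


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "docker-container-hybrid_nova_compute.json")
        print(simulate_two_runs(path))  # [True, False]
```

The second decision is False only because of the leftover file, which matches the "Conditional result was False" skips in the log above.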

Actual results:



Expected results:

There should be a better way to handle this situation, or the stat check should be changed to test for the running container and image instead of the file.
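
A state-based check along these lines could look roughly like the sketch below. It is an illustration only, not the merged fix: the helper names and the sample image reference are hypothetical, and the match is a plain substring test on `docker ps` output, the same way a `grep` on that output would behave.

```python
import subprocess


def image_is_running(docker_ps_output, image):
    """Return True if any line of `docker ps` output mentions the image."""
    return any(image in line for line in docker_ps_output.splitlines())


def should_apply_paunch(docker_ps_output, image):
    """Apply the config whenever the expected image is not running."""
    return not image_is_running(docker_ps_output, image)


def docker_ps():
    """Capture live `docker ps` output (requires docker on the host)."""
    completed = subprocess.run(
        ["docker", "ps"], capture_output=True, text=True, check=False
    )
    return completed.stdout


if __name__ == "__main__":
    # Hypothetical sample output; a real host would use docker_ps().
    sample = (
        "CONTAINER ID  IMAGE                                COMMAND  NAMES\n"
        "abc123def456  registry.example/nova-compute:13.0   kolla    nova_compute\n"
    )
    wanted = "registry.example/nova-compute:16.1"
    print(should_apply_paunch(sample, wanted))
```

Because the decision depends on what is actually running, a leftover config file from a failed run no longer affects it.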

Additional info:

Comment 7 Lukas Bezdicka 2021-05-05 10:41:02 UTC
*** Bug 1953234 has been marked as a duplicate of this bug. ***

Comment 8 Jose Luis Franco 2021-05-05 13:52:21 UTC
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv6-geneve-HA-no-ceph-ovn-dvr/94/undercloud-0/home/stack/overcloud_upgrade_run-controller-0.log.gz

2021-05-04 06:40:49 | TASK [Check if iscsid is running with proper image] ****************************
2021-05-04 06:40:49 | Tuesday 04 May 2021  06:40:44 +0000 (0:00:00.496)       0:00:55.409 *********** 
2021-05-04 06:40:49 | changed: [compute-0] => {"changed": true, "cmd": "docker ps | grep \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-iscsid:16.1_20210430.1\"\n", "delta": "0:00:00.043251", "end": "2021-05-04 06:40:45.146092", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 06:40:45.102841", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 06:40:49 | changed: [compute-1] => {"changed": true, "cmd": "docker ps | grep \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-iscsid:16.1_20210430.1\"\n", "delta": "0:00:00.050798", "end": "2021-05-04 06:40:45.178955", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 06:40:45.128157", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

............................................................

2021-05-04 06:41:15 | TASK [Apply paunch config for iscsid] ******************************************
2021-05-04 06:41:15 | Tuesday 04 May 2021  06:40:50 +0000 (0:00:00.695)       0:01:00.962 *********** 
2021-05-04 06:41:15 | changed: [compute-0] => {"changed": true, "cmd": "paunch apply --file /var/lib/tripleo-config/docker-container-hybrid_iscsid.json --config-id hybrid_iscsid", "delta": "0:00:11.657153", "end": "2021-05-04 06:41:02.306083", "rc": 0, "start": "2021-05-04 06:40:50.648930", "stderr": "", "stderr_lines": [], "stdout": "Did not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--filter', 'label=config_id=hybrid_iscsid', '--format', '{{.Names}}']\" - retrying without config_id\nDid not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--format', '{{.Names}}']\"", "stdout_lines": ["Did not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--filter', 'label=config_id=hybrid_iscsid', '--format', '{{.Names}}']\" - retrying without config_id", "Did not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--format', '{{.Names}}']\""]}
2021-05-04 06:41:15 | 
2021-05-04 06:41:15 | changed: [compute-1] => {"changed": true, "cmd": "paunch apply --file /var/lib/tripleo-config/docker-container-hybrid_iscsid.json --config-id hybrid_iscsid", "delta": "0:00:11.910811", "end": "2021-05-04 06:41:02.564756", "rc": 0, "start": "2021-05-04 06:40:50.653945", "stderr": "", "stderr_lines": [], "stdout": "Did not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--filter', 'label=config_id=hybrid_iscsid', '--format', '{{.Names}}']\" - retrying without config_id\nDid not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--format', '{{.Names}}']\"", "stdout_lines": ["Did not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--filter', 'label=config_id=hybrid_iscsid', '--format', '{{.Names}}']\" - retrying without config_id", "Did not find container with \"['docker', 'ps', '-a', '--filter', 'label=container_name=iscsid', '--format', '{{.Names}}']\""]}
2021-05-04 06:41:15 | 
2021-05-04 06:41:15 | TASK [Check if nova_compute is running with proper image] **********************
2021-05-04 06:41:15 | Tuesday 04 May 2021  06:41:02 +0000 (0:00:12.183)       0:01:13.146 *********** 
2021-05-04 06:41:15 | changed: [compute-0] => {"changed": true, "cmd": "docker ps | grep \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute:16.1_20210430.1\"\n", "delta": "0:00:00.033059", "end": "2021-05-04 06:41:02.855364", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 06:41:02.822305", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 06:41:15 | changed: [compute-1] => {"changed": true, "cmd": "docker ps | grep \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute:16.1_20210430.1\"\n", "delta": "0:00:00.037808", "end": "2021-05-04 06:41:02.883398", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 06:41:02.845590", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 06:41:15 | 
...........................................................

2021-05-04 06:41:57 | TASK [Check if ovn_controller is runing with proper image] *********************
2021-05-04 06:41:57 | Tuesday 04 May 2021  06:41:53 +0000 (0:00:00.092)       0:02:03.696 *********** 
2021-05-04 06:41:57 | changed: [compute-0] => {"changed": true, "cmd": "docker ps | grep \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-controller:16.1_20210430.1\"", "delta": "0:00:00.042160", "end": "2021-05-04 06:41:53.463997", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 06:41:53.421837", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 06:41:57 | changed: [compute-1] => {"changed": true, "cmd": "docker ps | grep \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ovn-controller:16.1_20210430.1\"", "delta": "0:00:00.039445", "end": "2021-05-04 06:41:53.491884", "failed_when_result": false, "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 06:41:53.452439", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 06:41:57 | 
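
For reading results like the ones above: `grep` exits with status 1 when no line matches, so "rc": 1 with empty stdout means the expected image was not running yet, and "failed_when_result": false means the task was not treated as a failure. The sketch below pulls the host and result out of one such log line. The helper names are hypothetical and the sample line is abbreviated from the log, so treat it as an illustration only.

```python
import json
import re

# Matches "... [host] => {json}" at the end of an Ansible log line.
RESULT_PATTERN = re.compile(r"\[([^\]]+)\] => (\{.*\})\s*$")


def parse_result(line):
    """Return (host, result_dict) for a task result line, or None."""
    match = RESULT_PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), json.loads(match.group(2))


def image_was_missing(result):
    """A non-zero rc from the grep check means the image was not running."""
    return result.get("rc") != 0


if __name__ == "__main__":
    # Abbreviated sample line; "img" stands in for the full image reference.
    line = (
        r'2021-05-04 06:40:49 | changed: [compute-0] => '
        r'{"changed": true, "cmd": "docker ps | grep \"img\"\n", '
        r'"failed_when_result": false, "rc": 1, "stdout": ""}'
    )
    host, result = parse_result(line)
    print(host, image_was_missing(result))
```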

http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv6-geneve-HA-no-ceph-ovn-dvr/94/undercloud-0/var/log/dnf.rpm.log.gz

2021-05-04T05:08:30Z SUBDEBUG Installed: openstack-tripleo-heat-templates-11.3.2-1.20210408163452.el8ost.noarch

Comment 14 errata-xmlrpc 2021-05-26 13:52:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097

