Bug 1822576

Summary: "openstack overcloud update run" doesn't return error, even if not able to connect to yum repository and yum metadata cache expires.
Product: Red Hat OpenStack
Reporter: Ryo Hayakawa <rhayakaw>
Component: openstack-tripleo-heat-templates
Assignee: mbollo
Status: CLOSED ERRATA
QA Contact: mbollo
Severity: low
Docs Contact:
Priority: low
Version: 13.0 (Queens)
CC: aschultz, jpretori, mburns, rbrady, rrasouli, sathlang
Target Milestone: z13
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.4.1-65.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-28 18:23:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ryo Hayakawa 2020-04-09 11:58:27 UTC
Description of problem:

"openstack overcloud update run" does not return an error while updating OSP13, even when the overcloud nodes cannot connect to the yum repository and their yum metadata cache has expired.

Please see package_update.log-20200107 [1] in c#1, line 1251 (2020-01-06 13:12:42,623 p=24225 u=mistral).
This is the log file from when my customer ran "openstack overcloud update run". At that time, the overcloud nodes could not connect to the yum repository because of a network problem, and their yum metadata had expired.

The result of the Ansible task "[Is docker going to be updated]" is logged starting at line 1251.
Line 1258 (2020-01-06 13:13:38,591 p=24225 u=mistral) shows:
   stderr: (shows some errors of "yum check-update docker")
   changed: false
   failed_when_result: false

The above means that "yum check-update docker" returned 0 even though it failed. The reason is clear from the following task definition.

puppet/services/docker.yaml:
  - name: Is docker going to be updated
    shell: yum check-update docker
    register: docker_check_update
    failed_when: docker_check_update.rc not in [0, 100]
    changed_when: docker_check_update.rc == 100

I think this Ansible task should have failed, because "yum check-update docker" did in fact fail (even though it returned 0).

Alternatively, "yum check-update docker" should have returned a value other than 0 or 100.

I found that when the yum metadata has expired and the network is disconnected, "yum check-update" returns 0. I therefore filed Bugzilla 1807887 (https://bugzilla.redhat.com/show_bug.cgi?id=1807887), but that BZ will not move forward because of [2] in c#1.

Therefore, I would like this issue to be fixed in OSP13 if possible. I suppose that running "yum clean metadata" before the "Is docker going to be updated" task may fix it.
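
For illustration only, the suggestion could be sketched as the task list below. This is a minimal sketch and not the merged patch: the name of the first task and its "changed_when: false" setting are assumptions of mine, and the second task is copied unchanged from puppet/services/docker.yaml. The idea is that, with the cached metadata removed, "yum check-update docker" would have to contact the repositories and so would be expected to return a code outside [0, 100] when they are unreachable.

```yaml
# Hypothetical sketch of the suggested workaround, not the actual fix.
# The first task's name and changed_when setting are illustrative assumptions.
- name: Clean yum metadata so check-update must contact the repositories
  command: yum clean metadata
  # Assumption: cleaning the cache is not reported as a change.
  changed_when: false

# Existing task, unchanged. yum check-update documents exit code 100 as
# "updates available", 0 as "no updates" and 1 as "error".
- name: Is docker going to be updated
  shell: yum check-update docker
  register: docker_check_update
  failed_when: docker_check_update.rc not in [0, 100]
  changed_when: docker_check_update.rc == 100
```

Whether this really makes the second task fail on a disconnected node would still need to be confirmed on an affected overcloud node.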

Version-Release number of selected component (if applicable):

openstack-tripleo-heat-templates-8.3.1-54.el7ost.noarch

How reproducible:
under certain conditions

Steps to Reproduce:

Run "openstack overcloud update run" while the overcloud nodes cannot connect to the yum repository and their yum metadata cache has expired.

Actual results:
"openstack overcloud update run" does not return an error.

Expected results:
"openstack overcloud update run" returns an error.

Comment 2 Ryan Brady 2020-04-13 13:13:35 UTC
Daniel, please take a look at this bug. The reporter suggests a quick fix: adding one more task.

Comment 8 mbollo 2020-07-28 08:08:01 UTC
The patch is ready [1] and waiting for review.

[1] https://review.opendev.org/#/c/736699/

Comment 22 errata-xmlrpc 2020-10-28 18:23:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13.0 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4388