Bug 1388543

Summary: Upgrade of openvswitch-2.4.0-1.el7 makes IP disappear. (osp9)
Product: Red Hat OpenStack
Reporter: Omri Hochman <ohochman>
Component: openstack-tripleo-heat-templates
Assignee: Marios Andreou <mandreou>
Status: CLOSED ERRATA
QA Contact: Alexander Chuzhoy <sasha>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 9.0 (Mitaka)
CC: achernet, aloughla, apevec, chrisw, jcoufal, lbezdick, mandreou, mburns, ohochman, rhel-osp-director-maint, rhos-maint, sasha, sathlang, srevivo, tvignaud
Target Milestone: async
Keywords: Reopened, Triaged, ZStream
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-2.0.0-41.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1364540
Environment:
Last Closed: 2016-12-21 16:51:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1364540, 1388546, 1394322
Bug Blocks: 1337794

Comment 1 Marios Andreou 2016-10-31 14:40:33 UTC
The full 'fix' here consists of two reviews. They were originally tracked as independent fixes: first for BZ 1364540 (the original ovs upgrade workaround) and then for BZ 1388675 (which added --replacepkgs in case ovs was already upgraded, and fixed a syntax nit in the ceph upgrade script).
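
To illustrate why --replacepkgs matters here: rpm normally refuses to install a package version that is already installed, so a re-run of the workaround on a node where ovs was already upgraded would fail. The sketch below is a dry run only. The exact command used by the workaround is not quoted in this comment, so the rpm -U invocation, the function name and the package path are all assumptions made for illustration.

```shell
# Dry-run sketch: build and print the rpm command instead of running it.
# build_ovs_upgrade_cmd and the package path are hypothetical.
build_ovs_upgrade_cmd() {
    pkg=$1
    # --replacepkgs tells rpm to install the package even if that same
    # version is already on the system, so a repeated run does not abort.
    echo "rpm -U --replacepkgs $pkg"
}

build_ovs_upgrade_cmd ./openvswitch.rpm
```

Printing the command rather than executing it keeps the sketch safe to run anywhere.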

The fixes are https://review.openstack.org/#/c/389753/ (needs to be cherry-picked to mitaka and liberty, already in newton) and https://review.openstack.org/#/c/390792/ (needs cherry-pick to newton, mitaka and liberty, but master is still in review).

Comment 2 Marios Andreou 2016-11-04 17:33:28 UTC
This landed into stable/mitaka today so moving to POST (and linking the stable/mitaka reviews above)

Comment 4 Omri Hochman 2016-11-18 15:46:43 UTC
*** Bug 1396313 has been marked as a duplicate of this bug. ***

Comment 5 Omri Hochman 2016-11-18 15:49:44 UTC
This issue is blocking life-cycle's mandatory OSP10 scenario, which is Update + Upgrade: 9.0GA -> 9.0Async -> 10.0 latest

https://bugzilla.redhat.com/show_bug.cgi?id=1396313 

raising severity to be fixed before OSP10 release

Comment 6 Marios Andreou 2016-11-18 16:32:15 UTC
o/ mburns hello, I know how much you like bz spam, so here have some more:

I have an action item from the Lifecycle scrum to ping you about this bug. Can we please have a build for OSP9 tripleo-heat-templates that includes this fix? Both linked reviews landed in mitaka a while ago.

As Omri notes, it is blocking us for 10 because we want to update to the latest 9 and then upgrade, and sasha is hitting BZ 1396313 (which is this issue) in his tests.

Comment 11 Marios Andreou 2016-11-21 17:22:56 UTC
So, as discussed on today's scrum, sasha will verify and then we can move to ON_QA... @sasha please note that mburns changed Fixed In Version: openstack-tripleo-heat-templates-2.0.0-39.el7ost → openstack-tripleo-heat-templates-2.0.0-40.el7ost

so make sure you grab the right version - looks like mburns landed this directly downstream over this weekend (it landed into mitaka a couple of weeks ago)

Comment 14 Alexander Chuzhoy 2016-11-22 01:53:38 UTC
Version:
openstack-tripleo-heat-templates-2.0.0-40.el7ost.noarch


So what happens now is that the controllers remained reachable and the non-controllers became unreachable during minor update.
The minor update failed.

01:42:17 IN_PROGRESS
01:42:17 IN_PROGRESS
01:42:17 IN_PROGRESS
01:42:17 IN_PROGRESS
01:42:17 IN_PROGRESS
01:42:17 IN_PROGRESS
01:42:17 ERROR: Authentication failed: Authentication required
01:42:17 There was an error running Run the OC update command. Exiting....

This is from a controller's yum.log:
[root@overcloud-controller-0 ~]# grep -i openvswitch /var/log/yum.log
Nov 21 21:52:09 Updated: python-openvswitch-2.5.0-14.git20160727.el7fdp.noarch
Nov 21 21:53:31 Updated: 1:openstack-neutron-openvswitch-8.1.2-12.el7ost.noarch

Comment 15 Marios Andreou 2016-11-22 11:04:15 UTC
(In reply to Alexander Chuzhoy from comment #14)
> Version:
> openstack-tripleo-heat-templates-2.0.0-40.el7ost.noarch
> 
> 
> So what happens now is that the controllers remained reachable and the
> non-controllers became unreachable during minor update.
> The minor update failed.
> 
> 01:42:17 IN_PROGRESS
> 01:42:17 IN_PROGRESS
> 01:42:17 IN_PROGRESS
> 01:42:17 IN_PROGRESS
> 01:42:17 IN_PROGRESS
> 01:42:17 IN_PROGRESS
> 01:42:17 ERROR: Authentication failed: Authentication required
> 01:42:17 There was an error running Run the OC update command. Exiting....
> 
> This is from a controller's yum.log:
> [root@overcloud-controller-0 ~]# grep -i openvswitch /var/log/yum.log
> Nov 21 21:52:09 Updated:
> python-openvswitch-2.5.0-14.git20160727.el7fdp.noarch
> Nov 21 21:53:31 Updated:
> 1:openstack-neutron-openvswitch-8.1.2-12.el7ost.noarch

Thanks Sasha. So this shows that openvswitch was updated ^^^ for controller-0, but there isn't much more to go on here. You say "The minor update failed": do we have any logs? Is this environment still available? We may need to hand over to the networking team if there is still a problem with the ovs upgrade on non-controllers.

Comment 16 Lukas Bezdicka 2016-11-22 17:46:05 UTC
Failed QA for good reason https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/yum_update.sh#L81

We need to handle the OVS case on non-controllers as well, before the exit 0.

Comment 17 Marios Andreou 2016-11-22 17:56:16 UTC
(In reply to Lukas Bezdicka from comment #16)
> Failed QA for good reason
> https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/
> tasks/yum_update.sh#L81
> 
> we need to handle the OVS case also on noncontrollers before exit 0

Thanks Lukas and Sasha, this looks like a legit issue and a good catch @Sasha. I don't think we need to hand over to the networking team then; this is something we can resolve ourselves. The workaround just needs to move up by a few lines so it gets executed the same way on controllers and non-controllers.
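
The ordering problem can be sketched roughly as below. This is not the content of yum_update.sh: the function names, the role argument and the echoed step names are hypothetical stand-ins, and the condition that actually triggers the early exit in the real script is not quoted in this bug, so a simple role check is assumed.

```shell
# Hypothetical sketch of the control flow only; nothing here is copied
# from yum_update.sh.

ovs_workaround() {
    # Stand-in for the real OVS upgrade workaround.
    echo "ovs workaround"
}

# Before: non-controllers leave early, so the workaround never runs.
update_before_fix() {
    role=$1
    if [ "$role" != "controller" ]; then
        echo "yum update"
        return 0   # stands in for the early 'exit 0'
    fi
    ovs_workaround
    echo "cluster-aware update"
}

# After: the workaround sits above the early exit and runs on every node.
update_after_fix() {
    role=$1
    ovs_workaround
    if [ "$role" != "controller" ]; then
        echo "yum update"
        return 0
    fi
    echo "cluster-aware update"
}

update_before_fix compute
update_after_fix compute
```

With the assumed role check, the first call prints only the plain update step for a compute node, while the second prints the workaround step first.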

Comment 20 Marios Andreou 2016-11-23 18:25:01 UTC
removing master and adding the mitaka cherrypick for the latest fix we need here @ https://review.openstack.org/#/c/401365/

Comment 21 Marios Andreou 2016-11-24 10:12:53 UTC
https://review.openstack.org/#/c/401365/ merged to mitaka, moving to POST

Comment 26 Alexander Chuzhoy 2016-11-29 22:30:06 UTC
Verified:

Environment:
openstack-tripleo-heat-templates-2.0.0-41.el7ost.noarch


The controller is reachable and has IPs.
[root@overcloud-controller-0 ~]# grep openvswitch-2 /var/log/yum.log
Nov 29 21:27:44 Updated: python-openvswitch-2.5.0-14.git20160727.el7fdp.noarch

Comment 27 Marios Andreou 2016-11-30 08:09:30 UTC
(In reply to Alexander Chuzhoy from comment #26)
> Verified:
> 
> Environment:
> openstack-tripleo-heat-templates-2.0.0-41.el7ost.noarch
> 
> 
> The controller is reachable and has IPs.
> [root@overcloud-controller-0 ~]# grep openvswitch-2 /var/log/yum.log
> Nov 29 21:27:44 Updated:
> python-openvswitch-2.5.0-14.git20160727.el7fdp.noarch

@Sasha, just to be clear, it should be 'openvswitch-2.5.x...'; you are pointing at python-openvswitch here ^^^. They carry the same release version though, so I think they are updated together anyway (i.e. you very likely did get openvswitch 2.5 on that env)

e.g. 

[root@overcloud-controller-0 ~]# rpm -qa | grep openvswitch 
openstack-neutron-openvswitch-9.1.0-7.el7ost.noarch
openvswitch-2.5.0-14.git20160727.el7fdp.x86_64
python-openvswitch-2.5.0-14.git20160727.el7fdp.noarch
[root@overcloud-controller-0 ~]#
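
A small sketch of one way to pick out only the core openvswitch package from a listing like the one above. The helper name core_ovs is hypothetical, and the sample data is just the three lines from the rpm -qa output pasted into a variable. Anchoring the pattern at the start of the line is what excludes python-openvswitch and openstack-neutron-openvswitch.

```shell
# Sample package list copied from the rpm -qa output above.
installed='openstack-neutron-openvswitch-9.1.0-7.el7ost.noarch
openvswitch-2.5.0-14.git20160727.el7fdp.x86_64
python-openvswitch-2.5.0-14.git20160727.el7fdp.noarch'

# Hypothetical helper: keep only lines that start with "openvswitch-"
# followed by a digit, i.e. the core package and its version.
core_ovs() {
    grep -E '^openvswitch-[0-9]'
}

printf '%s\n' "$installed" | core_ovs
```

On a live node, rpm -q openvswitch queries that one package directly and avoids the pattern matching altogether.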

Comment 30 errata-xmlrpc 2016-12-21 16:51:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2983.html