Description of problem:
As described in the upstream bug at https://bugs.launchpad.net/tripleo/+bug/1669714, wherever openvswitch 2.6 becomes available, any attempt to perform a major upgrade or minor update on those nodes is subject to node connectivity problems and ultimately requires a reboot. We need to remove the special-case handling that was previously required when we were going from 2.4 to 2.5.
Update: it seems we should *still* carry a special-case upgrade for openvswitch, specifically for ovs 2.5.0-14 - I've decided to use the same bug in an attempt to minimize the inevitable confusion here :(
Please see the discussion at https://bugzilla.redhat.com/show_bug.cgi?id=1424945#c11 for more information, but essentially the workaround is the same as the one we previously had, with the addition of the '--notriggerun' flag for the package update.
I have just posted https://review.openstack.org/450607 ("Add special case upgrade from openvswitch 2.5.0-14") and matbu has this covered for the ansible steps at https://review.openstack.org/#/c/434346/
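For reference, this is roughly the shape of the workaround (a minimal sketch only - the reviews above are authoritative, and the detection condition and paths here are illustrative):

# Sketch of the special-case upgrade. Assumes yumdownloader (yum-utils) is
# available and that we are coming from openvswitch 2.5.0-14.
if rpm -q openvswitch | grep -q "^openvswitch-2.5.0-14"; then
    echo "Manual upgrade of openvswitch - ovs-2.5.0-14 detected"
    mkdir -p /root/OVS_UPGRADE
    pushd /root/OVS_UPGRADE
    # Download the target openvswitch rpm rather than letting yum install it,
    # so that we control exactly how the rpm transaction is run.
    yumdownloader --destdir . openvswitch
    # --nopostun skips the %postun scriptlet and --notriggerun skips the
    # %triggerun scriptlet; both would otherwise restart openvswitch and
    # take node networking down mid-upgrade.
    rpm -U --replacepkgs --nopostun --notriggerun ./openvswitch-*.rpm
    popd
fi

The trade-off is that the old openvswitch keeps running until the node is rebooted, at which point the new version is loaded.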
moving back to assigned.
Just changed the title... the bug was originally tracking removal of the openvswitch workaround, which we did. Now we are using the same bug to re-add the workaround along with the extra flag.
Adding the puppet-vswitch patch that ensures puppet works with DPDK openvswitch 2.6.
Pointing to puppet-vswitch ocata.
Pointing the openvswitch exception to stable/ocata.
Everything merged in stable/ocata.
I would like to verify this bug but it's not clear to me what the proper way to do the verification is. During the OSP10->11 upgrade we should run 'rpm -U --replacepkgs --notriggerun --nopostun $ovs_package' since we're upgrading from 2.5.0-14. Is it enough to check that this command was run during the upgrade and that instance connectivity is not disrupted? Is there any additional step that needs to be run to make sure openvswitch was upgraded correctly? Thanks!
hey marius, yeah checking that the command was executed would be great. Really though, the verification here is the absence of any network/interface issues during the upgrade. It would be great to also confirm before/after versions of openvswitch - something like "I had no issues going from ovs 2.5.x to 2.6.x, and afaics it ran the --nopostun rpm install" would be ideal.
Will leave the needinfo in case the network team wants to add to this.
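For anyone repeating this, the verification boils down to something like the following (a sketch; 192.0.2.10 stands in for the floating IP of a running instance, pinged from a host outside the overcloud):

# Before the upgrade: record the running openvswitch version.
ovs-vsctl show | grep ovs_version

# During the upgrade: watch for dropped pings to a running instance.
ping 192.0.2.10

# After the upgrade: confirm the special-case path ran - the downloaded rpm
# is present, yum did NOT update the core openvswitch package itself, and
# the new package version is installed.
ls /root/OVS_UPGRADE/
grep openvswitch /var/log/yum.log
rpm -qa | grep ^openvswitch
ovs-vsctl show | grep ovs_version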
On controllers after running major-upgrade-composable-steps.yaml:
[root@overcloud-controller-0 ~]# rpm -qa | grep ^openvswitch
Checking /var/log/yum.log we can see that the openvswitch package itself hasn't been updated via yum (only the python bindings and the neutron agent were):
[root@overcloud-controller-0 ~]# grep openvswitch /var/log/yum.log
Apr 25 16:22:40 Updated: python-openvswitch-2.6.1-10.git20161206.el7fdp.noarch
Apr 25 16:23:30 Updated: 1:openstack-neutron-openvswitch-10.0.1-1.el7ost.noarch
Instead, the rpm downloaded by the special-case workaround is present:
[root@overcloud-controller-0 ~]# ls /root/OVS_UPGRADE/openvswitch-2.6.1-10.git20161206.el7fdp.x86_64.rpm
OVS 2.5 is still loaded, which is expected: the --nopostun/--notriggerun flags deliberately skip the scriptlets that would restart the service, so the new version is only picked up on reboot:
[root@overcloud-controller-0 ~]# ovs-vsctl show | grep ovs_version
No network connectivity issues showed up during this step.
After one of the controllers is rebooted we can see that the new OVS version is loaded:
[root@overcloud-controller-1 heat-admin]# ovs-vsctl show | grep ovs_version
Tunnels are set up: http://paste.openstack.org/show/607894/
All the agents are up:
On the compute node we can see the special-case upgrade of openvswitch in the log:
Tue Apr 25 13:22:59 EDT 2017 upgrade-non-controller.sh Executing /root/tripleo_upgrade_node.sh on 192.168.0.21
Manual upgrade of openvswitch - ovs-2.5.0-14 or restart in postun detected
Attempting to downloading latest openvswitch with yumdownloader
Loaded plugins: product-id
Repository rhelosp-fdp-pending is listed more than once in the configuration
--> Running transaction check
---> Package openvswitch.x86_64 0:2.6.1-10.git20161206.el7fdp will be installed
--> Finished Dependency Resolution
Updating openvswitch-2.6.1-10.git20161206.el7fdp.x86_64.rpm with --nopostun --notriggerun
Once the upgrade has finished:
[root@overcloud-compute-0 ~]# rpm -qa | grep ^openvswitch
[root@overcloud-compute-0 ~]# ovs-vsctl show | grep ovs_version
The instance running on this node is still reachable.
After the compute node reboot:
[root@overcloud-compute-0 heat-admin]# ovs-vsctl show | grep ovs_version
The compute node can take new workloads, which are reachable.
Agents look good on all nodes:
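For reference, the agent check here is presumably along these lines (hypothetical invocation; the exact client depends on the version deployed):

source ~/overcloudrc
# Every agent should report alive (':-)' in the 'alive' column).
neutron agent-list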
Given that I wasn't able to hit any issues related to the openvswitch package upgrade, I am moving this bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.