Bug 1516867

Summary: OSP11 -> OSP12 upgrade: floating IP connectivity gets disrupted during major-upgrade-composable-steps-docker
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Sofer Athlan-Guyot <sathlang>
Status: CLOSED ERRATA
QA Contact: Marius Cornea <mcornea>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 12.0 (Pike)
CC: dbecker, mandreou, mbracho, mburns, morazi, rhel-osp-director-maint, sathlang
Target Milestone: rc
Keywords: Triaged
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-7.0.3-15.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-13 22:22:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2017-11-23 13:06:31 UTC
Description of problem:
OSP11 -> OSP12 upgrade: floating IP connectivity gets disrupted during major-upgrade-composable-steps-docker.

After the major upgrade composable step completes, the ping running against the floating IP reports the following results:

3991 packets transmitted, 2491 received, +1459 errors, 37% packet loss, time 3993054ms
rtt min/avg/max/mdev = 0.516/1.087/6.887/0.387 ms, pipe 4
Ping loss higher than 1% detected
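
For reference, the loss figure can be recomputed from the transmitted and received counters in the summary line above. The following is only a sketch: the helper names and the way the 1% threshold is applied are assumptions for illustration, not the actual check used by the upgrade test tooling.

```python
import re

# Hypothetical helpers for illustration; not part of any TripleO tooling.
SUMMARY_RE = re.compile(
    r"(?P<tx>\d+) packets transmitted, (?P<rx>\d+) received"
)


def packet_loss_percent(summary: str) -> float:
    """Compute the loss percentage from a ping summary line."""
    match = SUMMARY_RE.search(summary)
    if match is None:
        raise ValueError("not a ping summary line")
    tx = int(match.group("tx"))
    rx = int(match.group("rx"))
    if tx == 0:
        raise ValueError("no packets transmitted")
    return 100.0 * (tx - rx) / tx


def exceeds_threshold(summary: str, threshold_percent: float = 1.0) -> bool:
    """Return True when the computed loss is above the threshold (assumed 1%)."""
    return packet_loss_percent(summary) > threshold_percent


# Summary line copied from the ping output in this report.
line = ("3991 packets transmitted, 2491 received, +1459 errors, "
        "37% packet loss, time 3993054ms")
loss = packet_loss_percent(line)
print(f"{loss:.2f}% loss, above 1% threshold: {exceeds_threshold(line)}")
# prints "37.58% loss, above 1% threshold: True"
```

With 3991 packets sent and 2491 received, 1500 packets were lost, which is consistent with the 37% reported by ping.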


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.3-12.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP11
2. Launch an instance and assign it a floating IP via a Neutron router
3. Start a continuous ping to the floating IP attached to the instance
4. Run the major-upgrade-composable-steps-docker upgrade step

Actual results:
The ping results show high packet loss (37% in the run above).

Expected results:
Packet loss should be close to 0%.

Additional info:
This is a regression introduced in the latest build.

Comment 1 Marius Cornea 2017-11-23 13:27:31 UTC
Snippet from os-collect-config journal on the networker node:

Nov 23 12:51:31 networker-0 os-collect-config[2953]: PLAY [localhost] ***************************************************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Gathering Facts] *********************************************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: ok: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Check if neutron_dhcp_agent is deployed] *********************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Check if neutron_l3_agent is deployed] ***********************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Check if neutron_metadata_agent is deployed] *****************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Check if neutron_ovs_agent is deployed] **********************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Install docker packages on upgrade if missing] ***************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: ok: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Check for os-net-config upgrade] *****************************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Check that os-net-config has configuration] ******************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Upgrade os-net-config] ***************************************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: skipping: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [take new os-net-config parameters into account now] **********************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: skipping: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [Update all packages] *****************************************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: TASK [blank ipv6 rule before activating ipv6 firewall.] ************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: changed: [localhost]
Nov 23 12:51:31 networker-0 os-collect-config[2953]: PLAY RECAP *********************************************************************
Nov 23 12:51:31 networker-0 os-collect-config[2953]: localhost                  : ok=10   changed=8    unreachable=0    failed=0

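The journal shows that the block handling the special case of the os-net-config upgrade was skipped (both "Upgrade os-net-config" and "take new os-net-config parameters into account now" report "skipping"):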

https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-packages.yaml#L66-L75
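
A quick way to pull the skipped tasks out of a journal excerpt like the one above is to pair each TASK header with the status line that follows it. This is a minimal sketch with hypothetical helper names; it assumes one status line per task, as in the snippet from the networker node, and the sample lines are copied from that snippet with the asterisk padding shortened.

```python
import re

# Hypothetical helpers for illustration; they only understand the
# "TASK [name]" / "status: [host]" layout seen in the journal excerpt.
TASK_RE = re.compile(r"TASK \[(?P<name>.+?)\]")
STATUS_RE = re.compile(r"\b(?P<status>ok|changed|skipping|failed|fatal): \[")


def task_statuses(journal_lines):
    """Pair each TASK header with the first status line that follows it."""
    results = []
    current = None
    for line in journal_lines:
        task = TASK_RE.search(line)
        if task:
            current = task.group("name")
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            results.append((current, status.group("status")))
            current = None
    return results


def skipped_tasks(journal_lines):
    """Return the names of tasks whose status is 'skipping'."""
    return [name for name, status in task_statuses(journal_lines)
            if status == "skipping"]


prefix = "Nov 23 12:51:31 networker-0 os-collect-config[2953]: "
sample = [
    prefix + "TASK [Check for os-net-config upgrade] ****",
    prefix + "changed: [localhost]",
    prefix + "TASK [Check that os-net-config has configuration] ****",
    prefix + "changed: [localhost]",
    prefix + "TASK [Upgrade os-net-config] ****",
    prefix + "skipping: [localhost]",
    prefix + "TASK [take new os-net-config parameters into account now] ****",
    prefix + "skipping: [localhost]",
    prefix + "TASK [Update all packages] ****",
    prefix + "changed: [localhost]",
]
print(skipped_tasks(sample))
# prints ['Upgrade os-net-config', 'take new os-net-config parameters into account now']
```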

Comment 6 errata-xmlrpc 2017-12-13 22:22:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462