Bug 1279652 - Overcloud instances lose floating IP connectivity during update from 7.0 to 7.1
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: y2
Target Release: 7.0 (Kilo)
Assigned To: James Slagle
QA Contact: Marius Cornea
Keywords: TestOnly, Triaged
Reported: 2015-11-09 18:49 EST by Marius Cornea
Modified: 2015-12-21 11:58 EST
Doc Type: Bug Fix
Doc Text:
Orphaned OpenStack Networking L3 agent keepalived processes were left running by OpenStack Networking's "netns-cleanup" script. As a result, the OpenStack Networking tenant router failover did not work during the Controller node update in the Overcloud. This fix ensures the keepalived processes are cleaned up properly during the Controller node update. As a result, OpenStack Networking tenant router failover works normally and the high availability of the tenant network is preserved.
Last Closed: 2015-12-21 11:58:02 EST
Type: Bug

Attachments
controller0 sosreport (14.11 MB, application/x-xz)
2015-11-09 18:54 EST, Marius Cornea
controller1 sosreport (13.98 MB, application/x-xz)
2015-11-09 18:56 EST, Marius Cornea
controller2 (13.55 MB, application/x-xz)
2015-11-09 18:57 EST, Marius Cornea
update.yaml (2.23 KB, text/plain)
2015-11-09 18:58 EST, Marius Cornea

Description Marius Cornea 2015-11-09 18:49:27 EST
Description of problem:
During an update from 7.0 to 7.1 on an HA deployment (3 controllers + 1 compute) with network isolation, the overcloud instances lose connectivity via their floating IPs.

Steps to Reproduce:
1. Deploy 7.0
openstack overcloud deploy --templates ~/templates-7.0/my-overcloud -e ~/templates-7.0/my-overcloud/environments/network-isolation.yaml -e ~/templates-7.0/network-environment.yaml  --control-scale 3 --compute-scale 1 --ntp-server clock.redhat.com --libvirt-type qemu

2. Create an external network, a tenant network, and a router on the overcloud.

3. Boot an instance and assign it a floating IP on the external network.

4. Update undercloud to 7.1

5. Update stack
openstack overcloud update stack overcloud -i --templates ~/templates-7.1/my-overcloud -e ~/templates-7.1/my-overcloud/overcloud-resource-registry-puppet.yaml -e ~/templates-7.1/my-overcloud/environments/network-isolation.yaml -e ~/templates-7.1/network-environment.yaml  -e ~/templates-7.1/update.yaml

6. Check connectivity to the instance floating IP
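
For step 6, a loop along the following lines could log each missed reply while the update runs. This is only a sketch: the function names `probe` and `watch_fip` are made up for illustration, the floating IP is passed in as an argument, and the `-W 1` timeout assumes the Linux iputils `ping`.

```shell
# probe: send one ICMP echo request and wait up to 1 second for the reply.
# Assumes Linux iputils ping; the function name is illustrative.
probe() {
  ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# watch_fip IP COUNT [INTERVAL]
# Probe IP COUNT times, INTERVAL seconds apart (default 1). Each missed
# reply is logged to stderr with a timestamp; the number of misses is
# printed to stdout at the end.
watch_fip() {
  ip=$1
  count=$2
  interval=${3:-1}
  lost=0
  i=0
  while [ "$i" -lt "$count" ]; do
    if ! probe "$ip"; then
      lost=$((lost + 1))
      echo "$(date '+%Y-%m-%d %H:%M:%S') no reply from $ip" >&2
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "$lost"
}
```

Starting `watch_fip <floating-ip> 6000` in a second terminal before step 5 gives timestamps that can be matched against the controller being updated at that moment.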

Actual results:
There's no connectivity during the update.

Expected results:
The instance keeps connectivity during the update.

Additional info:
During the update I see the L3 agent ha_state in inconsistent states: active on two nodes, then active on all three nodes.

At the end, the update fails and only a single L3 agent is alive, but its ha_state is standby.

stack@instack:~>>> neutron l3-agent-list-hosting-router tenant-router
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | active   |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | :-)   | standby  |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | :-)   | active   |
+--------------------------------------+------------------------------------+----------------+-------+----------+
stack@instack:~>>> neutron l3-agent-list-hosting-router tenant-router
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | active   |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | :-)   | active   |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | :-)   | active   |
+--------------------------------------+------------------------------------+----------------+-------+----------+
stack@instack:~>>> neutron l3-agent-list-hosting-router tenant-router
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | standby  |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | xxx   | active   |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | xxx   | standby  |
+--------------------------------------+------------------------------------+----------------+-------+----------+
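
The three listings above show two failure patterns: more than one agent reporting active, and no live agent reporting active. A small filter could flag both while the update runs. This is a sketch that assumes the table layout printed above (alive in the fourth column, ha_state in the fifth); `count_alive_active` and `check_router_ha` are hypothetical helper names, not part of the neutron client.

```shell
# count_alive_active: read the table printed by
# `neutron l3-agent-list-hosting-router <router>` on stdin and print how many
# agents are both alive (":-)") and in ha_state "active".
count_alive_active() {
  awk -F'|' '
    # Data rows split into 7 fields on "|"; border lines have no "|" at all.
    NF >= 7 {
      alive = $5; state = $6
      gsub(/ /, "", alive)
      gsub(/ /, "", state)
      if (alive == ":-)" && state == "active") n++
    }
    END { print n + 0 }
  '
}

# check_router_ha: read the same table on stdin and succeed only if exactly
# one live agent is active (the healthy state for an HA router).
check_router_ha() {
  n=$(count_alive_active)
  if [ "$n" -eq 1 ]; then
    echo "OK: 1 live active agent"
  else
    echo "PROBLEM: $n live active agents"
    return 1
  fi
}
```

Usage would be `neutron l3-agent-list-hosting-router tenant-router | check_router_ha`. Against the three listings above it would report 2, 3, and 0 live active agents respectively.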
Comment 2 Marius Cornea 2015-11-09 18:54 EST
Created attachment 1092013
controller0 sosreport
Comment 3 Marius Cornea 2015-11-09 18:56 EST
Created attachment 1092014
controller1 sosreport
Comment 4 Marius Cornea 2015-11-09 18:57 EST
Created attachment 1092015
controller2
Comment 5 Marius Cornea 2015-11-09 18:58 EST
Created attachment 1092019
update.yaml
Comment 7 Marius Cornea 2015-12-15 08:49:26 EST
Results of a ping during the update:

--- 172.16.23.111 ping statistics ---
5409 packets transmitted, 5393 received, 0% packet loss, time 5413793ms
rtt min/avg/max/mdev = 0.815/1.727/7.237/0.378 ms
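
The summary reports 0% packet loss even though 16 of the 5409 packets went unanswered, because the percentage is printed as a whole number. The sketch below recomputes the loss from the transmitted and received counts. `ping_loss` is an illustrative helper name, and it assumes the counts sit in the first and fourth fields of the summary line, as in the output above.

```shell
# ping_loss: read ping output on stdin and print the number of lost packets
# and the loss percentage to two decimals, based on the summary line.
ping_loss() {
  awk '/packets transmitted/ {
    tx = $1; rx = $4
    if (tx > 0) printf "%d lost of %d (%.2f%%)\n", tx - rx, tx, (tx - rx) * 100 / tx
  }'
}
```

Feeding it the summary line above, for example with `echo "5409 packets transmitted, 5393 received, 0% packet loss, time 5413793ms" | ping_loss`, prints `16 lost of 5409 (0.30%)`. At roughly one packet per second over a run of about 90 minutes, that is on the order of 16 seconds of missed replies in total.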
Comment 8 James Slagle 2015-12-15 16:20:42 EST
Hi, the doc text for this one would be the same as for https://bugzilla.redhat.com/show_bug.cgi?id=1285079. I've copied it here as well.
Comment 10 errata-xmlrpc 2015-12-21 11:58:02 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2651
