Bug 1279652 - Overcloud instances lose floating IP connectivity during update from 7.0 to 7.1
Summary: Overcloud instances lose floating IP connectivity during update from 7.0 to 7.1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: y2
Target Release: 7.0 (Kilo)
Assignee: James Slagle
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-11-09 23:49 UTC by Marius Cornea
Modified: 2015-12-21 16:58 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Orphaned OpenStack Networking L3 agent keepalived processes were left running by OpenStack Networking's "netns-cleanup" script. As a result, the OpenStack Networking tenant router failover did not work during the Controller node update in the Overcloud. This fix ensures the keepalived processes are cleaned up properly during the Controller node update. As a result, OpenStack Networking tenant router failover works normally and the high availability of the tenant network is preserved.
Clone Of:
Environment:
Last Closed: 2015-12-21 16:58:02 UTC
Target Upstream Version:
Embargoed:


Attachments
controller0 sosreport (14.11 MB, application/x-xz)
2015-11-09 23:54 UTC, Marius Cornea
controller1 sosreport (13.98 MB, application/x-xz)
2015-11-09 23:56 UTC, Marius Cornea
controller2 (13.55 MB, application/x-xz)
2015-11-09 23:57 UTC, Marius Cornea
update.yaml (2.23 KB, text/plain)
2015-11-09 23:58 UTC, Marius Cornea


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2651 0 normal SHIPPED_LIVE Red Hat Enterprise Linux OSP 7 director Bug Fix Advisory 2015-12-21 21:50:26 UTC

Description Marius Cornea 2015-11-09 23:49:27 UTC
Description of problem:
During an update from 7.0 to 7.1 on an HA deployment (3 controllers + 1 compute) with network isolation, the overcloud instances lose connectivity via their floating IPs.

Steps to Reproduce:
1. Deploy 7.0
openstack overcloud deploy --templates ~/templates-7.0/my-overcloud -e ~/templates-7.0/my-overcloud/environments/network-isolation.yaml -e ~/templates-7.0/network-environment.yaml  --control-scale 3 --compute-scale 1 --ntp-server clock.redhat.com --libvirt-type qemu

2. Create an external network, a tenant network, and a router on the overcloud network.

3. Boot an instance and assign it a floating IP on the external network

4. Update undercloud to 7.1

5. Update stack
openstack overcloud update stack overcloud -i --templates ~/templates-7.1/my-overcloud -e ~/templates-7.1/my-overcloud/overcloud-resource-registry-puppet.yaml -e ~/templates-7.1/my-overcloud/environments/network-isolation.yaml -e ~/templates-7.1/network-environment.yaml  -e ~/templates-7.1/update.yaml

6. Check connectivity to the instance floating IP
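
A minimal sketch of how step 6 could be automated so that outages are recorded with timestamps instead of being watched by hand. It is not part of the original report. The function names (probe, outage_windows, monitor) are illustrative, and it assumes a Linux ping that accepts -c (count) and -W (reply timeout in seconds).

```python
import subprocess
import time


def probe(ip, timeout_s=1):
    """Send one ICMP echo request; return True if a reply arrived in time."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (first_failure, last_failure) pairs.

    Each pair covers one uninterrupted run of failed probes.
    """
    windows = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts  # a new run of failures begins here
            last = ts
        elif start is not None:
            windows.append((start, last))  # a reply arrived, close the run
            start = None
    if start is not None:
        windows.append((start, last))  # still failing when sampling stopped
    return windows


def monitor(ip, duration_s, interval_s=1.0):
    """Probe ip roughly every interval_s seconds and return the outage windows."""
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append((time.time(), probe(ip)))
        time.sleep(interval_s)
    return outage_windows(samples)
```

Started before step 5 against the instance's floating IP, for example monitor("<floating-ip>", 7200), the returned windows show when connectivity dropped and for how long, which can then be lined up with the controller update timeline.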

Actual results:
There's no connectivity during the update.

Expected results:
The instance keeps connectivity during the update.

Additional info:
During the update the L3 agent ha_state is inconsistent across the controllers: first active on 2 nodes, then active on all of the nodes.

At the end the update fails and only a single L3 agent is alive, but its ha_state is standby.

stack@instack:~>>> neutron l3-agent-list-hosting-router tenant-router
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | active   |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | :-)   | standby  |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | :-)   | active   |
+--------------------------------------+------------------------------------+----------------+-------+----------+
stack@instack:~>>> neutron l3-agent-list-hosting-router tenant-router
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | active   |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | :-)   | active   |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | :-)   | active   |
+--------------------------------------+------------------------------------+----------------+-------+----------+
stack@instack:~>>> neutron l3-agent-list-hosting-router tenant-router
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | standby  |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | xxx   | active   |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | xxx   | standby  |
+--------------------------------------+------------------------------------+----------------+-------+----------+
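
The tables above can be checked mechanically. The following is a rough sketch, not a tool used in this report: it parses one table as printed by neutron l3-agent-list-hosting-router and flags the two conditions visible here, more than one agent in the active state and no live agent in the active state. The names parse_agents and check_ha are hypothetical, and the sketch assumes ":-)" marks a live agent, as in the output shown.

```python
# Last table from the report, copied verbatim.
FINAL_STATE = """
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| aef76e90-1157-46cf-a805-2059975639f8 | overcloud-controller-1.localdomain | True           | :-)   | standby  |
| 2b195c24-65a1-4923-a506-5dba365d7d5c | overcloud-controller-0.localdomain | True           | xxx   | active   |
| f8e2331e-fc16-4c7b-a8f3-4a51fd0e3406 | overcloud-controller-2.localdomain | True           | xxx   | standby  |
+--------------------------------------+------------------------------------+----------------+-------+----------+
"""


def parse_agents(table_text):
    """Parse a single ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, shell prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def check_ha(rows):
    """Return a list of human-readable problems with the router's HA state."""
    problems = []
    active = [r for r in rows if r["ha_state"] == "active"]
    live_active = [r for r in active if r["alive"] == ":-)"]
    if len(active) > 1:
        hosts = ", ".join(r["host"] for r in active)
        problems.append("multiple active: " + hosts)
    if not live_active:
        problems.append("no live agent in active state")
    return problems
```

Applied to FINAL_STATE, check_ha(parse_agents(FINAL_STATE)) reports that no live agent is active, which matches the loss of floating IP connectivity at the end of the failed update. The parser expects one table per call, so each command's output should be passed separately.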

Comment 2 Marius Cornea 2015-11-09 23:54:06 UTC
Created attachment 1092013 [details]
controller0 sosreport

Comment 3 Marius Cornea 2015-11-09 23:56:02 UTC
Created attachment 1092014 [details]
controller1 sosreport

Comment 4 Marius Cornea 2015-11-09 23:57:28 UTC
Created attachment 1092015 [details]
controller2

Comment 5 Marius Cornea 2015-11-09 23:58:34 UTC
Created attachment 1092019 [details]
update.yaml

Comment 7 Marius Cornea 2015-12-15 13:49:26 UTC
Results for a ping during update: 

--- 172.16.23.111 ping statistics ---
5409 packets transmitted, 5393 received, 0% packet loss, time 5413793ms
rtt min/avg/max/mdev = 0.815/1.727/7.237/0.378 ms
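
The summary line prints the loss as a whole-number percentage, so "0% packet loss" does not mean that nothing was dropped. A short sketch that recomputes the figures from the counters in that line (the function name loss_stats is illustrative and not part of any tool):

```python
import re

# Summary line copied from the ping output above.
SUMMARY = "5409 packets transmitted, 5393 received, 0% packet loss, time 5413793ms"


def loss_stats(summary):
    """Recompute loss figures from the counters in a ping summary line."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received.*?time (\d+)ms", summary
    )
    if match is None:
        raise ValueError("unrecognized ping summary line")
    sent, received, elapsed_ms = map(int, match.groups())
    lost = sent - received
    return {
        "lost": lost,
        "loss_pct": 100.0 * lost / sent,
        "elapsed_min": elapsed_ms / 60000.0,
    }
```

For SUMMARY this gives 16 lost packets out of 5409, or roughly 0.3% loss over about 90 minutes. The summary alone does not show whether those 16 packets were lost in one burst or spread across the update.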

Comment 8 James Slagle 2015-12-15 21:20:42 UTC
Hi, the doc text for this one would be the same as for https://bugzilla.redhat.com/show_bug.cgi?id=1285079. I've copied it here as well.

Comment 10 errata-xmlrpc 2015-12-21 16:58:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2651

