Bug 1149777

Summary: host=<ID> is not set in neutron-HA mode, thus network-node failovers will not work
Product: Red Hat OpenStack
Reporter: Miguel Angel Ajo <majopela>
Component: openstack-foreman-installer
Assignee: Brad P. Crochet <brad>
Status: CLOSED ERRATA
QA Contact: Ofer Blaut <oblaut>
Severity: urgent
Priority: urgent
Version: 5.0 (RHEL 6)
CC: amuller, brad, dnavale, fdinitto, jguiditt, lpeer, mburns, morazi, rhos-maint, yeylon
Target Milestone: z2
Keywords: ZStream
Target Release: Installer
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-foreman-installer-2.0.29-1.el6ost
Doc Type: Bug Fix
Doc Text:
Previously, OpenStack Networking HA configuration did not set the host=<id> parameter. As a result, failover would not occur for the OpenStack Networking agents. With this update, the host=<id> parameter is set on all controllers and as a result, failover happens correctly.
Last Closed: 2014-11-04 17:03:05 UTC
Type: Bug
Bug Depends On: 1150413

Description Miguel Angel Ajo 2014-10-06 15:32:14 UTC
Description of problem:

   Neutron agents in HA mode should share the same host ID (set via host=same-host-id in the [DEFAULT] section) across all 3 nodes.

   This is needed so that, when a node running neutron agents fails over to another host, the agents on the new host pick up all the resources (virtual routers, DHCP, LBaaS instances) that were assigned to the same host ID.

   By default, the host ID is the FQDN of the host (macxxxxxxxxxx.domain.com).

   Without this set, if you fail over a host where the agents are running, the agents will be restarted on another host but won't set up any resources, so DHCP, routing, etc. will be broken. (Those resources would need to be migrated manually via dhcp-agent-network-remove / dhcp-agent-network-add and l3-agent-router-remove / l3-agent-router-add.)
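
As a rough illustration of the lookup described above, the following sketch prints the host ID that an agent would effectively use for a given config file: the host option from [DEFAULT] if present, otherwise the machine's FQDN. The function name neutron_host_id is hypothetical, and the parsing is a simplification (it is not how neutron itself reads its configuration).

```shell
#!/bin/sh
# Hypothetical helper: print the effective neutron host ID for a config file.
# Looks for "host = VALUE" (or "host=VALUE") inside the [DEFAULT] section only;
# falls back to the machine's FQDN when the option is absent.
neutron_host_id() {
    conf="$1"
    value=$(awk '
        # Track whether the current section is [DEFAULT].
        /^\[/ { in_default = ($0 ~ /^\[DEFAULT\]/); next }
        in_default && /^host[ \t]*=/ {
            sub(/^host[ \t]*=[ \t]*/, "")   # strip the key and "="
            sub(/[ \t]+$/, "")              # strip trailing whitespace
            print
            exit
        }
    ' "$conf")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        hostname -f
    fi
}
```

Running neutron_host_id /etc/neutron/neutron.conf on each of the three HA controllers should print the same value once the shared ID is in place; different FQDNs indicate the problem described here.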

Version-Release number of selected component (if applicable):
foreman-installer-1.6.0-0.2.RC1.el6ost

How reproducible:

always

Steps to Reproduce:
1. Install.
2. Check on the 3 HA controller nodes:  # grep ^host /etc/neutron/*
3. Another way to check:  # neutron agent-list


This shouldn't show agents registered individually under each host (entries which
accumulate as we fail over across all the hosts). Instead of showing hosts as FQDNs, there should be a single fixed "virtual" host ID.
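
One way to make this check less manual is to count, per agent type, how many distinct hosts appear in the agent list. The sketch below assumes the table layout that neutron agent-list prints in this report (columns id, agent_type, host, ...); the function name agent_hosts_per_type is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "neutron agent-list" table output on stdin and
# print, for each agent type, the number of distinct hosts it is registered under.
agent_hosts_per_type() {
    awk -F'|' '
        # Data rows have at least 6 "|"-separated fields; skip the header row.
        NF >= 6 && $2 !~ /^ *id *$/ {
            type = $3; host = $4
            gsub(/^ +| +$/, "", type)   # trim padding around the cell
            gsub(/^ +| +$/, "", host)
            if (!((type, host) in seen)) {
                seen[type, host] = 1
                count[type]++
            }
        }
        END { for (t in count) printf "%s: %d\n", t, count[t] }
    ' | sort
}

# Usage (not executed here):
#   neutron agent-list | agent_hosts_per_type
```

With a shared host ID, the L3, DHCP and Metadata agents should each report a single host. Open vSwitch agents also run on compute nodes, so a count above 1 is expected for that type.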

Actual results:
# grep ^host /etc/neutron/*
returns nothing on the HA controllers.

[root@mac000af77993ce ~(openstack_admin)]# neutron agent-list
+--------------------------------------+--------------------+-----------------------------+-------+----------------+
| id                                   | agent_type         | host                        | alive | admin_state_up |
+--------------------------------------+--------------------+-----------------------------+-------+----------------+
| 388d9178-f226-48f4-b81d-a2d96a449571 | DHCP agent         | mac000af77993ce.example.com | xxx   | True           |
| 4c0646c6-f2da-4b12-a53b-48784a7fdd4b | DHCP agent         | mac000af7799318.example.com | :-)   | True           |
| 545f7f25-0413-4e3c-a276-e7be3be2421a | Metadata agent     | mac000af7799318.example.com | :-)   | True           |
| 54b07c0a-d6fc-4ad2-9fa1-f9955b4c8251 | L3 agent           | mac000af77993ce.example.com | xxx   | True           |
| 583d6112-89b9-4e7d-9258-e96d8a3a04db | L3 agent           | mac000af779931a.example.com | xxx   | True           |
| 63a49aea-3ea0-45a7-88ba-ab9397481503 | Metadata agent     | mac000af77993ce.example.com | xxx   | True           |
| 71dc94b6-79cb-473a-bed6-a6afdc84db91 | DHCP agent         | mac000af779931a.example.com | xxx   | True           |
| 72f349a7-a468-4003-8bc3-3e567bf2b84c | Open vSwitch agent | mac000af779932a.example.com | :-)   | True           |
| 7ed67260-ea17-4a24-bacc-e5689a917ac8 | Open vSwitch agent | mac000af7799318.example.com | :-)   | True           |
| 8ddb0aae-5425-45b3-a110-880f3d20e286 | Open vSwitch agent | mac000af77993ce.example.com | xxx   | True           |
| a1289d63-6283-41f9-bf7d-ad4cd533b31a | Open vSwitch agent | mac000af779931a.example.com | xxx   | True           |
| cbe064b1-f3c2-42c9-be3a-cda44385c18a | Metadata agent     | mac000af779931a.example.com | xxx   | True           |
| deaa6571-cc8e-48fc-85bd-44f04def7e14 | L3 agent           | mac000af7799318.example.com | :-)   | True           |
+--------------------------------------+--------------------+-----------------------------+-------+----------------+

Expected results:
# grep ^host /etc/neutron/*
should show host set to some fixed value.

[root@mac000af77993ce ~(openstack_admin)]# neutron agent-list
+--------------------------------------+--------------------+-----------------------------+-------+----------------+
| id                                   | agent_type         | host                        | alive | admin_state_up |
+--------------------------------------+--------------------+-----------------------------+-------+----------------+
| 583d6112-89b9-4e7d-9258-e96d8a3a04db | L3 agent           | neutron-n                   | :-)   | True           |
| 71dc94b6-79cb-473a-bed6-a6afdc84db91 | DHCP agent         | neutron-n                   | :-)   | True           |
| 72f349a7-a468-4003-8bc3-3e567bf2b84c | Open vSwitch agent | mac000af779932a.example.com | :-)   | True           |
| a1289d63-6283-41f9-bf7d-ad4cd533b31a | Open vSwitch agent | neutron-n                   | :-)   | True           |
| cbe064b1-f3c2-42c9-be3a-cda44385c18a | Metadata agent     | neutron-n                   | :-)   | True           |
+--------------------------------------+--------------------+-----------------------------+-------+----------------+

Additional info:
We are looking into what was originally going to be an alternative to using the same ID across all nodes, but which could become the standard solution from the start.

In that solution we simply instruct Pacemaker to set the host= parameter dynamically, so we will have 2A + 1P nodes (two active, one passive) by default, and the virtual host ID will roam from the failed host to the new host.

So, under other circumstances I would be asking for the host= parameter to be set in the config files via Puppet; instead, the proposal is to use the new neutron-scale OCF script to fix the issue.

Comment 7 Brad P. Crochet 2014-10-08 17:45:15 UTC
PR available: https://github.com/redhat-openstack/astapor/pull/387

Comment 8 Jason Guiditta 2014-10-09 16:48:23 UTC
Merged.  Tested with Miguel, appears to work as desired.

Comment 11 Miguel Angel Ajo 2014-10-10 09:04:36 UTC
(In reply to Jason Guiditta from comment #8)
> Merged.  Tested with Miguel, appears to work as desired.
Correct, it looks good.

The agents all registered as neutron-n-0, and they could be
failed over to other hosts while keeping the same ID and
thus the same associated resources.

Comment 14 Ofer Blaut 2014-11-02 14:15:51 UTC
Tested on: rhel-osp-installer-0.4.7-1.el6ost.noarch

Last login: Sun Nov  2 15:33:48 2014 from 192.168.0.1
[root@mac441ea173385f ~]# grep ^host /etc/neutron/*
/etc/neutron/neutron.conf:host = neutron-n-0



[root@mac441ea173385f ~]# source keystonerc_admin 
[root@mac441ea173385f ~(openstack_admin)]# neutron agent-list
+--------------------------------------+--------------------+-----------------------------+-------+----------------+
| id                                   | agent_type         | host                        | alive | admin_state_up |
+--------------------------------------+--------------------+-----------------------------+-------+----------------+
| 44abf1d3-39d6-435a-9063-3e2f9110b0a1 | DHCP agent         | neutron-n-0                 | :-)   | True           |
| 84deec32-93df-4c9d-a5eb-5120c5430216 | Open vSwitch agent | mac441ea1733991.example.com | :-)   | True           |
| 894e07ba-ee0d-4939-b6d1-f3207ceb82ba | Open vSwitch agent | neutron-n-0                 | :-)   | True           |
| ad1cf899-3614-4da5-8cdf-ee9a006c5e32 | Metadata agent     | neutron-n-0                 | :-)   | True           |
| f7a2f321-97ea-47a0-aa26-851823a2eda4 | L3 agent           | neutron-n-0                 | :-)   | True           |
+--------------------------------------+--------------------+-----------------------------+-------+----------------+

Comment 16 errata-xmlrpc 2014-11-04 17:03:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1800.html

Comment 17 Miguel Angel Ajo 2014-12-05 10:46:35 UTC
As additional detail, the host parameter should be set manually on all three HA controller nodes, like this:

openstack-config --set /etc/neutron/neutron.conf DEFAULT host neutron-n-0

Afterwards, if there were running routers or DHCP services for tenant
networks, those need to be rescheduled to the new agent "neutron-n-0"
(check via neutron agent-list).

To reschedule routers, use these commands:

   l3-agent-router-add            Add a router to a L3 agent.
   l3-agent-router-remove         Remove a router from a L3 agent.

To reschedule DHCP services to the new agent, use these:

   dhcp-agent-list-hosting-net    List DHCP agents hosting a network.
   dhcp-agent-network-add         Add a network to a DHCP agent.
   dhcp-agent-network-remove      Remove a network from a DHCP agent.
   net-list-on-dhcp-agent         List the networks on a DHCP agent.
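
The rescheduling above can be tedious with many routers or networks, so here is a minimal sketch that only prints the remove/add command pairs for review and does not run them. It assumes the usual neutron table output with the resource ID in the first column, and that the commands take the agent ID followed by the router or network ID (worth confirming with neutron help for each command). The function name plan_moves and the OLD/NEW placeholders are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a neutron list table on stdin and print the
# remove/add command pair for every ID in its first column.
#   $1  command prefix: "l3-agent-router" or "dhcp-agent-network"
#   $2  ID of the agent the resources are currently scheduled on
#   $3  ID of the agent they should move to
plan_moves() {
    prefix="$1"; old="$2"; new="$3"
    awk -F'|' -v p="$prefix" -v old="$old" -v new="$new" '
        # Data rows contain "|" separators; skip the header row.
        NF >= 3 && $2 !~ /^ *id *$/ {
            id = $2
            gsub(/ /, "", id)           # strip cell padding
            if (id == "") next
            printf "neutron %s-remove %s %s\n", p, old, id
            printf "neutron %s-add %s %s\n", p, new, id
        }
    '
}

# Usage (not executed here; OLD and NEW are agent IDs from "neutron agent-list"):
#   neutron net-list-on-dhcp-agent OLD | plan_moves dhcp-agent-network OLD NEW
#   neutron router-list-on-l3-agent OLD | plan_moves l3-agent-router OLD NEW
```

Printing the commands first makes it possible to review the plan, since each resource is briefly unscheduled between its remove and add steps.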

Comment 18 Brad P. Crochet 2018-09-26 12:06:37 UTC
Clearing need info