Description of problem:
On an HA setup (3 controllers, 1 compute) I created a network, subnet, and router and booted an instance. Restarting the openstack-neutron-l3-agent container, or even hard-resetting the controller, can cause L3 HA to fail by leaving the router active on more than one controller:

(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 82e713a5-2e85-47fe-9bca-d55992b12d9c | controller-2.localdomain | True           | :-)   | standby  |
| 3915044a-2b8a-4141-81ae-c6b300ab2a82 | controller-0.localdomain | True           | :-)   | standby  |
| 0fccf79a-8e84-45e5-9249-e990e45a066f | controller-1.localdomain | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

[root@controller-1 ~]# docker ps -a | grep l3
9a4e71fec16f 192.168.24.1:8787/rhosp13/openstack-neutron-l3-agent:2018-01-26.3 "kolla_start" 5 days ago Up 4 days (healthy) neutron_l3_agent
[root@controller-1 ~]# docker restart 9a4e71fec16f
9a4e71fec16f

(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 82e713a5-2e85-47fe-9bca-d55992b12d9c | controller-2.localdomain | True           | :-)   | active   |
| 3915044a-2b8a-4141-81ae-c6b300ab2a82 | controller-0.localdomain | True           | :-)   | standby  |
| 0fccf79a-8e84-45e5-9249-e990e45a066f | controller-1.localdomain | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+

Version-Release number of selected component (if applicable):
openstack-neutron-common-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-neutron-l2gw-agent-11.0.1-0.20180119003013.2a0f243.el7ost.noarch
puppet-neutron-12.2.0-0.20180123133228.7428f68.el7ost.noarch
openstack-neutron-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-neutron-sriov-nic-agent-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
python-neutron-lbaas-12.0.0-0.20180123055730.0dc985e.el7ost.noarch
openstack-neutron-metering-agent-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-neutron-openvswitch-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
python2-neutronclient-6.6.0-0.20171215103134.50b5b29.el7ost.noarch
python2-neutron-lib-1.12.0-0.20180112024322.cd07c7b.el7ost.noarch
openstack-neutron-ml2-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-neutron-lbaas-12.0.0-0.20180123055730.0dc985e.el7ost.noarch
openstack-neutron-lbaas-ui-4.0.0-0.20180123181022.17a57d9.el7ost.noarch
openstack-neutron-linuxbridge-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
python-neutron-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-tripleo-heat-templates-8.0.0-0.20180122224016.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy an HA setup with containers
2. Populate the OC with a network and a router
3. Restart the l3 container

Actual results:
The router is set to active on more than one controller.

Expected results:
The router remains active on only one controller.

Additional info:
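The expected result above (exactly one active replica) can be checked mechanically. The following is a minimal sketch, assuming the table layout printed by `neutron l3-agent-list-hosting-router` in this report; the helper names `parse_ascii_table` and `active_hosts` are hypothetical and not part of any OpenStack client.

```python
# Sample output copied from this report (state after restarting the container).
SAMPLE = """
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 82e713a5-2e85-47fe-9bca-d55992b12d9c | controller-2.localdomain | True           | :-)   | active   |
| 3915044a-2b8a-4141-81ae-c6b300ab2a82 | controller-0.localdomain | True           | :-)   | standby  |
| 0fccf79a-8e84-45e5-9249-e990e45a066f | controller-1.localdomain | True           | :-)   | active   |
+--------------------------------------+--------------------------+----------------+-------+----------+
"""


def parse_ascii_table(text):
    """Parse a bordered ASCII table into a list of dicts keyed by header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and deprecation warnings
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data-looking line is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def active_hosts(text):
    """Return the hosts whose ha_state column reads 'active'."""
    return [row["host"] for row in parse_ascii_table(text)
            if row.get("ha_state") == "active"]


if __name__ == "__main__":
    hosts = active_hosts(SAMPLE)
    if len(hosts) > 1:
        print("more than one active replica:", ", ".join(hosts))
```

In practice the text would come from the CLI output captured on the undercloud; only the parsing is shown here.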
Hi Udi,

The L3 agent updates the HA router status in two cases:
1) On agent start, it sets the status to "standby" [1]. Then, after syncing all routers, starting the keepalived processes, and wiring the router ports, the L3 agent updates the status again with the new state.
2) Periodically (through a periodic task), if the router status held by the agent does not match the DB.

So please check that the L3 agent started successfully before checking the router status.

[1] https://review.openstack.org/#/c/522641/
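The two update paths described in this comment can be pictured with a toy model. This is only a sketch of the described behaviour, not Neutron code: the class `HaStateReporter`, its methods, and the dict standing in for the server DB are all hypothetical.

```python
class HaStateReporter:
    """Toy model of an agent reporting HA router state to a server DB."""

    def __init__(self, db, host):
        self.db = db      # dict (router_id, host) -> state; stands in for the DB
        self.host = host
        self.local = {}   # router_id -> state observed locally by the agent

    def on_agent_start(self, router_ids):
        # Path 1a: on start, every hosted HA router is reported as standby.
        for router_id in router_ids:
            self.db[(router_id, self.host)] = "standby"

    def on_router_synced(self, router_id, observed_state):
        # Path 1b: once the router is synced and keepalived is running,
        # the agent reports the state it actually observes.
        self.local[router_id] = observed_state
        self.db[(router_id, self.host)] = observed_state

    def periodic_check(self):
        # Path 2: periodically re-report any state that disagrees with the DB.
        corrected = []
        for router_id, state in self.local.items():
            if self.db.get((router_id, self.host)) != state:
                self.db[(router_id, self.host)] = state
                corrected.append(router_id)
        return corrected


if __name__ == "__main__":
    db = {}
    reporter = HaStateReporter(db, "controller-1")
    reporter.on_agent_start(["router-1"])
    reporter.on_router_synced("router-1", "active")
    print(db)
```

Under this model, a status read between agent start and the end of the sync would show "standby" regardless of the real state, which is why the comment asks to wait for the agent to finish starting.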
(In reply to anil venkata from comment #4)
> Hi Udi
>
> L3 agent will update HA router status,
> 1) to "standby" [1] on agent start. Then after syncing all routers, starting
> keepalived processes and wiring the router ports, l3 agent will again update
> the new status.
> 2) periodically(through periodic task) if the router status with the agent
> not matching with the DB.
>
> So please check if the l3 agent started successfully before checking the
> router status.
>
> [1] https://review.openstack.org/#/c/522641/

Hi Anil,

Thanks for the information. I'll check the agent status, in addition to the waiting time Bernard suggested.

Thanks,
Udi
The issue still remains after 6 minutes, with all agents up.
The setup is available for more testing - seal35.qa.lab.tlv.redhat.com root/1-8

(overcloud) [stack@undercloud-0 openstack]$ neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                     | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| 0de9d54e-ce27-4b90-a534-a19237b594ec | DHCP agent         | controller-0.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 2e2a40ba-ddc8-4601-804d-fd330e70e3d2 | Open vSwitch agent | compute-0.localdomain    |                   | :-)   | True           | neutron-openvswitch-agent |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | L3 agent           | controller-0.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | L3 agent           | controller-1.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5990e4e7-4761-4eae-9e8a-b24c259eea80 | Open vSwitch agent | controller-1.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 5af4ab0c-b5e3-4ddd-a8a1-77bd9f487755 | DHCP agent         | controller-2.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 69bde103-3f7d-4b91-ab24-e2cf5a509953 | DHCP agent         | controller-1.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 6a6b3908-a08a-420e-abc2-5313d9c8834c | Metadata agent     | controller-1.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 6b0e7dde-4a83-4faf-9026-d0dc0074d99d | Open vSwitch agent | controller-2.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 72cd089c-985f-4673-833c-7fc07e4a18f1 | Metadata agent     | controller-0.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 774a22c2-c5f1-408d-8e96-7056a545d408 | L3 agent           | controller-2.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 8a1ee255-3a20-4f1a-855c-b98a6f5488c1 | Open vSwitch agent | controller-0.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| ef97f2e0-c55e-4401-bade-226fbe2ad8d3 | Metadata agent     | controller-2.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+

(overcloud) [stack@undercloud-0 openstack]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 774a22c2-c5f1-408d-8e96-7056a545d408 | controller-2.localdomain | True           | :-)   | active   |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | controller-1.localdomain | True           | :-)   | standby  |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | controller-0.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

(overcloud) [stack@undercloud-0 openstack]$ openstack server list
+--------------------------------------+-------------+--------+-----------------------------------------+--------------+----------+
| ID                                   | Name        | Status | Networks                                | Image        | Flavor   |
+--------------------------------------+-------------+--------+-----------------------------------------+--------------+----------+
| 0e096ed1-d559-41c6-bc6b-7228cbe72c67 | ciross-7512 | ACTIVE | internal-net=192.168.100.12, 10.0.0.211 | cirros-0.3.5 | m1.small |
+--------------------------------------+-------------+--------+-----------------------------------------+--------------+----------+

(overcloud) [stack@undercloud-0 ~]$ . stackrc
(undercloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| ID                                   | Name         | Status | Networks               | Image          | Flavor     |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| 6caca4c4-291a-4575-83ef-40d8ce7f6210 | controller-1 | ACTIVE | ctlplane=192.168.24.7  | overcloud-full | controller |
| 1da42a55-ed53-433e-a415-0159a50074f3 | controller-2 | ACTIVE | ctlplane=192.168.24.15 | overcloud-full | controller |
| 47ab4128-ebb4-431e-8270-1a20b1db683a | compute-0    | ACTIVE | ctlplane=192.168.24.13 | overcloud-full | compute    |
| d2cd555e-458f-4257-8c4d-f5a2f2345a6d | controller-0 | ACTIVE | ctlplane=192.168.24.9  | overcloud-full | controller |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+

(undercloud) [stack@undercloud-0 ~]$ ssh heat-admin@192.168.24.15
Last login: Wed Mar 7 11:20:04 2018 from 192.168.24.254
[heat-admin@controller-2 ~]$ sudo -i
[root@controller-2 ~]# docker ps | grep l3
3ada9d0a68ab 192.168.24.1:8787/rhosp13/openstack-neutron-l3-agent:2018-03-02.2 "kolla_start" 26 minutes ago Up 26 minutes (healthy) neutron_l3_agent
[root@controller-2 ~]# docker restart neutron_l3_agent
neutron_l3_agent
[root@controller-2 ~]# docker ps | grep l3
3ada9d0a68ab 192.168.24.1:8787/rhosp13/openstack-neutron-l3-agent:2018-03-02.2 "kolla_start" 26 minutes ago Up 5 seconds (health: starting) neutron_l3_agent
[root@controller-2 ~]# logout
[heat-admin@controller-2 ~]$ logout
Connection to 192.168.24.15 closed.
(undercloud) [stack@undercloud-0 ~]$
(undercloud) [stack@undercloud-0 ~]$
(undercloud) [stack@undercloud-0 ~]$ . overcloudrc
(overcloud) [stack@undercloud-0 ~]$ neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                     | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| 0de9d54e-ce27-4b90-a534-a19237b594ec | DHCP agent         | controller-0.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 2e2a40ba-ddc8-4601-804d-fd330e70e3d2 | Open vSwitch agent | compute-0.localdomain    |                   | :-)   | True           | neutron-openvswitch-agent |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | L3 agent           | controller-0.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | L3 agent           | controller-1.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5990e4e7-4761-4eae-9e8a-b24c259eea80 | Open vSwitch agent | controller-1.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 5af4ab0c-b5e3-4ddd-a8a1-77bd9f487755 | DHCP agent         | controller-2.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 69bde103-3f7d-4b91-ab24-e2cf5a509953 | DHCP agent         | controller-1.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 6a6b3908-a08a-420e-abc2-5313d9c8834c | Metadata agent     | controller-1.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 6b0e7dde-4a83-4faf-9026-d0dc0074d99d | Open vSwitch agent | controller-2.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 72cd089c-985f-4673-833c-7fc07e4a18f1 | Metadata agent     | controller-0.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 774a22c2-c5f1-408d-8e96-7056a545d408 | L3 agent           | controller-2.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 8a1ee255-3a20-4f1a-855c-b98a6f5488c1 | Open vSwitch agent | controller-0.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| ef97f2e0-c55e-4401-bade-226fbe2ad8d3 | Metadata agent     | controller-2.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+

(overcloud) [stack@undercloud-0 ~]$ date
Wed Mar 7 06:33:00 EST 2018
(overcloud) [stack@undercloud-0 ~]$ date
Wed Mar 7 06:33:16 EST 2018
(overcloud) [stack@undercloud-0 ~]$ neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                     | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| 0de9d54e-ce27-4b90-a534-a19237b594ec | DHCP agent         | controller-0.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 2e2a40ba-ddc8-4601-804d-fd330e70e3d2 | Open vSwitch agent | compute-0.localdomain    |                   | :-)   | True           | neutron-openvswitch-agent |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | L3 agent           | controller-0.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | L3 agent           | controller-1.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5990e4e7-4761-4eae-9e8a-b24c259eea80 | Open vSwitch agent | controller-1.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 5af4ab0c-b5e3-4ddd-a8a1-77bd9f487755 | DHCP agent         | controller-2.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 69bde103-3f7d-4b91-ab24-e2cf5a509953 | DHCP agent         | controller-1.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 6a6b3908-a08a-420e-abc2-5313d9c8834c | Metadata agent     | controller-1.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 6b0e7dde-4a83-4faf-9026-d0dc0074d99d | Open vSwitch agent | controller-2.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 72cd089c-985f-4673-833c-7fc07e4a18f1 | Metadata agent     | controller-0.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 774a22c2-c5f1-408d-8e96-7056a545d408 | L3 agent           | controller-2.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 8a1ee255-3a20-4f1a-855c-b98a6f5488c1 | Open vSwitch agent | controller-0.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| ef97f2e0-c55e-4401-bade-226fbe2ad8d3 | Metadata agent     | controller-2.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+

(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 774a22c2-c5f1-408d-8e96-7056a545d408 | controller-2.localdomain | True           | :-)   | active   |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | controller-1.localdomain | True           | :-)   | active   |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | controller-0.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

(overcloud) [stack@undercloud-0 ~]$ date
Wed Mar 7 06:33:32 EST 2018
(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 774a22c2-c5f1-408d-8e96-7056a545d408 | controller-2.localdomain | True           | :-)   | active   |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | controller-1.localdomain | True           | :-)   | active   |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | controller-0.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

(overcloud) [stack@undercloud-0 ~]$ date
Wed Mar 7 06:36:35 EST 2018
(overcloud) [stack@undercloud-0 ~]$ neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                     | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+
| 0de9d54e-ce27-4b90-a534-a19237b594ec | DHCP agent         | controller-0.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 2e2a40ba-ddc8-4601-804d-fd330e70e3d2 | Open vSwitch agent | compute-0.localdomain    |                   | :-)   | True           | neutron-openvswitch-agent |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | L3 agent           | controller-0.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | L3 agent           | controller-1.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 5990e4e7-4761-4eae-9e8a-b24c259eea80 | Open vSwitch agent | controller-1.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 5af4ab0c-b5e3-4ddd-a8a1-77bd9f487755 | DHCP agent         | controller-2.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 69bde103-3f7d-4b91-ab24-e2cf5a509953 | DHCP agent         | controller-1.localdomain | nova              | :-)   | True           | neutron-dhcp-agent        |
| 6a6b3908-a08a-420e-abc2-5313d9c8834c | Metadata agent     | controller-1.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 6b0e7dde-4a83-4faf-9026-d0dc0074d99d | Open vSwitch agent | controller-2.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| 72cd089c-985f-4673-833c-7fc07e4a18f1 | Metadata agent     | controller-0.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
| 774a22c2-c5f1-408d-8e96-7056a545d408 | L3 agent           | controller-2.localdomain | nova              | :-)   | True           | neutron-l3-agent          |
| 8a1ee255-3a20-4f1a-855c-b98a6f5488c1 | Open vSwitch agent | controller-0.localdomain |                   | :-)   | True           | neutron-openvswitch-agent |
| ef97f2e0-c55e-4401-bade-226fbe2ad8d3 | Metadata agent     | controller-2.localdomain |                   | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+--------------------------+-------------------+-------+----------------+---------------------------+

(overcloud) [stack@undercloud-0 ~]$ date
Wed Mar 7 06:36:53 EST 2018
(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 774a22c2-c5f1-408d-8e96-7056a545d408 | controller-2.localdomain | True           | :-)   | active   |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | controller-1.localdomain | True           | :-)   | active   |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | controller-0.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

(overcloud) [stack@undercloud-0 ~]$ date
Wed Mar 7 06:39:33 EST 2018
(overcloud) [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 774a22c2-c5f1-408d-8e96-7056a545d408 | controller-2.localdomain | True           | :-)   | active   |
| 5528b623-c4f3-45c4-8f8a-ec95a70c7934 | controller-1.localdomain | True           | :-)   | active   |
| 413ddd48-ed78-4694-8c10-5a94fc1ca5bd | controller-0.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+
Can you please check the HA network port status also?
Marking it as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1527130.

What's happening is that you're restarting the L3 agent container of the active replica, which kills keepalived (this is the root cause of the bug, as detailed in 1527130). A backup then takes over as active, since it has stopped receiving VRRP hello messages from the node on which keepalived died, but the namespace and IPs still exist on the original node, so we end up with two actives. Fixing 1527130 will also fix this bug.

*** This bug has been marked as a duplicate of bug 1527130 ***
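The failure sequence described above can be illustrated with a small simulation. This is a deliberately simplified sketch, not keepalived or Neutron code: `Node`, `kill_keepalived`, and `elect` are hypothetical names, and the election simply promotes the first running backup instead of comparing VRRP priorities.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    keepalived_running: bool = True
    holds_vip: bool = False  # True while the router IPs are configured here


def kill_keepalived(node):
    """Kill keepalived abruptly: hello messages stop, but nothing removes
    the namespace or the IPs that were already configured on the node."""
    node.keepalived_running = False
    # node.holds_vip is intentionally left unchanged


def elect(nodes):
    """Promote a backup when no running keepalived instance is master.

    Returns the names of all nodes that currently hold the router IPs.
    """
    master_alive = any(n.keepalived_running and n.holds_vip for n in nodes)
    if not master_alive:
        for n in nodes:
            if n.keepalived_running:
                n.holds_vip = True  # the backup takes over the IPs
                break
    return [n.name for n in nodes if n.holds_vip]


if __name__ == "__main__":
    nodes = [Node("controller-1", holds_vip=True),
             Node("controller-2"),
             Node("controller-0")]
    print(elect(nodes))       # healthy cluster
    kill_keepalived(nodes[0])
    print(elect(nodes))       # after the restart on controller-1
```

The second call reports two holders because the old master was never told to release its addresses, which matches the two "active" rows seen in the tables of this report.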