When deleting 1 VM from Horizon everything is fine; however, when selecting more than 2 VMs to be deleted, the connection to the VMs hosted on the same qrouter is interrupted for ~4s.

[heat-admin@overcloud-controller-1 ~]$ nova list
+--------------------------------------+----------+--------+------------+-------------+-------------------------------------+
| ID                                   | Name     | Status | Task State | Power State | Networks                            |
+--------------------------------------+----------+--------+------------+-------------+-------------------------------------+
| 5d9a5238-1872-4b53-8be3-9bdce2563e6e | test1-vm | ACTIVE | -          | Running     | internal=192.168.3.103, 192.0.2.107 |
| 335da0aa-da31-48be-9d66-70c89cc98a04 | test2-vm | ACTIVE | -          | Running     | internal=192.168.3.104, 192.0.2.106 |
| 24e7c7ee-5fe7-4f2b-8076-b63bd5a5d0ce | test3-vm | ACTIVE | -          | Running     | internal=192.168.3.105, 192.0.2.105 |
| 4bf123e9-fef7-4d5f-9666-0a0dcd228e50 | test4-vm | ACTIVE | -          | Running     | internal=192.168.3.106, 192.0.2.104 |
+--------------------------------------+----------+--------+------------+-------------+-------------------------------------+

[heat-admin@overcloud-controller-1 ~]$ neutron router-list
| id | name | external_gateway_info | distributed | ha |
| 2e468f1d-4405-41cb-806b-11c172ef256d | router1 | {"network_id": "7176f07c-47ce-4687-9d77-0afa5fad74ff", "enable_snat": true, "external_fixed_ips": [{"subnet_id": "cd07bc48-dfe7-473c-a740-5cc3767b987f", "ip_address": "192.0.2.103"}]} | False | True |

[heat-admin@overcloud-controller-1 ~]$ neutron l3-agent-list-hosting-router router1
+--------------------------------------+------------------------------------+----------------+-------+----------+
| id                                   | host                               | admin_state_up | alive | ha_state |
+--------------------------------------+------------------------------------+----------------+-------+----------+
| b4b09c17-942c-41d1-baf3-af1e26ae0a6b | overcloud-controller-1.localdomain | True           | :-)   | standby  |
| eb3d2d90-6658-4564-9f5d-7cc958e71d96 | overcloud-controller-0.localdomain | True           | :-)   | standby  |
| eadf9303-7273-4336-ab1e-2b81e916a631 | overcloud-controller-2.localdomain | True           | :-)   | active   |
+--------------------------------------+------------------------------------+----------------+-------+----------+

[root@overcloud-controller-2 ~]# ip netns exec qrouter-2e468f1d-4405-41cb-806b-11c172ef256d ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
17: ha-5b2470f2-f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:e7:d1:77 brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.1/18 brd 169.254.255.255 scope global ha-5b2470f2-f2
       valid_lft forever preferred_lft forever
    inet 169.254.0.1/24 scope global ha-5b2470f2-f2
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fee7:d177/64 scope link
       valid_lft forever preferred_lft forever
18: qg-eb594fa1-23: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:50:a1:62 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.103/24 scope global qg-eb594fa1-23
       valid_lft forever preferred_lft forever
    inet 192.0.2.104/32 scope global qg-eb594fa1-23
       valid_lft forever preferred_lft forever
    inet 192.0.2.105/32 scope global qg-eb594fa1-23
       valid_lft forever preferred_lft forever
    inet 192.0.2.106/32 scope global qg-eb594fa1-23
       valid_lft forever preferred_lft forever
    inet 192.0.2.107/32 scope global qg-eb594fa1-23
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe50:a162/64 scope link nodad
       valid_lft forever preferred_lft forever
20: qr-21e49ce2-81: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:ae:a6:02 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.1/24 scope global qr-21e49ce2-81
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feae:a602/64 scope link nodad
       valid_lft forever preferred_lft forever

Monitoring in the namespace:

[heat-admin@overcloud-controller-2 ~]$ sudo ip netns exec qrouter-2e468f1d-4405-41cb-806b-11c172ef256d ip -o monitor address
Deleted 18: qg-eb594fa1-23    inet 192.0.2.107/32 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
Deleted 18: qg-eb594fa1-23    inet 192.0.2.106/32 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
Deleted 17: ha-5b2470f2-f2    inet 169.254.0.1/24 scope global ha-5b2470f2-f2\       valid_lft forever preferred_lft forever
Deleted 18: qg-eb594fa1-23    inet 192.0.2.103/24 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
Deleted 18: qg-eb594fa1-23    inet 192.0.2.104/32 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
Deleted 18: qg-eb594fa1-23    inet 192.0.2.105/32 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
Deleted 20: qr-21e49ce2-81    inet 192.168.3.1/24 scope global qr-21e49ce2-81\       valid_lft forever preferred_lft forever
Deleted 18: qg-eb594fa1-23    inet6 fe80::f816:3eff:fe50:a162/64 scope link nodad \       valid_lft forever preferred_lft forever
Deleted 20: qr-21e49ce2-81    inet6 fe80::f816:3eff:feae:a602/64 scope link nodad \       valid_lft forever preferred_lft forever
17: ha-5b2470f2-f2    inet 169.254.0.1/24 scope global ha-5b2470f2-f2\       valid_lft forever preferred_lft forever
18: qg-eb594fa1-23    inet 192.0.2.103/24 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
18: qg-eb594fa1-23    inet 192.0.2.104/32 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
18: qg-eb594fa1-23    inet 192.0.2.105/32 scope global qg-eb594fa1-23\       valid_lft forever preferred_lft forever
20: qr-21e49ce2-81    inet 192.168.3.1/24 scope global qr-21e49ce2-81\       valid_lft forever preferred_lft forever
18: qg-eb594fa1-23    inet6 fe80::f816:3eff:fe50:a162/64 scope link nodad \       valid_lft forever preferred_lft forever
20: qr-21e49ce2-81    inet6 fe80::f816:3eff:feae:a602/64 scope link nodad \       valid_lft forever preferred_lft forever

Initiated ping to test4-vm:

[root@overcloud-controller-2 ~]# ping 192.0.2.104
PING 192.0.2.104 (192.0.2.104) 56(84) bytes of data.
64 bytes from 192.0.2.104: icmp_seq=1 ttl=64 time=2.83 ms
64 bytes from 192.0.2.104: icmp_seq=2 ttl=64 time=0.893 ms
64 bytes from 192.0.2.104: icmp_seq=3 ttl=64 time=0.499 ms
64 bytes from 192.0.2.104: icmp_seq=4 ttl=64 time=0.640 ms
64 bytes from 192.0.2.104: icmp_seq=5 ttl=64 time=0.644 ms
...
64 bytes from 192.0.2.104: icmp_seq=116 ttl=64 time=0.642 ms
64 bytes from 192.0.2.104: icmp_seq=117 ttl=64 time=0.922 ms
64 bytes from 192.0.2.104: icmp_seq=118 ttl=64 time=12.7 ms
64 bytes from 192.0.2.104: icmp_seq=119 ttl=64 time=0.515 ms
64 bytes from 192.0.2.104: icmp_seq=120 ttl=64 time=1.69 ms
ping: sendmsg: Network is unreachable   <=
ping: sendmsg: Network is unreachable   <= This is when the other 2 VMs are deleted
ping: sendmsg: Network is unreachable   <=
ping: sendmsg: Network is unreachable   <=
64 bytes from 192.0.2.104: icmp_seq=125 ttl=64 time=2.36 ms
64 bytes from 192.0.2.104: icmp_seq=126 ttl=64 time=0.746 ms
64 bytes from 192.0.2.104: icmp_seq=127 ttl=64 time=1.11 ms
64 bytes from 192.0.2.104: icmp_seq=128 ttl=64 time=0.646 ms
64 bytes from 192.0.2.104: icmp_seq=129 ttl=64 time=0.661 ms
Also, the environment setup is:

[stack@undercloud ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 2d9c97b8-dc5b-4c97-bb0f-50427c91ab90 | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.7  |
| 2a0f9a9d-983e-47b4-9127-b6152996b71e | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
| 60937186-a959-427f-b9eb-d7abcfca54c7 | overcloud-controller-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.8  |
| d2a6a304-bc20-4a44-b698-9bcea58172a7 | overcloud-controller-2 | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
The router is set up as HA; there is no DVR here.
The VMs were deleted through Horizon (so concurrent deletion is a safe bet). Can we get the configuration files and the neutron logs for all neutron processes on all nodes, please? :)
As an additional note, this also occurs if you delete VMs with floating IPs sequentially within a short period via the API.
Observations:

1. For some reason, when this occurs, the ha- interface loses its 169.254.0.1/24 address for 4-5 seconds and then gets it back. Neither the l3-agent logs nor the openvswitch logs indicate that they performed the actual deletion, so it is probably hidden somewhere in the code.
2. An easier reproduction is to simply disassociate the floating IPs from the instances; there is no need to delete the entire instance (a reproduction sketch follows this list).
3. This seems to occur only when disassociating floating IPs from 2 or more instances - one is not enough. Also, when putting some kind of sleep between the operations, the problem does not reproduce. This hints at a race condition.
4. The state-change log indicates this also occurs when associating floating IPs, not only on disassociation.
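The reproduction described in observations 2 and 3 can be scripted. A minimal sketch, assuming two placeholder floating IP UUIDs and an authenticated neutron CLI on the node running it; this is an illustration, not the exact steps used here:

# Hypothetical reproducer: disassociate two floating IPs belonging to VMs
# behind the same HA router back to back, so the L3 agent rewrites
# keepalived.conf and signals keepalived twice in quick succession.
import subprocess

FIP_IDS = ["<fip-uuid-1>", "<fip-uuid-2>"]  # placeholders for real floating IP IDs

# Launch both disassociations without waiting in between.
procs = [subprocess.Popen(["neutron", "floatingip-disassociate", fip_id])
         for fip_id in FIP_IDS]
for proc in procs:
    proc.wait()

Adding even a few seconds of sleep between the two calls should make the problem disappear, matching observation 3.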
The cause for keepalived switching into backup (failing over) is that the specified actions (association and disassociation of floating IPs) change the configuration of keepalived and, in addition, send SIGHUP to it. If 2 or more SIGHUPs are sent to keepalived in sequence, it goes into BACKUP (removing the IP addresses from the router's interfaces, which causes the connectivity loss) and restarts VRRP negotiation. An approach we were considering on Neutron's side is throttling SIGHUP so that only 1 is sent every X seconds. Before we start working on this, it would be good to hear from Ryan whether there is any way to mitigate this on keepalived's side.
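A minimal sketch of the throttling idea, assuming we can hook every keepalived.conf rewrite and that the keepalived PID is known; the class name, the window length, and the structure are illustrative, not the actual Neutron change:

# Coalesce reload requests so keepalived receives at most one SIGHUP per
# THROTTLE_SECONDS window; all configuration changes accumulated during the
# window are picked up by a single deferred reload.
import os
import signal
import threading
import time

THROTTLE_SECONDS = 5  # illustrative value for "X seconds"


class ThrottledKeepalivedReloader(object):
    def __init__(self, keepalived_pid):
        self.pid = keepalived_pid
        self._lock = threading.Lock()
        self._last_reload = 0.0
        self._pending = False

    def request_reload(self):
        """Call this after every keepalived.conf rewrite."""
        with self._lock:
            now = time.time()
            if now - self._last_reload >= THROTTLE_SECONDS:
                self._do_reload(now)
            elif not self._pending:
                # Defer a single reload; further requests inside the window
                # are absorbed by the one already scheduled.
                self._pending = True
                delay = THROTTLE_SECONDS - (now - self._last_reload)
                threading.Timer(delay, self._deferred_reload).start()

    def _deferred_reload(self):
        with self._lock:
            self._pending = False
            self._do_reload(time.time())

    def _do_reload(self, now):
        os.kill(self.pid, signal.SIGHUP)
        self._last_reload = now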
Also, it's important to note that manually sending SIGHUP to a keepalived process twice in a row also triggers this transition to the BACKUP state.
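That manual reproduction can be done entirely outside of Neutron; a sketch, with the PID as a placeholder (in practice it would be read from the router's keepalived pid file):

# Send two SIGHUPs back to back to the master keepalived process; on affected
# versions the instance drops to BACKUP, removes the VIPs from the qg-/qr-
# interfaces and restarts VRRP election.
import os
import signal

keepalived_pid = 12345  # placeholder for the real keepalived PID

os.kill(keepalived_pid, signal.SIGHUP)
os.kill(keepalived_pid, signal.SIGHUP)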
(In reply to John Schwarz from comment #10)

> The cause for keepalived switching into backup (failing over) is that the
> specified actions (association and disassociation of floating IPs) change
> the configuration of keepalived and, in addition, send SIGHUP to it. If 2
> or more SIGHUPs are sent to keepalived in sequence, it goes into BACKUP
> (removing the IP addresses from the router's interfaces, which causes the
> connectivity loss) and restarts VRRP negotiation.

What is changing in keepalived.conf when you do this?

> An approach we were considering on Neutron's side is throttling SIGHUP so
> that only 1 is sent every X seconds. Before we start working on this, it
> would be good to hear from Ryan whether there is any way to mitigate this
> on keepalived's side.

First, I don't think throttling the frequency of SIGHUP is going to help in the long run. If keepalived is being signalled with SIGHUP multiple times in quick succession, the service is not going to be sending VRRP advertisements (since it is restarting), which means another node will take over.

Related question: does keepalived.conf have the 'nopreempt' keyword declared?
After much discussion and quite a bit of investigation, here is my assessment of how keepalived is behaving.

First, some background. The keepalived master node periodically sends VRRP advertisements, at the interval configured by 'advert_int'. When a backup node does not receive a VRRP advertisement within this interval (plus some skew time), a failover occurs.

The other important detail is that the master keepalived node will not send VRRP advertisements while it is in a signal handler. When a keepalived node in the master state receives a SIGHUP, it blocks other signals while it processes the SIGHUP. While the SIGHUP signal handler is being executed, the VRRP advertisements stop. If just one signal is handled, chances are good that the node will complete the signal handler and resume sending VRRP advertisements before the backup nodes begin a new election. Conversely, if the signal handler takes too long, the backup node(s) will not receive the advertisement in time and will force a new master election.

If multiple signals are received in quick succession, they are effectively queued and the signal handler is executed once per signal, serially. This causes longer periods during which the master is not sending advertisements, since it is busy (overwhelmed) handling signals. Note that short advertisement intervals and/or multiple SIGHUPs will increase the likelihood of triggering a failover in this manner.
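To make the timing concrete, here is a rough illustration using the standard VRRP timer formulas (RFC 3768) that keepalived follows; the advert_int and priority values below are assumptions, not taken from this deployment's keepalived.conf:

# Backup nodes declare the master dead after master_down_interval without
# receiving an advertisement; if queued SIGHUP handlers keep the master busy
# for longer than this, a new election starts.
advert_int = 2.0        # seconds between advertisements (assumed value)
backup_priority = 50    # VRRP priority of a backup node (assumed value)

skew_time = (256 - backup_priority) / 256.0
master_down_interval = 3 * advert_int + skew_time

print("skew time:            %.2f s" % skew_time)
print("master down interval: %.2f s" % master_down_interval)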
I've dug into this again. I noticed that in keepalived v1.2.20 this no longer happens (it does in all versions up until then). The patch which (I believe) prevents this behaviour is https://github.com/acassen/keepalived/commit/6b20916a. It's important to note that I compiled and manually installed keepalived from source to check this, as el7 only ships 1.2.13, so the 'fix' (if it can be called that) is not available in RHEL 7 at all (yet?). Assaf, how do you want to proceed, given this new information?
After looking more closely at the patch referenced in comment #14, I fail to see how it will address the issue. The patch does two things:

1. If the priority is set to 255 and nopreempt is *not* set, the internal state variable is tweaked. My understanding is that the Neutron L3 HA agent does use 'nopreempt' and the priority is not set to 255. Please correct me if I am wrong.
2. If nopreempt is set and the default state is set to MASTER, the code prints a warning. I believe you have the state set to BACKUP, so you will not see this warning.

I'm questioning whether this patch will in fact fix the problem you're seeing.
Verified in openstack-neutron-7.2.0-10.el7ost.noarch.

Created 5 VMs with floating IPs; pinging one of them while removing all the other VMs worked without interruption.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1540