Bug 1584779
| Summary: | [HA] Floating IP issues after introducing failures to Cluster nodes |
|---|---|
| Product: | Red Hat OpenStack |
| Component: | opendaylight |
| Status: | CLOSED ERRATA |
| Severity: | urgent |
| Priority: | high |
| Version: | 13.0 (Queens) |
| Target Milestone: | z3 |
| Target Release: | 13.0 (Queens) |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | HA |
| Keywords: | Triaged, ZStream |
| Reporter: | Tomas Jamrisko <tjamrisk> |
| Assignee: | Sridhar Gaddam <sgaddam> |
| QA Contact: | Tomas Jamrisko <tjamrisk> |
| CC: | aadam, asuryana, mkolesni, nyechiel, sgaddam, skitt, tjamrisk |
| Fixed In Version: | opendaylight-8.3.0-4.el7ost |
| Doc Type: | Bug Fix |
| Type: | Bug |
| Last Closed: | 2018-11-13 23:32:54 UTC |

Doc Text:
Cause: Null Pointer Exceptions (NPEs) were seen in NetVirt when some of the controller nodes were brought down.
Consequence: The NPEs caused some missing flows and stale group entries.
Fix: The NPEs are now fixed and the OVS pipeline is programmed accordingly for the FloatingIP use case.
Result: The FIP use case continues to work even when disruptive tests are performed on the controller/compute nodes.
Description
Tomas Jamrisko
2018-05-31 16:00:41 UTC
Created attachment 1446321 [details]
controller-0 logs
Created attachment 1446322 [details]
controller-2 logs
Created attachment 1446323 [details]
compute-1-ovslogs
Created attachment 1446324 [details]
compute-0-ovs logs
Created attachment 1446325 [details]
Controller-1 logs
Created attachment 1446326 [details]
controller-1 neutron server logs
This issue happens when the shard leader changes. Once failures are introduced to the nodes, the shard leader changes and the packet punted to the controller by OVS fails to reach the controller (NetVirt, to be precise). For FIP traffic from the undercloud, the return-path packet is sent to the controller to learn the undercloud MAC, and this packet is not reaching the controller. When we faced this issue earlier, we observed that if we introduced failures again so that the initial shard leader was re-elected, the FIP started working again.

(In reply to Aswin Suryanarayanan from comment #7)
> This issue happens when the shard leader changes. Once failures are
> introduced, the shard leader changes and the packet punted to the controller
> by OVS fails to reach the controller (NetVirt, to be precise). For FIP
> traffic from the undercloud, the return-path packet is sent to the controller
> to learn the undercloud MAC, and this packet is not reaching the controller.

When you say the packet is not reaching the controller, do you know where it is disappearing? Is it not reaching the VM/container hosting ODL? Or is it reaching the VM but not being processed by the controller?

(In reply to Stephen Kitt from comment #8)
> When you say the packet is not reaching the controller, do you know where
> it is disappearing? Is it not reaching the VM/container hosting ODL? Or is
> it reaching the VM but not being processed by the controller?

I am not sure where exactly it is dropped; it did not reach NetVirt. Tomas, do you have any info on this?

I don't know much more. I can try to get a deployment and reproduce the issue, but I'm not sure what to look for. Would you be willing to take a look at the broken deployment?

(In reply to Aswin Suryanarayanan from comment #10)
> I am not sure where exactly it is dropped; it did not reach NetVirt.
> Tomas, do you have any info on this?

(In reply to Tomas Jamrisko from comment #11)
> I don't know much more. I can try to get a deployment and reproduce the
> issue, but I'm not sure what to look for. Would you be willing to take a
> look at the broken deployment?

Tomas, please reproduce this issue and provide us access to the environment. We will debug it. Thanks.
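One way to narrow down where the punted packet gets lost (the question in comment #8) is to watch the OVS-to-controller OpenFlow session itself. The commands below are only a sketch, assuming the default OpenFlow port 6653 and that the ODL instances run on the controller nodes:

# On the compute node hosting the VM: check which controllers br-int is connected to
sudo ovs-vsctl get-controller br-int
sudo ovs-vsctl --columns=target,is_connected list controller

# On the controller node running ODL: confirm OpenFlow traffic (including
# packet-ins) is still arriving from the compute node on the OpenFlow session
sudo tcpdump -i any -nn 'tcp port 6653'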
(In reply to Aswin Suryanarayanan from comment #10)
> I am not sure where exactly it is dropped; it did not reach NetVirt.
> Tomas, do you have any info on this?

I had a look at the setup along with @Aswin and we found the following issues.
Setup: 2 Computes and 3 Controllers
The setup had three VMs where FIPs were not working.
NAPT Switch is located on Controller-0
Tenant network (net1) is a VxLAN network and the public network is a FLAT network.
vm5: MAC fa:16:3e:fa:b3:fc, net1=192.168.99.5, FIP 10.0.0.216, on Compute-1
vm6: MAC fa:16:3e:8b:4a:3b, net1=192.168.99.6, FIP 10.0.0.225, on Compute-0
vm9: MAC fa:16:3e:00:90:52, net1=192.168.99.9, FIP 10.0.0.213, on Compute-1
While trying to ping a FIP from the undercloud (the same issue is seen when the VM tries to ping the DC-GW/undercloud), the packet is dropped for one of the following reasons.
Issue-1: Stale Group entry seen for vm5 and vm6
===============================================
sudo ovs-appctl ofproto/trace br-int 'in_port=1,dl_src=52:54:00:d6:b0:82,dl_dst=fa:16:3e:3f:e5:1b,dl_type=0x0800,nw_src=10.0.0.1,nw_dst=10.0.0.216,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0'
Flow: icmp,in_port=1,vlan_tci=0x0000,dl_src=52:54:00:d6:b0:82,dl_dst=fa:16:3e:3f:e5:1b,nw_src=10.0.0.1,nw_dst=10.0.0.216,nw_tos=0,nw_ecn=0,nw_ttl=128,icmp_type=8,icmp_code=0
----------------
0. in_port=1,vlan_tci=0x0000/0x1fff, priority 4, cookie 0x8000000
write_metadata:0x180000000001/0xffffff0000000001
goto_table:17
17. metadata=0x180000000000/0xffffff0000000000, priority 10, cookie 0x8000001
load:0x19e10->NXM_NX_REG3[0..24]
write_metadata:0x9000180000033c20/0xfffffffffffffffe
goto_table:19
19. metadata=0x33c20/0xfffffe,dl_dst=fa:16:3e:3f:e5:1b, priority 20, cookie 0x8000009
write_metadata:0x33c22/0xfffffe
goto_table:21
21. ip,metadata=0x33c22/0xfffffe,nw_dst=10.0.0.216, priority 42, cookie 0x8000003
set_field:fa:16:3e:3f:e5:1b->eth_dst
goto_table:25
25. ip,dl_dst=fa:16:3e:3f:e5:1b,nw_dst=10.0.0.216, priority 10, cookie 0x8000004
set_field:192.168.99.5->ip_dst
write_metadata:0x33c26/0xfffffe
goto_table:27
27. ip,metadata=0x33c26/0xfffffe,nw_dst=192.168.99.5, priority 10, cookie 0x8000004
resubmit(,21)
21. ip,metadata=0x33c26/0xfffffe,nw_dst=192.168.99.5, priority 42, cookie 0x8000003
group:155003
set_field:fa:16:3e:ee:48:f9->eth_src
set_field:fa:16:3e:8c:0c:5c->eth_dst ====> The MAC address does not belong to vm5
load:0x1d00->NXM_NX_REG6[]
resubmit(,220)
220. No match.
drop
When we looked at the config store, we could see an entry in the groupTable with the wrong MAC address.
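A quick way to confirm the stale bucket on the compute node is to dump the group that the Table 21 flow points to and compare its eth_dst with the MAC that Neutron reports for vm5's port. This is only a sketch; the group ID 155003 and the VM name are taken from the trace above:

# On Compute-1: show the group referenced by the FIB flow
sudo ovs-ofctl -O OpenFlow13 dump-groups br-int | grep 'group_id=155003'

# From a node with overcloud credentials: the MAC Neutron has for vm5's port
openstack port list --server vm5 -c "MAC Address" -c "Fixed IP Addresses"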
Issue-2: missing FIB entry seen for vm9
========================================
[heat-admin@compute-1 SampleScripts]$ sudo ovs-appctl ofproto/trace br-int 'in_port=1,dl_src=52:54:00:d6:b0:82,dl_dst=fa:16:3e:ee:f3:07,dl_type=0x0800,nw_src=10.0.0.1,nw_dst=10.0.0.213,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0'
0. in_port=1,vlan_tci=0x0000/0x1fff, priority 4, cookie 0x8000000
write_metadata:0x180000000001/0xffffff0000000001
goto_table:17
17. metadata=0x180000000000/0xffffff0000000000, priority 10, cookie 0x8000001
load:0x19e10->NXM_NX_REG3[0..24]
write_metadata:0x9000180000033c20/0xfffffffffffffffe
goto_table:19
19. metadata=0x33c20/0xfffffe,dl_dst=fa:16:3e:ee:f3:07, priority 20, cookie 0x8000009
write_metadata:0x33c22/0xfffffe
goto_table:21
21. ip,metadata=0x33c22/0xfffffe,nw_dst=10.0.0.213, priority 42, cookie 0x8000003
set_field:fa:16:3e:ee:f3:07->eth_dst
goto_table:25
25. ip,dl_dst=fa:16:3e:ee:f3:07,nw_dst=10.0.0.213, priority 10, cookie 0x8000004
set_field:192.168.99.9->ip_dst
write_metadata:0x33c26/0xfffffe
goto_table:27
27. ip,metadata=0x33c26/0xfffffe,nw_dst=192.168.99.9, priority 10, cookie 0x8000004
resubmit(,21)
21. ip,metadata=0x33c26/0xfffffe,nw_dst=192.168.99.0/24, priority 34, cookie 0x8000003
write_metadata:0x157e033c26/0xfffffffffe
goto_table:22
22. priority 0, cookie 0x8000004
CONTROLLER:65535
The issue here is a missing FIB entry, and the FIB entry is missing because of a missing group entry.
The config store does have an entry for 192.168.99.9 (FIP: 10.0.0.213) in Table 21, but the action in that flow sends the packet to a specific group ID which is missing on the switch (and also in the config store). When the request to program the flow is made, the OVS switch (on Compute-1) seems to reject it because there is no corresponding group entry.
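To check whether a group is missing only on the switch or also on the ODL side, the programmed groups can be compared against the openflowplugin config datastore. This is a sketch only: the RESTCONF port and admin:admin credentials are Karaf defaults and may differ in this deployment, and <dpid>/<group-id> are placeholders for br-int's datapath ID and the group referenced by the Table 21 flow.

# On Compute-1: every group actually installed in OVS
sudo ovs-ofctl -O OpenFlow13 dump-groups br-int

# Against ODL: the same group as stored in the config datastore
curl -s -u admin:admin \
  "http://<odl-ip>:8181/restconf/config/opendaylight-inventory:nodes/node/openflow:<dpid>/flow-node-inventory:group/<group-id>"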
Some errors:
============
2018-06-06T08:58:43,799 | ERROR | ForkJoinPool-1-worker-1 | ElanForwardingEntriesHandler | 347 - org.opendaylight.netvirt.elanmanager-impl - 0.6.0.redhat-10 | Static MAC address PhysAddress [_value=fa:16:3e:8c:0c:5c] has already been added for the same ElanInstance bbd99f4b-6ef8-48e2-9e3c-a580fba3eff2 on the same Logical Interface Port 7f06eb09-66d5-4bd3-97b7-d4a8d8ac7ac6. No operation will be done.
2018-06-06T13:33:09.332Z|00558|rconn|INFO|br-int<->tcp:172.17.1.15:6653: connected
2018-06-06T13:33:09.834Z|00559|connmgr|INFO|br-int<->tcp:172.17.1.19:6653: sending OFPGMFC_GROUP_EXISTS error reply to OFPT_GROUP_MOD message
2018-06-06T13:33:10.688Z|00560|ofp_actions|WARN|bad action at offset 0 (OFPBMC_BAD_FIELD):
00000000 00 19 00 08 00 00 00 00-ff ff 00 18 00 00 23 20
00000010 00 07 00 1f 00 01 0c 04-00 00 00 00 00 00 01 00
00000020 ff ff 00 10 00 00 23 20-00 0e ff f8 dc 00 00 00
2018-06-06T13:33:10.688Z|00561|connmgr|INFO|br-int<->tcp:172.17.1.19:6653: sending OFPBMC_BAD_FIELD error reply to OFPT_FLOW_MOD message
2018-06-06T13:33:10.688Z|00562|connmgr|INFO|br-int<->tcp:172.17.1.19:6653: sending OFPBAC_BAD_OUT_GROUP error reply to OFPT_FLOW_MOD message
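The OVS side of these rejections can be pulled straight out of the vswitchd log; a small sketch, assuming the default log location:

# On the affected compute/controller node: list group-mod and flow-mod rejections
sudo grep -E 'OFPGMFC_GROUP_EXISTS|OFPBAC_BAD_OUT_GROUP|OFPBMC_BAD_FIELD' \
    /var/log/openvswitch/ovs-vswitchd.log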
Observations/Next Steps:
========================
While searching for patterns (IP addresses, MAC addresses, etc.), we saw a couple of logs where VM IP addresses were re-used (i.e., a VM was deleted and a newly spawned VM got the same IP), the switch lost connections to the controllers, and there were errors with group entries in the OVS switch.
It looks like this is a different issue from the one that @Aswin mentioned in comment #7. @Tomas had to try various combinations (rebooting controller nodes and compute nodes, spawning VMs while the controller/shard leader was down, disassociating/re-associating FIPs, deleting VMs, spawning new VMs, etc.) before the FIPs started to fail. It would help to find an easy way to reproduce this issue so that we can enable the necessary logging and capture more details.
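One option for capturing more detail on the next reproduction attempt is to raise the NetVirt NAT/FIB log levels in the ODL Karaf shell. This is a sketch only; the container name and client path are assumptions for a containerized OSP 13 deployment, and the exact logger packages may need adjusting:

# On the controller node: enter the ODL Karaf shell (container name assumed)
sudo docker exec -it opendaylight_api /opt/opendaylight/bin/client

# Inside the Karaf shell: raise logging for the NAT and FIB services, then tail
log:set DEBUG org.opendaylight.netvirt.natservice
log:set DEBUG org.opendaylight.netvirt.fibmanager
log:tail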
Looks like I can reproduce one of the FIP issues by reusing a previously created FIP:
Start a VM, add a FIP, kill the compute node on which the VM is running, delete the VM, create a new VM, attach the FIP.

Tomas, very nice, finally it can be reproduced :-).

(In reply to Tomas Jamrisko from comment #17)
> Looks like I can reproduce one of the FIP issues by reusing a previously
> created FIP.
>
> Start a VM, add a FIP, kill the compute node on which the VM is running,
> delete the VM, create a new VM, attach the FIP.

I had a look at the setup and the reason for the failure is a "stale group entry". When we try to ping the FIP from the undercloud, the packet successfully gets DNAT'ed (0->17->19->21->25->27->21->GroupEntry->Table 220/drop), but when the FIB table (21) sends it to the group entry (which is stale), it is rewritten with wrong values and is therefore dropped in Table 220. The stale group entry is shown below.

group_id=152503,type=all,bucket=actions=set_field:fa:16:3e:8f:a3:48->eth_src,set_field:fa:16:3e:0e:e2:7e->eth_dst,load:0x1d00->NXM_NX_REG6[],resubmit(,220)

i.e.,
group:152503
set_field:fa:16:3e:8f:a3:48->eth_src ----> This is correct and points to the neutron router interface MAC.
set_field:fa:16:3e:0e:e2:7e->eth_dst ----> This is INCORRECT and points to the MAC address of a deleted VM.
load:0x1d00->NXM_NX_REG6[]
resubmit(,220)

Ideally, group 152503 should be deleted when the corresponding VM is deleted, but since the compute node hosting the VM was down when nova delete was invoked for the VM, the entry may have remained stale (possibly due to a race condition). Another observation is that the tenant IP address allocated to the VM was recycled and the same IP address was allocated to a new VM. Since the group entry was not deleted and remained stale on the compute node, the FIP use case is failing. I had a look at the config datastore and could see that the group entry in the config datastore matched the (stale) entry on the compute node.

Further analysis showed two main NullPointerExceptions in the Karaf console.
1. NPE at updateVpnInterfacesForUnProcessAdjancencies: proposed the following patch, which addresses this exception: https://git.opendaylight.org/gerrit/#/c/73491/
2. NPE with "Unable to handle the TEP event": this issue was already fixed/merged upstream a few days back as part of https://jira.opendaylight.org/browse/NETVIRT-1133

I tried to reproduce the issue locally 1) by re-using the same IP address and 2) by resetting the compute node hosting the VM, but the issue was not reproduced.

Next Steps: @Tomas, can you please retry the same use case with an image that includes the fix I proposed here - https://git.opendaylight.org/gerrit/#/c/73491/ - and let us know the results.

Tomas, could we try and run an automation on this bug? We'd like to see a long stable run.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3614