Bug 1773605

Summary: OCP IPv6 - Packets dropped on upcall from kernel after SNAT - Failed to acquire udpif_key corresponding to unexpected flow
Product: Red Hat Enterprise Linux Fast Datapath
Component: ovn2.11
Version: RHEL 7.7
Hardware: Unspecified
OS: Unspecified
Status: CLOSED UPSTREAM
Severity: urgent
Priority: unspecified
Target Milestone: ---
Target Release: ---
Reporter: Mark McLoughlin <markmc>
Assignee: OVN Team <ovnteam>
QA Contact: Jianlin Shi <jishi>
CC: ctrautma, fleitner, jhsiao, mmichels, ovs-qe, ovs-team, qding, ralongi
Clone Of: 1773598
Bug Blocks: 1775160, 1776969, 1776973, 1776994
Last Closed: 2025-02-10 04:00:01 UTC

Description Mark McLoughlin 2019-11-18 14:45:45 UTC
Description of problem:

With testing builds of OCP 4.3 with various changes to enable OVN and IPv6 support, we are seeing this issue.


Version-Release number of selected component (if applicable):

openvswitch2.12-2.12.0-4.el7fdp.x86_64.rpm


Additional info:

From ovs-daemons pod:

2019-11-18T10:42:17.973Z|00001|ofproto_dpif_upcall(handler6)|INFO|received packet on unassociated datapath port 4294967295
2019-11-18T10:42:18.221Z|00001|ofproto_dpif_upcall(revalidator8)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:9ba1081f-a692-4c1c-a79b-d1cf04175f7d

Comment 3 Numan Siddique 2019-11-19 06:52:35 UTC
This could very well be an ovn issue. I will take this BZ and investigate.

Comment 4 Mark McLoughlin 2019-11-19 13:18:45 UTC
Patch posted by Numan: https://patchwork.ozlabs.org/patch/1197423/

Comment 6 Numan Siddique 2019-11-21 17:53:10 UTC
The fix is available in ovn2.11-2.11.1-20

Comment 7 Jianlin Shi 2019-11-22 09:06:27 UTC
Reproduced with the following steps on openvswitch2.12-2.12.0-4.el7fdp.x86_64 and ovn2.12-2.12.0-14.el7fdp.x86_64:

server:
#!/bin/bash                
                                                   
systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641                                        
ovn-sbctl set-connection ptcp:6642                                        
                                                                             
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.29.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.29.25
systemctl restart ovn-controller     
                                                     
ip netns add server0                                 
ip link add veth0_s0 type veth peer name veth0_s0_p
ip link set veth0_s0 netns server0
ip netns exec server0 ip link set lo up
ip netns exec server0 ip link set veth0_s0 up        
ip netns exec server0 ip link set veth0_s0 address 00:00:00:01:01:02
ip netns exec server0 ip addr add 172.16.2.1/24 dev veth0_s0
ip netns exec server0 ip addr add 2001::1/64 dev veth0_s0             
ip netns exec server0 ip route add default via 172.16.2.254 dev veth0_s0
ip netns exec server0 ip -6 route add default via 2001::a dev veth0_s0
                    
ovs-vsctl add-port br-int veth0_s0_p
ip link set veth0_s0_p up                            
ovs-vsctl set interface veth0_s0_p external_ids:iface-id=ls2p1
                                                     
ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1  
ovn-nbctl lsp-set-addresses ls1p1 00:00:00:01:00:02
                                       
ovn-nbctl ls-add ls2                                  
ovn-nbctl lsp-add ls2 ls2p1
ovn-nbctl lsp-set-addresses ls2p1 00:00:00:01:01:02 

ovn-nbctl lr-add lr1   
ovn-nbctl lrp-add lr1 lr1-ls1 00:de:ad:ff:01:01 172.16.1.254/24 2000::a/64
ovn-nbctl lrp-add lr1 lr1-ls2 00:de:ad:ff:01:02 172.16.2.254/24 2001::a/64
ovn-nbctl lrp-add lr1 lr1-ls0 00:de:ad:ff:01:03 192.168.111.254/24 3000::a/64
ovn-nbctl lsp-add ls1 ls1-lr1 
ovn-nbctl lsp-set-type ls1-lr1 router              
ovn-nbctl lsp-set-addresses ls1-lr1 00:de:ad:ff:01:01
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1
                                                               
ovn-nbctl lsp-add ls2 ls2-lr1                            
ovn-nbctl lsp-set-type ls2-lr1 router                         
ovn-nbctl lsp-set-addresses ls2-lr1 00:de:ad:ff:01:02    
ovn-nbctl lsp-set-options ls2-lr1 router-port=lr1-ls2
                                    
ovn-nbctl set Logical_Router_Port lr1-ls2 options:redirect-chassis=hv1
# for public
ovn-nbctl ls-add ls0
ovn-nbctl lsp-add ls0 ls0-lr1
ovn-nbctl lsp-set-addresses ls0-lr1 00:de:ad:ff:01:03
ovn-nbctl lsp-set-type ls0-lr1 router
ovn-nbctl lsp-set-options ls0-lr1 router-port=lr1-ls0

ovn-nbctl lsp-add ls0 ln_port
ovn-nbctl lsp-set-addresses ln_port unknown
ovn-nbctl lsp-set-type ln_port localnet
ovn-nbctl lsp-set-options ln_port network_name=nattest

ovn-nbctl set logical_router lr1 options:chassis=hv1

ovs-vsctl add-br br-nat
ovs-vsctl set open . external_ids:ovn-bridge-mappings=nattest:br-nat
ip link set br-nat up

ip netns add server1
ip link add veth0_s1 type veth peer name veth0_s1_p
ip link set veth0_s1 netns server1
ip netns exec server1 ip link set veth0_s1 up
ip netns exec server1 ip addr add 192.168.111.1/24 dev veth0_s1
ip netns exec server1 ip addr add 3000::1/64 dev veth0_s1
ip netns exec server1 ip route add default via 192.168.111.254
ip netns exec server1 ip -6 route add default via 3000::a

ovs-vsctl add-port br-nat veth0_s1_p
ip link set veth0_s1_p up 

ovn-nbctl lr-nat-add lr1 dnat_and_snat 3000::55 2001::1

on client:
systemctl start openvswitch

ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.29.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.29.26
systemctl restart ovn-controller


ip netns add client0
ip link add veth0_c0 type veth peer name veth0_c0_p
ip link set veth0_c0 netns client0
ip netns exec client0 ip link set lo up                                                               
ip netns exec client0 ip link set veth0_c0 up
ip netns exec client0 ip link set veth0_c0 address 00:00:00:01:00:02
ip netns exec client0 ip addr add 172.16.1.1/24 dev veth0_c0
ip netns exec client0 ip addr add 2000::1/64 dev veth0_c0
ip netns exec client0 ip route add default via 172.16.1.254 dev veth0_c0
ip netns exec client0 ip -6 route add default via 2000::a dev veth0_c0
ovs-vsctl add-port br-int veth0_c0_p
ip link set veth0_c0_p up
ovs-vsctl set interface veth0_c0_p external_ids:iface-id=ls1p1

Then run ping6 on the server:

ip netns exec server0 ping6 3000::2 -c 1

The following messages then appear in /var/log/openvswitch/ovs-vswitchd.log:

[root@dell-per740-12 bz1773605]# grep ofproto_dpif_upcall /var/log/openvswitch/ovs-vswitchd.log 
2019-11-22T09:02:04.063Z|00001|ofproto_dpif_upcall(handler50)|INFO|received packet on unassociated datapath port 4294967295
2019-11-22T09:02:04.083Z|00001|ofproto_dpif_upcall(revalidator85)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:cfa90710-073d-4516-abfe-5117d64e82ae
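
When comparing runs, it can help to see which distinct flows triggered the warning rather than reading the raw log. The following is a minimal sketch, not part of the original reproducer: `list_failed_ufids` is a hypothetical helper name, and it assumes the warning text and `ufid:` format shown in the log lines above.

```shell
# Hypothetical helper (not from the bug report): list the unique ufids
# named in "Failed to acquire udpif_key" warnings.
# Reads log text from the given file(s), or from stdin if none are given.
list_failed_ufids() {
    grep -h 'Failed to acquire udpif_key' "$@" \
        | sed -n 's/.*ufid:\([0-9a-f-]*\).*/\1/p' \
        | sort -u
}
```

For example, `list_failed_ufids /var/log/openvswitch/ovs-vswitchd.log` prints one ufid per line; an empty result means the warning was not logged.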

Comment 8 Jianlin Shi 2019-11-22 09:36:15 UTC
With the setup from comment 7, on ovn2.12.0-7:
Pinging 3000::1 from server0 (ip netns exec server0 ping6 3000::1 -c 1) always fails, because server1 never receives the IPv6 NS.
[root@dell-per740-12 bz1773605]# ip netns exec server0 ping6 3000::1 -c 2
PING 3000::1(3000::1) 56 data bytes

--- 3000::1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1001ms
[root@dell-per740-12 ~]# ip netns exec server1 tcpdump -i any -nnle                                   
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode                            
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
04:31:04.816918   B 00:de:ad:ff:01:03 ethertype ARP (0x0806), length 44: Request who-has 192.168.111.254 tell 192.168.111.254, length 28
^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel

<=== no NS received on server1
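
The "NS received / not received" check can also be run against saved capture text instead of being read by eye. This is a rough sketch under stated assumptions: `count_ns_for` is a hypothetical helper, and it assumes the capture was saved as text in the `tcpdump -nnle` output format used in this bug, where a solicitation line contains "neighbor solicitation, who has <address>,".

```shell
# Hypothetical helper (not from the bug report): count neighbor
# solicitations for one target address in saved tcpdump text output.
# Usage: count_ns_for ADDRESS [FILE...]   (reads stdin if no FILE is given)
count_ns_for() {
    target=$1
    shift
    # -F matches the string literally; the trailing comma keeps 3000::1
    # from also matching 3000::15 and similar addresses.
    # grep -c exits non-zero on a count of 0, so mask that with "|| true".
    cat "$@" | grep -c -F "neighbor solicitation, who has ${target}," || true
}
```

A count of 0 for the address being pinged corresponds to the "no NS received" case shown above.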

Comment 9 Jianlin Shi 2019-11-22 09:48:27 UTC
On ovn2.12.0-15:

[root@dell-per740-12 bz1773605]# rpm -qa | grep ovn
ovn2.12-2.12.0-15.el7fdp.x86_64
ovn2.12-host-2.12.0-15.el7fdp.x86_64
ovn2.12-central-2.12.0-15.el7fdp.x86_64

[root@dell-per740-12 bz1773605]# ip netns exec server0 ping6 3000::1 -c 2
PING 3000::1(3000::1) 56 data bytes
64 bytes from 3000::1: icmp_seq=2 ttl=63 time=2.22 ms                                                 

--- 3000::1 ping statistics ---                                                                       
2 packets transmitted, 1 received, 50% packet loss, time 1000ms
rtt min/avg/max/mdev = 2.228/2.228/2.228/0.000 ms

[root@dell-per740-12 ~]# ip netns exec server1 tcpdump -i any -nnle                                   
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes                       
04:46:17.376666   M 00:de:ad:ff:01:03 ethertype IPv6 (0x86dd), length 88: 2001::1 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 3000::1, length 32

<=== NS received

04:46:17.376726 Out fe:89:60:69:07:bf ethertype IPv6 (0x86dd), length 88: 3000::1 > 2001::1: ICMP6, neighbor advertisement, tgt is 3000::1, length 32
04:46:18.375131  In 00:de:ad:ff:01:03 ethertype IPv6 (0x86dd), length 120: 3000::55 > 3000::1: ICMP6, echo request, seq 2, length 64
04:46:18.375226 Out fe:89:60:69:07:bf ethertype IPv6 (0x86dd), length 88: 3000::1 > ff02::1:ff00:55: ICMP6, neighbor solicitation, who has 3000::55, length 32
04:46:18.376061  In 00:de:ad:ff:01:03 ethertype IPv6 (0x86dd), length 88: 3000::55 > 3000::1: ICMP6, neighbor advertisement, tgt is 3000::55, length 32
04:46:18.376094 Out fe:89:60:69:07:bf ethertype IPv6 (0x86dd), length 120: 3000::1 > 3000::55: ICMP6, echo reply, seq 2, length 64

However, the error message still appears the first time ping6 is run:
[root@dell-per740-12 bz1773605]# grep ofproto_dpif_upcall /var/log/openvswitch/ovs-vswitchd.log       
2019-11-22T09:46:17.388Z|00001|ofproto_dpif_upcall(handler52)|INFO|received packet on unassociated datapath port 4294967295
2019-11-22T09:46:17.389Z|00001|ofproto_dpif_upcall(revalidator85)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:161ddea7-25f6-4ebb-8eec-0f63feaac816

Comment 10 Jianlin Shi 2019-11-22 10:09:00 UTC
The OVS version I used is 2.12.0-4:
[root@dell-per740-12 bz1773605]# rpm -qa  | grep openvs                                               
openvswitch2.12-2.12.0-4.el7fdp.x86_64                                                                
openvswitch-selinux-extra-policy-1.0-14.el7fdp.noarch

Comment 11 Numan Siddique 2019-11-22 10:33:43 UTC
This bug addresses the issue in OVN so that the IPv6 NS packet generated by ovn-controller is not dropped.
Since you have verified that the NS packet is received and not dropped, I would say you can verify this BZ.
I have cloned this bug against the openvswitch2.12 component; that clone should address the issue properly in ovs-vswitchd, so that these log messages are no longer seen.

Comment 13 Jianlin Shi 2019-11-28 01:51:46 UTC
Reproduced on ovn2.11.1-19:

[root@dell-per740-12 bz1773605]# ip netns exec server0 ping6 3000::2 -c 1                             
PING 3000::2(3000::2) 56 data bytes                                                                   
                                                                                                      
--- 3000::2 ping statistics ---                                                                       
1 packets transmitted, 0 received, 100% packet loss, time 0ms                                         
                                                                                                      
[root@dell-per740-12 bz1773605]# grep ofproto_dpif_upcall /var/log/openvswitch/ovs-vswitchd.log       
2019-11-28T01:38:32.281Z|00001|ofproto_dpif_upcall(handler50)|INFO|received packet on unassociated datapath port 4294967295
2019-11-28T01:38:32.295Z|00001|ofproto_dpif_upcall(revalidator85)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:8c542e09-56ac-404d-8756-21b3352e6a31
[root@dell-per740-12 bz1773605]# ip netns exec server0 ping6 3000::2 -c 1                             
PING 3000::2(3000::2) 56 data bytes                                                                   
                                                                                                      
--- 3000::2 ping statistics ---                                                                       
1 packets transmitted, 0 received, 100% packet loss, time 0ms  

<==== also failed to ping6 3000::1                                       
                                                                                                      
[root@dell-per740-12 bz1773605]# grep ofproto_dpif_upcall /var/log/openvswitch/ovs-vswitchd.log       
2019-11-28T01:38:32.281Z|00001|ofproto_dpif_upcall(handler50)|INFO|received packet on unassociated datapath port 4294967295
2019-11-28T01:38:32.295Z|00001|ofproto_dpif_upcall(revalidator85)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:8c542e09-56ac-404d-8756-21b3352e6a31
2019-11-28T01:38:50.696Z|00002|ofproto_dpif_upcall(handler50)|INFO|received packet on unassociated datapath port 4294967295
2019-11-28T01:38:50.852Z|00002|ofproto_dpif_upcall(revalidator85)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:36540ac4-8a29-44fb-835d-dc89aa5ae5be

<===== error message

[root@dell-per740-12 bz1773605]# rpm -qa | grep -E "openvswitch|ovn"
ovn2.11-host-2.11.1-19.el7fdp.x86_64                                                                  
openvswitch-selinux-extra-policy-1.0-14.el7fdp.noarch                                                 
ovn2.11-central-2.11.1-19.el7fdp.x86_64
openvswitch2.11-2.11.0-26.el7fdp.x86_64
ovn2.11-2.11.1-19.el7fdp.x86_64

[root@dell-per740-12 ~]# ip netns exec server1 tcpdump -i any -nnle
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
^C                                                                                                    
0 packets captured                                                                                    
0 packets received by filter                                                                          
0 packets dropped by kernel

<==== no NS received


Verified on ovn2.11.1-20:

[root@dell-per740-12 bz1773605]# ip netns exec server0 ping6 3000::1 -c 1
PING 3000::1(3000::1) 56 data bytes
64 bytes from 3000::1: icmp_seq=1 ttl=63 time=3.86 ms                                                 

--- 3000::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.864/3.864/3.864/0.000 ms              

<==== passed
                                       
[root@dell-per740-12 bz1773605]# rpm -qa | grep -E "openvswitch|ovn"
ovn2.11-2.11.1-20.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-14.el7fdp.noarch
ovn2.11-central-2.11.1-20.el7fdp.x86_64                                                               
openvswitch2.11-2.11.0-26.el7fdp.x86_64
ovn2.11-host-2.11.1-20.el7fdp.x86_64
[root@dell-per740-12 bz1773605]# ip netns exec server0 ping6 3000::2 -c 1
PING 3000::2(3000::2) 56 data bytes

--- 3000::2 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

[root@dell-per740-12 ~]# ip netns exec server1 tcpdump -i any -nnle
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes                                                                   
20:45:07.387857  In 00:de:ad:ff:01:03 ethertype IPv6 (0x86dd), length 120: 3000::55 > 3000::1: ICMP6, echo request, seq 1, length 64                                                                       
20:45:07.387895 Out ba:81:86:09:e0:cd ethertype IPv6 (0x86dd), length 120: 3000::1 > 3000::55: ICMP6, echo reply, seq 1, length 64                                                                         
20:46:57.014295   M 00:de:ad:ff:01:03 ethertype IPv6 (0x86dd), length 88: 2001::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 3000::2, length 32                                               

<==== NS received

[root@dell-per740-12 bz1773605]# grep ofproto_dpif_upcall /var/log/openvswitch/ovs-vswitchd.log
2019-11-28T01:38:32.281Z|00001|ofproto_dpif_upcall(handler50)|INFO|received packet on unassociated datapath port 4294967295
2019-11-28T01:38:32.295Z|00001|ofproto_dpif_upcall(revalidator85)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:8c542e09-56ac-404d-8756-21b3352e6a31

<=== the error message still appears

As the NS is received, setting this bug to VERIFIED based on comment 11.

Comment 17 Red Hat Bugzilla 2025-02-10 04:00:01 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.