Bug 2170920 - NXT_RESUME packet is dropped with error: received packet on unassociated datapath port 4294967295
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.17
Version: FDP 19.03
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Paolo Valerio
QA Contact: Hekai Wang
URL:
Whiteboard:
Depends On:
Blocks: 2172622
 
Reported: 2023-02-17 16:38 UTC by Tim Rozet
Modified: 2023-11-14 18:32 UTC
CC: 13 users

Fixed In Version: openvswitch2.17-2.17.0-123.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2169839
Environment:
Last Closed: 2023-11-14 18:32:11 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2696 0 None None None 2023-02-17 16:38:58 UTC

Description Tim Rozet 2023-02-17 16:38:17 UTC
+++ This bug was initially created as a clone of Bug #2169839 +++

OVS version:
[root@ovn-worker ~]# rpm -qa | grep openvswitch
openvswitch-2.17.0-3.fc36.x86_64


Description of problem:
Recently OVN added the ability to handle sending ICMP needs frag. In my case I am triggering this behavior with a UDP packet, and ovn-controller gets the packet_in and responds with an ICMP needs frag; however, somehow the ICMP response ends up being packet_in'ed again to ovn-controller:

2023-02-14T17:29:49.506Z|00004|pinctrl(ovn_pinctrl0)|DBG|pinctrl received  packet-in | opcode=ICMP4_ERROR| OF_Table_ID=27| OF_Cookie_ID=0xde8a2ac| in-port=3| src-mac=0a:58:64:40:00:02, dst-mac=0a:58:64:40:00:01| src-ip=169.254.169.2, dst-ip=10.244.0.6
 
2023-02-14T17:29:49.506Z|00005|pinctrl(ovn_pinctrl0)|DBG|pinctrl received  packet-in | opcode=PUT_ICMP4_FRAG_MTU| OF_Table_ID=0| OF_Cookie_ID=0xffffffffffffffff| in-port=65533| src-mac=0a:58:64:40:00:02, dst-mac=02:42:ac:12:00:02| src-ip=172.18.0.2, dst-ip=169.254.169.2

Version-Release number of selected component (if applicable):
[root@ovn-worker ~]# rpm -qa | grep ovn
ovn-22.12.90-1.fc37.x86_64
ovn-central-22.12.90-1.fc37.x86_64
ovn-host-22.12.90-1.fc37.x86_64
ovn-vtep-22.12.90-1.fc37.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Launch kind in ovn-kubernetes (./kind.sh -wk 1)
2. Create a pod that will serve as an endpoint to a service:
cat ~/basic.yaml 
---
apiVersion: v1
kind: Pod
metadata:
  name: trozet3
  labels:
    pod-name: client
      #role: webserver
      #app: spk-coredns  
spec:
  containers:
  - name: pod1
    image: python
    #command:
    #  - /sbin/init
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 3000000; done;" ]
    imagePullPolicy: IfNotPresent
    ports:
      - name: dns-tcp
        containerPort: 53
        protocol: TCP  
      - name: dns-udp
        containerPort: 9999
        protocol: UDP
3. Create a service:
[trozet@fedora ~]$ cat ~/service.yaml 
apiVersion: v1
kind: Service
metadata:
  name: my-service1
  #namespace: exgw2
spec:
  selector:
      pod-name: client        
  ports:
    - protocol: UDP
      port: 1337
      targetPort: 1337
      name: blah
    - protocol: TCP
      port: 80
      targetPort: 80
      name: blah2  
  externalIPs:
    - 172.18.0.99
  type: NodePort
  externalTrafficPolicy: Local

4. The pod should be placed on the ovn-worker node. Exec into the pod and run a UDP server (change localIP to match your pod IP, or use 0.0.0.0 to bind to all addresses):

import socket
localIP     = "10.244.0.6"
localPort   = 1337
bufferSize  = 1024

msgFromServer       = "Hello UDP Client"

bytesToSend         = str.encode(msgFromServer)

# Create a datagram socket

UDPServerSocket = socket.socket(family=socket.AF_INET, type=socket.SOCK_DGRAM)

# Bind to address and ip

UDPServerSocket.bind((localIP, localPort))
print("UDP server up and listening")

# Listen for incoming datagrams

while(True):
    bytesAddressPair = UDPServerSocket.recvfrom(bufferSize)
    message = bytesAddressPair[0]
    address = bytesAddressPair[1]
    clientMsg = "Message from Client:{}".format(message)
    clientIP  = "Client IP Address:{}".format(address)
    print(clientMsg)
    print(clientIP)
    # Sending a reply to client
    UDPServerSocket.sendto(bytesToSend, address)

5. Log in to the ovn-worker node and remove the MTU 1400 setting from these 2 routes:

169.254.169.1 dev breth0 src 172.18.0.2
10.96.0.0/16 via 169.254.169.4 dev breth0

6. Log in to the ovn-worker node and run a Python client (modify the service IP):
import socket
import sys

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_data = "echo n" + "o" * 1400
send_work = "echo blah"
service_ip = "10.96.165.251"
service_port = 1337
s.sendto(send_work.encode('utf-8'), (service_ip, service_port))
###^this will work and send a small packet
data, address = s.recvfrom(4096)
s.sendto(send_data.encode('utf-8'), (service_ip, service_port))
###^this will send the large packet which will trigger ICMP needs frag
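As a sanity check on the sizes involved, the on-wire lengths of the two probes can be worked out directly (a quick sketch, assuming a plain IPv4 header with no options):

```python
# Sizes of the two probe payloads from the client script above.
small = "echo blah"
large = "echo n" + "o" * 1400

UDP_HDR = 8   # fixed UDP header size
IP_HDR = 20   # IPv4 header without options

for payload in (small, large):
    udp_len = len(payload) + UDP_HDR
    ip_len = udp_len + IP_HDR
    print(len(payload), udp_len, ip_len)
# small: 9-byte payload -> 37-byte IP packet
# large: 1406-byte payload -> 1434-byte IP packet, which exceeds the
#        1414-byte threshold OVN tests with check_pkt_larger(1414)
```

These numbers match the tcpdump output below ("length 37" for the small probe, "length 1434 ... UDP, length 1406" for the large one).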

Actual results:
In tcpdump I only see the single large packet sent and no ICMP needs frag:

[root@ovn-worker ~]# tcpdump -i any port 1337 or icmp  -nneev
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
17:27:35.929295 breth0 Out ifindex 6 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 57: (tos 0x0, ttl 64, id 20241, offset 0, flags [DF], proto UDP (17), length 37)
    169.254.169.2.34358 > 10.96.165.251.1337: UDP, length 9
17:27:35.929980 171f3e0078841b7 Out ifindex 10 0a:58:0a:f4:00:01 ethertype IPv4 (0x0800), length 57: (tos 0x0, ttl 62, id 20241, offset 0, flags [DF], proto UDP (17), length 37)
    100.64.0.2.34358 > 10.244.0.6.1337: UDP, length 9
17:27:35.930179 171f3e0078841b7 P   ifindex 10 0a:58:0a:f4:00:06 ethertype IPv4 (0x0800), length 64: (tos 0x0, ttl 64, id 6994, offset 0, flags [DF], proto UDP (17), length 44)
    10.244.0.6.1337 > 100.64.0.2.34358: UDP, length 16
17:27:35.930524 breth0 In  ifindex 6 0a:58:a9:fe:a9:04 ethertype IPv4 (0x0800), length 64: (tos 0x0, ttl 62, id 6994, offset 0, flags [DF], proto UDP (17), length 44)
    10.96.165.251.1337 > 169.254.169.2.34358: UDP, length 16

^ Small packet works fine


17:28:21.861554 breth0 Out ifindex 6 02:42:ac:12:00:02 ethertype IPv4 (0x0800), length 1454: (tos 0x0, ttl 64, id 21977, offset 0, flags [DF], proto UDP (17), length 1434)
    169.254.169.2.34358 > 10.96.165.251.1337: UDP, length 1406

^nothing after sending large packet


Expected results:
Should get an ICMP needs frag back to my host.
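For reference, the missing reply would be an ICMPv4 Destination Unreachable (type 3) / Fragmentation Needed (code 4) message whose last header field carries the next-hop MTU (RFC 1191). A minimal decoding sketch (the function name is illustrative, not from any of the tools above):

```python
import struct

def parse_icmp_frag_needed(icmp_header: bytes):
    """Parse the fixed 8-byte ICMP header; for type 3/code 4 the
    last two bytes carry the next-hop MTU (RFC 1191)."""
    icmp_type, icmp_code, _csum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_header[:8])
    if icmp_type == 3 and icmp_code == 4:
        return mtu
    return None

# A header as OVN's icmp4_error action would emit it:
# type 3, code 4, frag_mtu 1400 (checksum omitted for brevity)
hdr = struct.pack("!BBHHH", 3, 4, 0, 0, 1400)
print(parse_icmp_frag_needed(hdr))  # -> 1400
```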

--- Additional comment from Tim Rozet on 2023-02-14 20:54:21 UTC ---

After some live debugging with Dumitru, we noticed the MACs do not look correct in the output of:

2023-02-14T17:29:49.506Z|00004|pinctrl(ovn_pinctrl0)|DBG|pinctrl received  packet-in | opcode=ICMP4_ERROR| OF_Table_ID=27| OF_Cookie_ID=0xde8a2ac| in-port=3| src-mac=0a:58:64:40:00:02, dst-mac=0a:58:64:40:00:01| src-ip=169.254.169.2, dst-ip=10.244.0.6

Dumitru found that this is because this log actually prints the packet after it is modified to be sent back out, rather than the original packet_in. This should also be fixed. Additionally the MAC 0a:58:64:40:00:02 belongs to ovn_cluster_router (the next hop after routing in the GR). Looking at an ovn-trace:

12. lr_in_ip_routing_pre (northd.c:11927): 1, priority 0, uuid d6dd449c
    reg7 = 0;
    next;
13. lr_in_ip_routing (northd.c:10317): reg7 == 0 && ip4.dst == 10.244.0.0/16, priority 49, uuid 049e28ea
    ip.ttl--;
    reg8[0..15] = 0;
    reg0 = 100.64.0.1;
    reg1 = 100.64.0.2;
    eth.src = 0a:58:64:40:00:02;
    outport = "rtoj-GR_ovn-worker";
    flags.loopback = 1;
    next;
14. lr_in_ip_routing_ecmp (northd.c:12005): reg8[0..15] == 0, priority 150, uuid b1ae4386
    next;
15. lr_in_policy (northd.c:12172): 1, priority 0, uuid 41142d7b
    reg8[0..15] = 0;
    next;
16. lr_in_policy_ecmp (northd.c:12174): reg8[0..15] == 0, priority 150, uuid 7e839f56
    next;
17. lr_in_arp_resolve (northd.c:12209): ip4, priority 1, uuid ad2b17bc
    get_arp(outport, reg0);
    /* MAC binding to 0a:58:64:40:00:01. */
    next;
18. lr_in_chk_pkt_len (northd.c:11532): outport == "rtoj-GR_ovn-worker", priority 50, uuid bb1f45da
    reg9[1] = check_pkt_larger(1414);
    next;
21. lr_in_arp_request (northd.c:12908): 1, priority 0, uuid bf4acee0
    output;

Routing and then setting the next-hop MAC occur before checking whether the packet would be too large to forward and punting to the controller. This seems like another bug: the next-hop MAC should not be set, because this packet will not be forwarded to the destination, and ovn-controller is using this packet to determine the reply MAC. Finally, Dumitru pointed out that the OVN flow resets the dst MAC to be itself, which also does not make much sense:


actions             : "icmp4_error {reg9[0] = 1; reg9[1] = 0; eth.dst = 02:42:ac:12:00:02; ip4.dst = ip4.src; ip4.src = 172.18.0.2; ip.ttl = 255; icmp4.type = 3; /* Destination Unreachable. */ icmp4.code = 4; /* Frag Needed and DF was Set. */ icmp4.frag_mtu = 1400; next(pipeline=ingress, table=0); };"


        ds_put_format(actions,
            "icmp4_error {"
            REGBIT_EGRESS_LOOPBACK" = 1; "
            REGBIT_PKT_LARGER" = 0; "
            "eth.dst = %s; "
            "ip4.dst = ip4.src; "
            "ip4.src = %s; "
            "ip.ttl = 255; "
            "icmp4.type = 3; /* Destination Unreachable. */ "
            "icmp4.code = 4; /* Frag Needed and DF was Set. */ "
            "icmp4.frag_mtu = %d; "
            "next(pipeline=ingress, table=%d); };",
            op->lrp_networks.ea_s,
            op->lrp_networks.ipv4_addrs[0].addr_s,
            mtu, ovn_stage_get_table(S_ROUTER_IN_ADMISSION));
        ovn_lflow_add_with_hint__(lflows, op->od, stage, 150,
                                  ds_cstr(match), ds_cstr(actions),
                                  NULL,
                                  copp_meter_get(
                                        COPP_ICMP4_ERR,
                                        op->od->nbr->copp,
                                        meter_groups),
                                  &op->nbrp->header_);

--- Additional comment from Tim Rozet on 2023-02-14 21:04:00 UTC ---



--- Additional comment from Tim Rozet on 2023-02-15 20:00:37 UTC ---

Simulating a trace with the same packet out:

[root@ovn-worker openvswitch]# ovs-appctl ofproto/trace-packet-out br-int in_port=controller,icmp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.0.6,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=3,icmp_code=1 'set_field:0x64400001->reg0,set_field:0x64400002->reg1,set_field:0x5390006->reg9,set_field:0x49->reg10,set_field:0x8->reg11,set_field:0x2->reg14,set_field:0x1->reg15,set_field:0x7->metadata,set_field:0x1/0x1->xreg4,set_field:0/0x2->xreg4,set_field:02:42:ac:12:00:02->eth_dst,move:NXM_OF_IP_SRC[]->NXM_OF_IP_DST[],set_field:172.18.0.2->ip_src,set_field:255->nw_ttl,set_field:3->icmp_type,set_field:4->icmp_code,controller(userdata=00.00.00.0d.00.00.00.00.05.78,pause,meter_id=2),resubmit(,8)'
Flow: icmp,in_port=CONTROLLER,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.0.6,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=3,icmp_code=1

bridge("br-int")
----------------
    set_field:0x64400001->reg0
    set_field:0x64400002->reg1
    set_field:0x5390006->reg9
    set_field:0x49->reg10
    set_field:0x8->reg11
    set_field:0x2->reg14
    set_field:0x1->reg15
    set_field:0x7->metadata
    set_field:0x1/0x1->xreg4
    set_field:0/0x2->xreg4
    set_field:02:42:ac:12:00:02->eth_dst
    move:NXM_OF_IP_SRC[]->NXM_OF_IP_DST[]
     -> NXM_OF_IP_DST[] is now 169.254.169.2
    set_field:172.18.0.2->ip_src
    set_field:255->nw_ttl
    set_field:3->icmp_type
    set_field:4->icmp_code
    controller(userdata=00.00.00.0d.00.00.00.00.05.78,pause,meter_id=2)

Final flow: icmp,reg0=0x64400001,reg1=0x64400002,reg9=0x5390005,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=CONTROLLER,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=02:42:ac:12:00:02,nw_src=172.18.0.2,nw_dst=169.254.169.2,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=3,icmp_code=4
Megaflow: recirc_id=0,eth,icmp,in_port=CONTROLLER,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.0.6,nw_ttl=255,nw_frag=no,icmp_type=0x3/0xff,icmp_code=0x1/0xff
Datapath actions: set(eth(dst=02:42:ac:12:00:02)),set(ipv4(src=172.18.0.2,dst=169.254.169.2)),set(icmp(type=3,code=4))
This flow is handled by the userspace slow path because it:
  - Uses action(s) not supported by datapath.

--- Additional comment from Tim Rozet on 2023-02-17 14:38:21 UTC ---

vconn logs:

2023-02-15T19:43:58.774Z|7013790|vconn|DBG|unix#556308: received: OFPT_PACKET_OUT (OF1.5) (xid=0x1a27): in_port=CONTROLLER actions=set_field:0x64400001->reg0,set_field:0x64400002->reg1,set_field:0x5390006->reg9,set_field:0x49->reg10,set_field:0x8->reg11,set_field:0x2->reg14,set_field:0x1->reg15,set_field:0x7->metadata,set_field:0x1/0x1->xreg4,set_field:0/0x2->xreg4,set_field:02:42:ac:12:00:02->eth_dst,move:NXM_OF_IP_SRC[]->NXM_OF_IP_DST[],set_field:172.18.0.2->ip_src,set_field:255->nw_ttl,set_field:3->icmp_type,set_field:4->icmp_code,controller(userdata=00.00.00.0d.00.00.00.00.05.78,pause,meter_id=2),resubmit(,8) data_len=576
icmp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.0.6,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=3,icmp_code=1 icmp_csum:4273

2023-02-15T19:43:58.775Z|7013791|vconn|DBG|unix#556308: sent (Success): NXT_PACKET_IN2 (OF1.5) (xid=0x0): total_len=576 reg0=0x64400001,reg1=0x64400002,reg9=0x5390005,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=CONTROLLER (via action) data_len=576 (unbuffered)
 userdata=00.00.00.0d.00.00.00.00.05.78
 continuation.bridge=2c7f8bd4-501c-42ee-8607-7be74fc19d13
 continuation.actions=unroll_xlate(table=0, cookie=0),resubmit(,8)
 continuation.odp_port=4294967295
icmp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=02:42:ac:12:00:02,nw_src=172.18.0.2,nw_dst=169.254.169.2,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=3,icmp_code=4 icmp_csum:4270

2023-02-15T19:43:58.775Z|7013792|vconn|DBG|unix#556308: received: NXT_RESUME (OF1.5) (xid=0x0): total_len=576 reg0=0x64400001,reg1=0x64400002,reg9=0x5390005,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=CONTROLLER (via action) data_len=576 (unbuffered)
 continuation.bridge=2c7f8bd4-501c-42ee-8607-7be74fc19d13
 continuation.actions=unroll_xlate(table=0, cookie=0),resubmit(,8)
 continuation.odp_port=4294967295
icmp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=02:42:ac:12:00:02,nw_src=172.18.0.2,nw_dst=169.254.169.2,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=3,icmp_code=4 icmp_csum:3cf8


2023-02-15T19:43:58.775Z|00001|ofproto_dpif_upcall(handler2)|INFO|received packet on unassociated datapath port 4294967295 (no OpenFlow port for datapath port 4294967295)
2023-02-15T19:43:59.102Z|00011|ofproto_dpif_upcall(revalidator10)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:ceca1501-c11d-4ccb-8fd2-446514f32fa8
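The port number in the error is not arbitrary: 4294967295 is the all-ones 32-bit value, i.e. (uint32_t)-1, which OVS uses as its "no datapath port" sentinel (ODPP_NONE in the sources) — the same value shown in continuation.odp_port above. A quick check:

```python
# 4294967295 == 0xffffffff == (uint32_t)-1: the "no datapath port"
# sentinel (ODPP_NONE in the OVS sources).
ODPP_NONE = 2**32 - 1
print(ODPP_NONE)       # 4294967295
print(hex(ODPP_NONE))  # 0xffffffff
```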

--- Additional comment from Tim Rozet on 2023-02-17 14:41:19 UTC ---

missed the first vconn log:

2023-02-15T19:43:58.774Z|7013789|vconn|DBG|unix#556308: sent (Success): NXT_PACKET_IN2 (OF1.5) (xid=0x0): table_id=27 cookie=0xde8a2ac total_len=1448 ct_state=new|trk|snat|dnat,ct_zone=8,ct_mark=0xa,ct_nw_src=169.254.169.2,ct_nw_dst=10.96.165.251,ct_nw_proto=17,ct_tp_src=52159,ct_tp_dst=1337,ip,reg0=0x64400001,reg1=0x64400002,reg9=0x5390006,reg10=0x49,reg11=0x8,reg14=0x2,reg15=0x1,metadata=0x7,in_port=3 (via action) data_len=1448 (unbuffered)
 userdata=00.00.00.0e.00.00.00.00.00.19.00.18.80.01.09.10.00.00.00.00.00.00.00.01.00.00.00.00.00.00.00.01.00.19.00.18.80.01.09.10.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.02.00.19.00.10.80.00.06.06.02.42.ac.12.00.02.00.00.00.1c.00.18.00.20.00.00.00.00.00.00.80.00.16.04.80.00.18.04.00.00.00.00.00.19.00.10.80.00.16.04.ac.12.00.02.00.00.00.00.00.19.00.10.00.01.3a.01.ff.00.00.00.00.00.00.00.00.19.00.10.80.00.26.01.03.00.00.00.00.00.00.00.00.19.00.10.80.00.28.01.04.00.00.00.00.00.00.00.ff.ff.00.30.00.00.23.20.00.25.00.00.00.00.00.00.00.03.00.0e.00.00.00.0d.00.00.00.00.05.78.00.00.00.04.00.04.00.00.00.00.00.05.00.08.00.00.00.02.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.08.00.00.00
udp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:02,dl_dst=0a:58:64:40:00:01,nw_src=169.254.169.2,nw_dst=10.244.0.6,nw_tos=0,nw_ecn=0,nw_ttl=63,nw_frag=no,tp_src=52159,tp_dst=1337 udp_csum:22f9

--- Additional comment from Tim Rozet on 2023-02-17 16:29:13 UTC ---

OK, so after some further investigation with Lorenzo this is the outcome:

1. This problem looks to be an OVS bug. We have not exactly identified what triggers it, but we think it may have something to do with traffic originating from the LOCAL port. I'm going to clone this bz to the OVS team. OVS error:
2023-02-15T19:43:58.775Z|00001|ofproto_dpif_upcall(handler2)|INFO|received packet on unassociated datapath port 4294967295 (no OpenFlow port for datapath port 4294967295)
2023-02-15T19:43:59.102Z|00011|ofproto_dpif_upcall(revalidator10)|WARN|Failed to acquire udpif_key corresponding to unexpected flow (Invalid argument): ufid:ceca1501-c11d-4ccb-8fd2-446514f32fa8

2. This BZ will remain open to track 2 issues:
  a. Lorenzo and Dumitru uncovered that when fragmented packets are sent to an OVN load balancer, only the first fragment (the one carrying the UDP destination port) makes it to the destination. The other fragments do not.
  b. OVN team to investigate if the method of using 2 PACKET_INs to handle an ICMP needs frag could/should be reduced to 1.

Comment 1 Tim Rozet 2023-04-19 13:35:31 UTC
Paolo, is there any update on this bug or when you will start working on it?

Comment 2 Tim Rozet 2023-04-27 19:26:15 UTC
Paolo provided a patch which fixes the issue:

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index c01177718..23f510ffb 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -1532,7 +1532,8 @@ xlate_lookup_ofproto_(const struct dpif_backer *backer,
         }
 
         ofp_port_t in_port = recirc_id_node->state.metadata.in_port;
-        if (in_port != OFPP_NONE && in_port != OFPP_CONTROLLER) {
+        if (in_port != OFPP_NONE && in_port != OFPP_CONTROLLER &&
+            !uuid_is_zero(&recirc_id_node->state.xport_uuid)) {
             struct uuid xport_uuid = recirc_id_node->state.xport_uuid;
             xport = xport_lookup_by_uuid(xcfg, &xport_uuid);
             if (xport && xport->xbridge && xport->xbridge->ofproto) {
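
One reading of the guard the patch adds: when the recirculation metadata carries an all-zero xport UUID (as appears to be the case for packets injected by the controller), the xport lookup must be skipped so the lookup falls through to the other paths. A Python paraphrase of the patched condition (the OFPP_* values are the standard OpenFlow reserved-port numbers, included here only for illustration):

```python
import uuid

OFPP_NONE = 0xffff        # OpenFlow "no port" sentinel
OFPP_CONTROLLER = 0xfffd  # packets injected by the controller

ZERO_UUID = uuid.UUID(int=0)

def should_lookup_xport(in_port, xport_uuid):
    # Mirrors the patched if-condition in xlate_lookup_ofproto_():
    # only consult the xport table when the recirc metadata recorded
    # a real in_port AND a non-zero xport UUID.
    return (in_port != OFPP_NONE
            and in_port != OFPP_CONTROLLER
            and xport_uuid != ZERO_UUID)

print(should_lookup_xport(5, ZERO_UUID))     # False: the case the patch fixes
print(should_lookup_xport(5, uuid.uuid4()))  # True: normal recirculation
```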

Comment 4 Paolo Valerio 2023-04-28 10:42:38 UTC
After some tests on both ovn and ovs, the problem got reproduced in a simpler setup involving ovs only. It is observable when the packet originating from the controller toward br1 gets sent to a patch port and then ends up recirculating before proceeding in the pipeline. Continuation and local ports don't seem to be relevant to trigger the issue. As Tim wrote in the previous comment, a quick patch got tested.

Comment 5 Paolo Valerio 2023-04-28 13:16:13 UTC
(In reply to Paolo Valerio from comment #4)
> After some tests on both ovn and ovs, the problem got reproduced in a
> simpler setup involving
> ovs only.
> It is observable when the packet originating from the controller toward br1
> gets then sent 
> to a patch port and then ends up recirculating before proceeding in the
> pipeline.
> Continuation and local ports don't seem to be relevant to trigger the issue.
> As Tim wrote in the previous comment, a quick patch got tested.

For completeness: the patch, aimed at confirming and root-causing the issue, was tested on both the ovs setup (affected) and the ovn-fake-multinode setup (not affected) that Lorenzo and I worked on in order to reproduce the issue.

Comment 6 Tim Rozet 2023-05-16 13:05:23 UTC
Paolo, do you mind providing a link to the posted fix? Thanks.

Comment 7 Paolo Valerio 2023-05-16 16:08:04 UTC
(In reply to Tim Rozet from comment #6)
> Paolo, do you mind providing a link to the posted fix? Thanks.

Sure, here's the link to the proposed fix:

https://patchwork.ozlabs.org/project/openvswitch/patch/168416414918.7273.15419064875718310114.stgit@rawp/

Comment 8 William Zhao 2023-05-18 12:05:23 UTC
Hi Paolo, do you know if this patch will be backported to OCP 4.12?

Comment 9 Paolo Valerio 2023-05-22 16:36:21 UTC
(In reply to William Zhao from comment #8)
> Hi Paolo, do you know if you this patch will be backported to OCP 4.12?

Hi William, I can't say for sure here, as it depends on upstream.
If backporting there works out, we will provide the updated package downstream for OCP 4.12.

Comment 10 Tim Rozet 2023-09-05 15:03:48 UTC
Any update on this BZ or ETA on when it will land?

Comment 11 Paolo Valerio 2023-09-05 18:51:46 UTC
The next version is under test. I'll post a new revision soon.

Comment 13 Paolo Valerio 2023-11-14 18:32:11 UTC
Fix included in:

https://errata.devel.redhat.com/advisory/120352

