Bug 2160686
| Summary: | [ovs-dpdk-bond] L4 connection failed with ovs-dpdk | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | mhou <mhou> |
| Component: | openvswitch | Assignee: | Mike Pattrick <mpattric> |
| openvswitch sub component: | ovs-dpdk | QA Contact: | mhou <mhou> |
| Status: | CLOSED EOL | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | ctrautma, fleitner, jhsiao, ktraynor, mhou, mpattric |
| Version: | RHEL 9.0 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2024-10-08 17:49:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Hello,
My initial thought is that this could be an L4 checksum issue, which would explain why ICMP can pass but TCP can't. Is the setup still available? If so, could you grab a pcap?
Running the following while conducting the test should help clear that up:
> # On hp-dl388g10-03
> ovs-tcpdump --span -i bondbridge -w bondbridge.pcap &
> ovs-tcpdump --span -i guestbridge -w guestbridge.pcap &
A tcpdump from hp-dl388g10-02 would also be helpful.
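(As an aside, here is one way to inspect such captures, offered as an editor's sketch rather than part of the original request: tcpdump verifies L4 checksums in verbose mode, so replaying the pcaps should show whether the TCP segments carry bad checksums.)
> # "incorrect" in the cksum field flags a bad TCP checksum in the capture
> tcpdump -r bondbridge.pcap -vv tcp | grep -i cksum
Keep in mind that captures taken on the transmitting side can show "incorrect" checksums benignly when offload is in play; what arrives on the bond side is the more telling signal.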
If your setup is no longer available, I can try to reproduce this issue.
Hello Michael, I need to wait for the current test to finish. I can give you the results tomorrow.

Hello Michael, I found TCP retransmissions when capturing on bondbridge and guestbridge. I have uploaded all of the pcap files as attachments. Thanks!

This confirms that it is an L4 checksum issue. I'll investigate further.

Hello Michael, I found a document that describes this issue: https://access.redhat.com/solutions/3964031. Once I disable TX checksum offload on the container side, netperf works as well. I think the current issue is inherited from https://bugzilla.redhat.com/show_bug.cgi?id=1685616#c5

[root@hp-dl388g10-03 ~]# ovs-vsctl show
04d6f2db-7723-49ca-8236-bdd535022bb5
    Bridge bondbridge
        datapath_type: netdev
        Port active-backup
            Interface ens4f1
                type: dpdk
                options: {dpdk-devargs="0000:af:00.1", n_rxq="4"}
            Interface ens4f0
                type: dpdk
                options: {dpdk-devargs="0000:af:00.0", n_rxq="4"}
        Port bondbridge
            Interface bondbridge
                type: internal
        Port patchbond
            Interface patchbond
                type: patch
                options: {peer=patchguest}
    Bridge guestbridge
        datapath_type: netdev
        Port c51ae05c88fe4_l
            Interface c51ae05c88fe4_l
        Port "488a9bac51db4_l"
            Interface "488a9bac51db4_l"
        Port guestbridge
            Interface guestbridge
                type: internal
        Port patchguest
            Interface patchguest
                type: patch
                options: {peer=patchbond}
    ovs_version: "3.1.1"

[root@41f1c92819a5 /]# ethtool -K eth1 tx off
Actual changes:
tx-checksum-ip-generic: off
tx-tcp-segmentation: off [not requested]
tx-tcp-ecn-segmentation: off [not requested]
tx-tcp-mangleid-segmentation: off [not requested]
tx-tcp6-segmentation: off [not requested]
tx-checksum-sctp: off

[root@41f1c92819a5 /]# netperf -H 172.31.152.1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.31.152.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072  16384  16384    10.00    4120.48

Good find, that seems about right. I am currently working on a patch set to improve the handling of checksum-offloaded interfaces, which may improve this behavior.

Do you want to keep this ticket open? Or close it for now, since it is a duplicate of a previous ticket?

I would prefer to keep it open to track subsequent updates (if there are any).

Mike, what is the next step for this ticket?

As pointed out above, this is a known issue. The primary problem is that when userspace TSO is disabled, netdev-linux interfaces don't receive the vnet header, and therefore don't get important checksum offload metadata. The "TCP checksum issues when using kernel space OVS with netdev datapath in Red Hat OpenStack Platform" solution does provide a workaround, and enabling userspace TSO should also fix this issue (a configuration sketch follows at the end of this thread). Due to the bundled nature of OVS, it's unlikely that a client would run into this issue.

We could also modify netdev-linux to always turn on vnet headers and disable TSO when userspace TSO is disabled, but that would be a big change. Want me to add that to the roadmap?

This bug did not meet the criteria for automatic migration and is being closed. If the issue remains, please open a new ticket in https://issues.redhat.com/browse/FDP
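(Editor's note: a minimal sketch of the userspace TSO option referenced above, assuming an OVS build with DPDK support; the OVS documentation indicates the daemon must be restarted for the change to take effect.)
> # Enable userspace TSO so netdev interfaces exchange vnet headers
> # and checksum offload metadata with the datapath.
> ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true
> systemctl restart openvswitch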
Description of problem:
Connections to the peer side fail with both ncat and netperf.

Version-Release number of selected component (if applicable):
kernel: 5.14.0-231.el9.x86_64
openvswitch: openvswitch2.17-2.17.0-52.el9fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Build the OVS topology as below (a sanity-check sketch for steps 1 and 2 appears at the end of this report):

driverctl set-override 0000:af:00.0 vfio-pci
driverctl set-override 0000:af:00.1 vfio-pci
systemctl start openvswitch &>/dev/null
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem='8192,8192'
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x800000000
ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0xf000000000
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --may-exist add-br bondbridge -- set bridge bondbridge datapath_type=netdev
ovs-vsctl set int bondbridge mtu_request=9200
ovs-vsctl add-bond bondbridge balance-tcp ens4f1 ens4f0 lacp=active bond_mode=balance-tcp \
    -- set Interface ens4f1 type=dpdk options:dpdk-devargs=0000:af:00.1 options:n_rxq=4 mtu_request=9200 \
    -- set Interface ens4f0 type=dpdk options:dpdk-devargs=0000:af:00.0 options:n_rxq=4 mtu_request=9200
ovs-vsctl add-port bondbridge patchbond \
    -- set Interface patchbond type=patch \
    -- set Interface patchbond options:peer=patchguest mtu_request=9200
ovs-vsctl set int ens4f0 mtu_request=9200
ovs-vsctl set int ens4f1 mtu_request=9200
ovs-vsctl set int patchbond mtu_request=9200
ovs-ofctl mod-port bondbridge bondbridge up
ovs-ofctl mod-port bondbridge ens4f1 up
ovs-ofctl mod-port bondbridge ens4f0 up
ovs-vsctl --may-exist add-br guestbridge -- set bridge guestbridge datapath_type=netdev
ovs-vsctl --may-exist add-port guestbridge patchguest \
    -- set Interface patchguest type=patch \
    -- set Interface patchguest options:peer=patchbond mtu_request=9200
ovs-vsctl set int guestbridge mtu_request=9200
ovs-ofctl mod-port guestbridge guestbridge up
ovs-ofctl mod-port guestbridge patchguest up
2. Add two containers to guestbridge:

# podman ps -a
CONTAINER ID  IMAGE                            COMMAND         CREATED         STATUS             PORTS  NAMES
e42b1f94f696  localhost/rhel9.0_x86_64:latest  sleep infinity  10 minutes ago  Up 10 minutes ago         g1
8201bd35a4c9  localhost/rhel9.0_x86_64:latest  sleep infinity  10 minutes ago  Up 10 minutes ago         g2

ovs-podman add-port guestbridge eth1 g1 --ipaddress=172.31.152.42/24 --ip6address=2001:db8:152::42/64 --mtu=9200 --macaddress=00:de:ad:98:02:02
ovs-podman add-port guestbridge eth2 g2 --ipaddress=172.31.152.52/24 --ip6address=2001:db8:152::52/64 --mtu=9200 --macaddress=00:de:ad:98:02:12

# podman exec g1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if202: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 06:cb:30:99:df:bc brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.24/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::4cb:30ff:fe99:dfbc/64 scope link
       valid_lft forever preferred_lft forever
208: eth1@if209: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9200 qdisc noqueue state UP group default qlen 1000
    link/ether 00:de:ad:98:02:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.31.152.42/24 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 2001:db8:152::42/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::b8c9:8dff:fe71:7740/64 scope link
       valid_lft forever preferred_lft forever

# podman exec g2 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if203: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 1e:f9:c8:d2:91:e4 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.88.0.25/16 brd 10.88.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::1cf9:c8ff:fed2:91e4/64 scope link
       valid_lft forever preferred_lft forever
210: eth2@if211: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9200 qdisc noqueue state UP group default qlen 1000
    link/ether 00:de:ad:98:02:12 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.31.152.52/24 scope global eth2
       valid_lft forever preferred_lft forever
    inet6 2001:db8:152::52/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::c0a9:1aff:fe10:f0d7/64 scope link
       valid_lft forever preferred_lft forever

# ovs-vsctl show
0b41afb4-a3b3-4752-ba38-136f192af156
    Bridge bondbridge
        datapath_type: netdev
        Port patchbond
            Interface patchbond
                type: patch
                options: {peer=patchguest}
        Port balance-slb
            Interface ens4f1
                type: dpdk
                options: {dpdk-devargs="0000:af:00.1", n_rxq="4"}
            Interface ens4f0
                type: dpdk
                options: {dpdk-devargs="0000:af:00.0", n_rxq="4"}
        Port bondbridge
            Interface bondbridge
                type: internal
    Bridge guestbridge
        datapath_type: netdev
        Port patchguest
            Interface patchguest
                type: patch
                options: {peer=patchbond}
        Port guestbridge
            Interface guestbridge
                type: internal
        Port "1d25e098d1404_l"
            Interface "1d25e098d1404_l"
        Port e875a0b4e3f74_l
            Interface e875a0b4e3f74_l
    ovs_version: "2.17.4"
[root@hp-dl388g10-03 ~]# ovs-appctl bond/show
---- balance-slb ----
bond_mode: balance-slb
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
next rebalance: 3104 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 3c:fd:fe:bd:1c:a5(ens4f1)

member ens4f0: enabled
    may_enable: true

member ens4f1: enabled
    active member
    may_enable: true

3. Add a NORMAL flow to each bridge:

ovs-ofctl add-flow guestbridge actions=NORMAL
ovs-ofctl add-flow bondbridge actions=NORMAL

4. Ping from g1 to the peer side:

[root@e42b1f94f696 /]# ping 172.31.152.1 -c 3
PING 172.31.152.1 (172.31.152.1) 56(84) bytes of data.
64 bytes from 172.31.152.1: icmp_seq=1 ttl=64 time=0.074 ms
64 bytes from 172.31.152.1: icmp_seq=2 ttl=64 time=0.060 ms
64 bytes from 172.31.152.1: icmp_seq=3 ttl=64 time=0.074 ms

--- 172.31.152.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2078ms
rtt min/avg/max/mdev = 0.060/0.069/0.074/0.006 ms

5. Run netperf from g1 to the peer side:

[root@e42b1f94f696 /]# netperf -4 -t TCP_STREAM -H 172.31.152.1 -l 10
establish control: are you sure there is a netserver listening on 172.31.152.1 at port 12865?
establish_control could not establish the control connection from 0.0.0.0 port 0 address family AF_INET to 172.31.152.1 port 12865 address family AF_INET
[root@e42b1f94f696 /]# netperf -4 -t UDP_STREAM -H 172.31.152.1 -l 10
establish control: are you sure there is a netserver listening on 172.31.152.1 at port 12865?
establish_control could not establish the control connection from 0.0.0.0 port 0 address family AF_INET to 172.31.152.1 port 12865 address family AF_INET

[root@hp-dl388g10-03 ovs_bond_function]# ovs-appctl dpctl/dump-flows -m
flow-dump from the main thread:
ufid:469d131a-a66d-4738-b6cf-8756ab2096a0, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(1d25e098d1404_l),packet_type(ns=0,id=0),eth(src=00:de:ad:98:02:02,dst=40:a6:b7:3e:a5:60),eth_type(0x0800),ipv4(src=172.31.152.42/0.0.0.0,dst=172.31.152.1/0.0.0.0,proto=6/0,tos=0/0,ttl=64/0,frag=no),tcp(src=44629/0,dst=12865/0),tcp_flags(0/0), packets:4, bytes:296, used:2.685s, flags:S, dp:ovs, actions:ens4f1, dp-extra-info:miniflow_bits(5,1)
flow-dump from pmd on cpu core: 39
ufid:7a0760d7-64a1-466f-9cdf-1bb2e20ab66d, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens4f0),packet_type(ns=0,id=0),eth(src=b0:c5:3c:f6:36:d4,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:146, bytes:40442, used:4.314s, dp:ovs, actions:drop, dp-extra-info:miniflow_bits(5,0)
flow-dump from pmd on cpu core: 36
ufid:269a2a3d-0e3e-4931-98b5-ddddffbb7835, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens4f1),packet_type(ns=0,id=0),eth(src=b0:c5:3c:f6:36:d5,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:149, bytes:41273, used:1.758s, dp:ovs, actions:drop, dp-extra-info:miniflow_bits(5,0)
6. Check that netserver is running on the peer side:

[root@hp-dl388g10-02 ovs_bond_function]# netstat -anltup
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 192.168.122.1:53        0.0.0.0:*               LISTEN      1618/dnsmasq
tcp        0      0 127.0.0.1:8081          0.0.0.0:*               LISTEN      1942/restraintd
tcp        0      0 0.0.0.0:4999            0.0.0.0:*               LISTEN      3596027/nc
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1322/sshd: /usr/sbi
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1/systemd
tcp        0    248 10.73.89.32:22          10.72.12.191:43530      ESTABLISHED 3593945/sshd: root
tcp        0      0 10.73.89.32:942         10.73.130.89:2049       ESTABLISHED -
tcp        0      0 10.73.89.32:22          10.72.12.191:43526      ESTABLISHED 3593904/sshd: root
tcp6       0      0 ::1:8081                :::*                    LISTEN      1942/restraintd
tcp6       0      0 :::12865                :::*                    LISTEN      3596043/netserver
tcp6       0      0 :::4999                 :::*                    LISTEN      3596027/nc
tcp6       0      0 :::22                   :::*                    LISTEN      1322/sshd: /usr/sbi
tcp6       0      0 :::111                  :::*                    LISTEN      1/systemd
udp        0      0 192.168.122.1:53        0.0.0.0:*                           1618/dnsmasq
udp        0      0 0.0.0.0:67              0.0.0.0:*                           1618/dnsmasq
udp        0      0 10.73.89.32:68          10.73.2.108:67          ESTABLISHED 9673/NetworkManager
udp        0      0 0.0.0.0:111             0.0.0.0:*                           1/systemd
udp        0      0 127.0.0.1:323           0.0.0.0:*                           1273/chronyd
udp6       0      0 :::111                  :::*                                1/systemd
udp6       0      0 ::1:323                 :::*                                1273/chronyd

Actual results:
The netperf TCP and UDP tests fail to connect.

Expected results:
The netperf tests should run successfully.

Additional info:
The nc command also fails.
Peer side:
[root@hp-dl388g10-02 ovs_bond_function]# nc -l 4999
Container side:
[root@e42b1f94f696 /]# nc 172.31.152.1 4999
Ncat: TIMEOUT.
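(Editor's note: a few optional sanity checks for the reproduction above, sketched with the names used in this report; `dpdk_initialized` assumes a DPDK-enabled OVS build.)
> # Step 1: confirm DPDK actually initialized (expect "true")
> ovs-vsctl get Open_vSwitch . dpdk_initialized
> # Step 2: record the container's TX offload state before testing
> podman exec g1 ethtool -k eth1 | grep -E 'tx-checksum|tcp-segmentation'
> # Workaround from the discussion above: disable TX checksum offload
> podman exec g1 ethtool -K eth1 tx off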