Bug 2099811
Summary: | UDP Packet loss in OpenShift using IPv6 [upcall] | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Flavio Leitner <fleitner> | |
Component: | Networking | Assignee: | Flavio Leitner <fleitner> | |
Networking sub component: | ovn-kubernetes | QA Contact: | Weibin Liang <weliang> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | bbennett, dan, dcbw, eglottma, jmaxwell, weliang, zzhao | |
Version: | 4.10 | |||
Target Milestone: | --- | |||
Target Release: | 4.11.0 | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2099812 (view as bug list) | Environment: | ||
Last Closed: | 2022-08-10 11:18:57 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 2099812 |
Description
Flavio Leitner
2022-06-21 18:22:48 UTC
This fix has been part of OCP 4.11 since mid-February 2022 via https://github.com/openshift/os/pull/715 Test 4.11.0-0.nightly-2022-06-25-132614 in both packet dualstack and cluster-bot dualstack. Test results show 0 packet lost in packet dualstack after kernal tunning, and packets lost from 1.7% to 0.068% in cluster-bot dualstack cluster after kernal tunning. #### Packets baremetal cluster without kernal tunning / $ iperf3 -V -f m -l 265 -b 200m -c fd01:0:0:5::f -u -i l -t 600 -p 6666 iperf 3.10.1 Linux test-pod2 4.18.0-372.9.1.el8.x86_64 #1 SMP Fri Apr 15 22:12:19 EDT 2022 x86_64 Control connection MSS 1328 Time: Tue, 28 Jun 2022 13:46:53 UTC Connecting to host fd01:0:0:5::f, port 6666 Cookie: lp5azkzqcwak7ujjl55i3mxwqh4ljlrpv44x Target Bitrate: 200000000 [ 5] local fd01:0:0:5::10 port 59553 connected to fd01:0:0:5::f port 6666 Starting Test: protocol: UDP, 1 streams, 265 byte blocks, omitting 0 seconds, 600 second test, tos 0 [ ID] Interval Transfer Bitrate Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 56603706 - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.000 ms 0/56603706 (0%) sender [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.001 ms 15840/56603706 (0.028%) receiver CPU Utilization: local/sender 16.9% (1.1%u/15.8%s), remote/receiver 12.4% (1.6%u/10.8%s) iperf Done. / $ #### Packets baremetal cluster with kernal tunning / $ iperf3 -V -f m -l 265 -b 200m -c fd01:0:0:5::f -u -i l -t 600 -p 6666 iperf 3.10.1 Linux test-pod2 4.18.0-372.9.1.el8.x86_64 #1 SMP Fri Apr 15 22:12:19 EDT 2022 x86_64 Control connection MSS 1328 Time: Tue, 28 Jun 2022 14:05:45 UTC Connecting to host fd01:0:0:5::f, port 6666 Cookie: tnmi6xfsq2qk2dmydnxhlg5q63idz4y44rqp Target Bitrate: 200000000 [ 5] local fd01:0:0:5::10 port 34628 connected to fd01:0:0:5::f port 6666 Starting Test: protocol: UDP, 1 streams, 265 byte blocks, omitting 0 seconds, 600 second test, tos 0 [ ID] Interval Transfer Bitrate Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 56603701 - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.000 ms 0/56603701 (0%) sender [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.001 ms 0/56603701 (0%) receiver CPU Utilization: local/sender 20.1% (2.8%u/17.3%s), remote/receiver 13.3% (2.4%u/10.9%s) iperf Done. / $ #### cluster-bot baremetal cluster without kernal tunning / $ iperf3 -V -f m -l 265 -b 200m -c fd01:0:0:5::10 -u -i l -t 600 -p 59553 iperf 3.10.1 Linux test-pod2 4.18.0-372.9.1.el8.x86_64 #1 SMP Fri Apr 15 22:12:19 EDT 2022 x86_64 Control connection MSS 1328 Time: Tue, 28 Jun 2022 13:50:59 UTC Connecting to host fd01:0:0:5::10, port 59553 Cookie: 5oipxhgympitohjzg6fabedtbqydaoef6lqb Target Bitrate: 200000000 [ 5] local fd01:0:0:5::11 port 41243 connected to fd01:0:0:5::10 port 59553 Starting Test: protocol: UDP, 1 streams, 265 byte blocks, omitting 0 seconds, 600 second test, tos 0 [ ID] Interval Transfer Bitrate Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 56603702 - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.000 ms 0/56603702 (0%) sender [ 5] 0.00-600.00 sec 13.7 GBytes 197 Mbits/sec 0.005 ms 966313/56603702 (1.7%) receiver CPU Utilization: local/sender 38.0% (5.2%u/32.7%s), remote/receiver 22.6% (3.8%u/18.8%s) iperf Done. / $ #### cluster-bot baremetal cluster with kernal tunning / $ iperf3 -V -f m -l 265 -b 200m -c fd01:0:0:5::10 -u -i l -t 600 -p 59553 iperf 3.10.1 Linux test-pod2 4.18.0-372.9.1.el8.x86_64 #1 SMP Fri Apr 15 22:12:19 EDT 2022 x86_64 Control connection MSS 1328 Time: Tue, 28 Jun 2022 14:05:23 UTC Connecting to host fd01:0:0:5::10, port 59553 Cookie: uvrhzzmb24iz3rok6xzapgdny7qxaarmbktk Target Bitrate: 200000000 [ 5] local fd01:0:0:5::11 port 51660 connected to fd01:0:0:5::10 port 59553 Starting Test: protocol: UDP, 1 streams, 265 byte blocks, omitting 0 seconds, 600 second test, tos 0 [ ID] Interval Transfer Bitrate Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 56603753 - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.000 ms 0/56603753 (0%) sender [ 5] 0.00-600.00 sec 14.0 GBytes 200 Mbits/sec 0.006 ms 38621/56603753 (0.068%) receiver CPU Utilization: local/sender 38.2% (5.5%u/32.7%s), remote/receiver 23.0% (4.0%u/19.0%s) iperf Done. / $ fleitner Based on above test results, do you think QE can verify this bug? Thanks! #### Packets baremetal cluster with kernal tunning sh-4.4# chroot /host sh-4.4# ovs-appctl dpctl/show system@ovs-system: lookups: hit:11614148 missed:199183 lost:0 flows: 540 masks: hit:43790003 total:63 hit/pkt:3.71 cache: hit:10035165 hit-rate:84.95% caches: masks-cache: size:256 port 0: ovs-system (internal) port 1: bond0 port 2: br-ex (internal) port 3: br-int (internal) port 4: genev_sys_6081 (geneve: packet_type=ptap) port 5: ovn-k8s-mp0 (internal) port 6: dd02eff4b1e74b8 port 7: 903f8301d63476b port 8: 389a796c3e5b235 port 9: 1e3140f1676b7f9 port 10: 629e2bc98da5ca3 port 11: 0cd33fefb7b42fa port 12: c37e94e0db04962 port 13: 671fe511889ca98 port 14: 98321308ab1b19f port 15: 02f42b06a0c6297 port 16: 24735518d1bf969 #### cluster-bot baremetal cluster with kernal tunning sh-4.4# chroot /host sh-4.4# ovs-appctl dpctl/show system@ovs-system: lookups: hit:12457282 missed:162303 lost:0 flows: 513 masks: hit:34936722 total:42 hit/pkt:2.77 cache: hit:11185592 hit-rate:88.64% caches: masks-cache: size:256 port 0: ovs-system (internal) port 1: enp2s0 port 2: br-ex (internal) port 3: br-int (internal) port 4: genev_sys_6081 (geneve: packet_type=ptap) port 5: ovn-k8s-mp0 (internal) port 6: 087daf35aba40c9 port 7: cc3ba85ac25c0ab port 8: c66f6e768f1c664 port 9: be4ebadea1ff66f port 10: ad0b5d7affbda32 port 11: 397a290983455fe port 12: e9a7b69f81f18c6 port 13: bb1f5a2bf3d1d0a port 14: 8e2654d28a35212 port 15: 7792cd2dfde1bdd port 16: 17719f3b5b7dd71 port 17: dc3097a7e562f6d sh-4.4# Tested on cluster-bot dualstack cluster, see the iperf3 throughput improvement in 4.11 comparing to 4.9.12 #### cluster-bot dualstack cluster 4.11 Test three times: client: iperf3 -V -f m -l 265 -b 200m -c fd01:0:0:6::8 -u -i l -t 600 -p 59554 [ 5] 0.00-600.00 sec 13.9 GBytes 199 Mbits/sec 0.005 ms 343838/56603689 (0.61%) receiver [ 5] 0.00-600.00 sec 13.9 GBytes 199 Mbits/sec 0.004 ms 308788/56603746 (0.55%) receiver [ 5] 0.00-600.00 sec 13.9 GBytes 199 Mbits/sec 0.005 ms 310015/56603745 (0.55%) receiver #### cluster-bot dualstack cluster 4.9.12 Test three times: client: iperf3 -V -f m -l 265 -b 200m -c fd01:0:0:6::8 -u -i l -t 600 -p 59554 [ 5] 0.00-600.00 sec 13.7 GBytes 197 Mbits/sec 0.007 ms 948533/56591795 (1.7%) receiver [ 5] 0.00-600.00 sec 13.8 GBytes 197 Mbits/sec 0.007 ms 835952/56603726 (1.5%) receiver [ 5] 0.00-600.00 sec 13.8 GBytes 197 Mbits/sec 0.013 ms 822906/56603660 (1.5%) receiver Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069 The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days |