Description of problem:
This issue only occurred on kernel-rt: when mac-table-size and the upcall flow limit are pre-configured, sending traffic triggers the flow dump action.

Version-Release number of selected component (if applicable):
kernel version: 4.18.0-497.rt7.286.el8.x86_64
ovs version: openvswitch2.17-2.17.0-105.el8fdp.x86_64

How reproducible: 100%

# ethtool -i ens1f0
driver: mlx5_core
version: 4.18.0-497.rt7.286.el8.x86_64
firmware-version: 16.34.1002 (MT_0000000080)
expansion-rom-version:
bus-info: 0000:17:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

Steps to Reproduce:
0. enable the realtime-virtual-host profile and keep the CPUs of core 0 (cpu0, cpu1) as housekeeping CPUs.
# cat /etc/tuned/realtime-virtual-host-variables.conf
isolated_cores=2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55
isolate_managed_irq=Y

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
node 0 size: 31214 MB
node 0 free: 1255 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55
node 1 size: 31222 MB
node 1 free: 2237 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

1. configure the offload PVP test scenario
setup_ovs_offload.sh ovsbr0 0000:17:00.0 --numvfs=1 --vf-macs='00:de:ad:01:01:01' --spoofchk=off --trust=on \
    --hw-offload=on --tc-policy=none --steering-mode='smfs'
setup_ovs_offload.sh ovsbr1 0000:17:00.1 --numvfs=1 --vf-macs='00:de:ad:02:02:02' --spoofchk=off --trust=on \
    --hw-offload=on --tc-policy=none --steering-mode='smfs'

# ovs-vsctl show
d9d543af-7fc1-44f0-9c20-5a81e695c354
    Bridge ovsbr1
        Port ens1f1
            Interface ens1f1
        Port eth1
            Interface eth1
        Port ovsbr1
            Interface ovsbr1
                type: internal
    Bridge ovsbr0
        Port ens1f0
            Interface ens1f0
        Port eth0
            Interface eth0
        Port ovsbr0
            Interface ovsbr0
                type: internal
    ovs_version: "2.17.7"

current handler and revalidator threads as below:
# pidstat -t -p $(pidof ovs-vswitchd)
Linux 4.18.0-497.rt7.286.el8.x86_64 (dell-per750-03.rhts.eng.pek2.redhat.com)   06/27/2023      _x86_64_        (56 CPU)

06:35:56 AM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
06:35:56 AM   994     13802         -    0.34    0.45    0.00    0.00    0.79     0  ovs-vswitchd
06:35:56 AM   994         -     13802    0.00    0.01    0.00    0.00    0.01     0  |__ovs-vswitchd
06:35:56 AM   994         -     13804    0.32    0.41    0.00    0.37    0.73     0  |__handler3
06:35:56 AM   994         -     13805    0.00    0.00    0.00    0.00    0.00     1  |__handler2
06:35:56 AM   994         -     13806    0.00    0.00    0.00    0.00    0.00     0  |__handler1
06:35:56 AM   994         -     13807    0.02    0.03    0.00    0.05    0.05     1  |__revalidator4
06:35:56 AM   994         -     13808    0.00    0.00    0.00    0.00    0.00     1  |__urcu5

ovsdb-tool show-log -m
record 0: "Open_vSwitch" schema, version="8.3.0", cksum="3781850481 26690"
record 1: 2023-06-26 15:36:14.712 "ovs-vsctl (invoked by /bin/sh): ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=8.3.0" table Open_vSwitch insert row a3d2ef9e:
record 2: 2023-06-26 15:36:14.896 "ovs-vsctl (invoked by /bin/sh): ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.17.7 "external-ids:system-id=\"fd78e6ac-7940-4e77-812e-c761c72ad319\"" "external-ids:rundir=\"/var/run/openvswitch\"" "system-type=\"rhel\"" "system-version=\"8.9\""" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff:
record 3: 2023-06-26 15:36:14.906 "ovs-vsctl (invoked by /bin/sh): ovs-vsctl --no-wait add Open_vSwitch .
external-ids hostname=dell-per750-03.rhts.eng.pek2.redhat.com" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 4: 2023-06-26 15:36:15.251 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 5: 2023-06-26 15:36:15.252 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 6: 2023-06-26 15:37:04.909 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl --may-exist add-br ovsbr0" table Interface insert row "ovsbr0" (da3692bf): table Port insert row "ovsbr0" (db051bf5): table Bridge insert row "ovsbr0" (197157d7): table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 7: 2023-06-26 15:37:04.950 table Interface row "ovsbr0" (da3692bf) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 8: 2023-06-26 15:37:04.951 table Bridge row "ovsbr0" (197157d7) diff: record 9: 2023-06-26 15:37:05.014 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl add-port ovsbr0 ens1f0" table Port insert row "ens1f0" (b6d21b34): table Interface insert row "ens1f0" (6ba2a06a): table Bridge row "ovsbr0" (197157d7) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 10: 2023-06-26 15:37:05.025 table Interface row "ens1f0" (6ba2a06a) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 11: 2023-06-26 15:37:05.281 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl add-port ovsbr0 eth0" table Port insert row "eth0" (70e89e5d): table Interface insert row "eth0" (b03fc2c1): table Bridge row "ovsbr0" (197157d7) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 12: 2023-06-26 15:37:05.287 table Interface row "eth0" (b03fc2c1) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 13: 2023-06-26 15:37:05.307 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set Open_vSwitch . other_config:hw-offload=false" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 14: 2023-06-26 15:37:05.308 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 15: 2023-06-26 15:37:05.315 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set Open_vSwitch . other_config:tc-policy=none" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 16: 2023-06-26 15:37:05.316 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 17: 2023-06-26 15:37:05.323 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set Open_vSwitch . 
other_config:max-idle=180000" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 18: 2023-06-26 15:37:05.324 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 19: 2023-06-26 15:37:50.968 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl --may-exist add-br ovsbr1" table Interface insert row "ovsbr1" (2ff16c11): table Port insert row "ovsbr1" (3f2c03d6): table Bridge insert row "ovsbr1" (58b87ac8): table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 20: 2023-06-26 15:37:50.979 table Interface row "ovsbr1" (2ff16c11) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 21: 2023-06-26 15:37:50.979 table Bridge row "ovsbr1" (58b87ac8) diff: record 22: 2023-06-26 15:37:51.043 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl add-port ovsbr1 ens1f1" table Interface insert row "ens1f1" (620277d0): table Port insert row "ens1f1" (74709db5): table Bridge row "ovsbr1" (58b87ac8) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 23: 2023-06-26 15:37:51.050 table Interface row "ens1f1" (620277d0) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 24: 2023-06-26 15:37:51.302 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl add-port ovsbr1 eth1" table Port insert row "eth1" (56dcd8b9): table Interface insert row "eth1" (8f08ceec): table Bridge row "ovsbr1" (58b87ac8) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 25: 2023-06-26 15:37:51.308 table Interface row "eth1" (8f08ceec) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 26: 2023-06-26 15:37:57.043 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set Open_vSwitch . other_config:max-idle=3600000" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 27: 2023-06-26 15:37:57.044 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 28: 2023-06-26 15:37:57.085 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set Open_vSwitch . other_config:max-revalidator=3600000" table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 29: 2023-06-26 15:37:57.086 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 30: 2023-06-26 15:37:57.127 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set bridge ovsbr0 other-config:mac-aging-time=3600" table Bridge row "ovsbr0" (197157d7) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 31: 2023-06-26 15:37:57.128 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 32: 2023-06-26 15:37:57.169 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set bridge ovsbr0 other-config:mac-table-size=48192" table Bridge row "ovsbr0" (197157d7) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 33: 2023-06-26 15:37:57.170 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 34: 2023-06-26 15:37:57.212 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set bridge ovsbr1 other-config:mac-aging-time=3600" table Bridge row "ovsbr1" (58b87ac8) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 35: 2023-06-26 15:37:57.212 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 36: 2023-06-26 15:37:57.253 "ovs-vsctl (invoked by /bin/bash): ovs-vsctl set bridge ovsbr1 other-config:mac-table-size=48192" table Bridge row "ovsbr1" (58b87ac8) diff: table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: record 37: 2023-06-26 15:37:57.254 table Open_vSwitch row a3d2ef9e (a3d2ef9e) diff: 2. adjust max-idle/ max-revalidator/ mac-aging-time/ mac-table-size ovs-vsctl set Open_vSwitch . other_config:max-idle=3600000 ovs-vsctl set Open_vSwitch . 
other_config:max-revalidator=3600000
ovs-vsctl set bridge ovsbr0 other-config:mac-aging-time=3600
ovs-vsctl set bridge ovsbr0 other-config:mac-table-size=48192
ovs-vsctl set bridge ovsbr1 other-config:mac-aging-time=3600
ovs-vsctl set bridge ovsbr1 other-config:mac-table-size=48192
ovs-appctl upcall/set-flow-limit 88192

# ovs-appctl upcall/show
system@ovs-system:
  flows           : (current 4) (avg 12808) (max 39241) (limit 88192)
  offloaded flows : 4
  dump duration   : 3ms
  ufid enabled    : true
  4: (keys 7)

3. send warmup traffic from the peer side (back-to-back connection)
./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=1 --max-loss-pct=0.002 --search-runtime=10 --validation-runtime=10 --rate-tolerance=10 --runtime-tolerance=10 --rate=25 --rate-unit=% --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 --rate=0.1 --rate-unit=mpps --one-shot=1 --src-macs=00:00:00:00:00:01,00:00:00:00:00:02 --dst-macs=00:00:00:00:00:02,00:00:00:00:00:01 --use-device-stats --vlan-ids=11,11

4. check that the ovs bridges have already learned the mac addresses
# ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(3),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:01),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:1994698, bytes:17944303208, used:493.080s, actions:2
recirc_id(0),in_port(2),eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:02),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:1999999, bytes:17991987314, used:493.080s, actions:3
recirc_id(0),in_port(6),eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:02),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:1999205, bytes:17984783856, used:493.080s, actions:5
recirc_id(0),in_port(5),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:01),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:1998705, bytes:17980287656, used:493.080s, actions:6

5. send test traffic with an initial rate of 6%
./binary-search.py --traffic-generator=trex-txrx --frame-size=9000 --num-flows=10000 --max-loss-pct=0.002 --search-runtime=10 --validation-runtime=10 --rate-tolerance=10 --runtime-tolerance=10 --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 --rate=6 --rate-unit=% --one-shot=0 --src-macs=10:00:00:00:00:00,20:00:00:00:00:00 --dst-macs=00:00:00:00:00:02,00:00:00:00:00:01 --use-src-ip-flows=0 --use-dst-ip-flows=0 --use-src-mac-flows=1 --use-dst-mac-flows=0 --warmup-trial --warmup-trial-runtime=30 --sniff-runtime=10 --warmup-trial --warmup-trial-runtime=10 --send-teaching-measurement --send-teaching-warmup --teaching-warmup-packet-type=generic --max-retries=10 --teaching-warmup-packet-rate=10000 --teaching-measurement-packet-type=generic --use-device-stats

6. monitor ovs while it automatically dumps flows
# ovs-appctl upcall/show
system@ovs-system:
  flows           : (current 4972) (avg 9647) (max 68854) (limit 4000)
  offloaded flows : 4972
  dump duration   : 33ms
  ufid enabled    : true
  4: (keys 4972)

Actual results:
1. the test failed, with the flow limit re-set to 4000

Expected results:
1.
test should be pass Additional info: full of ovs-vswitchd.log as below 2023-06-27T09:25:35.622Z|00029|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:52.638Z|00030|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:53.624Z|00031|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:54.147Z|00032|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:54.524Z|00033|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:55.034Z|00034|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:55.215Z|00035|poll_loop(handler3)|INFO|Dropped 60 log messages in last 1598 seconds (most recently, 1598 seconds ago) due to excessive rate 2023-06-27T09:45:55.216Z|00036|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.220Z|00037|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.224Z|00038|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.230Z|00039|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.234Z|00040|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.240Z|00041|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.244Z|00042|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.248Z|00043|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.265Z|00044|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.297Z|00045|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (86% CPU usage) 2023-06-27T09:45:55.662Z|00046|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:57.538Z|00047|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:45:59.394Z|00001|ovs_rcu(urcu5)|WARN|blocked 1001 ms waiting for revalidator4 to quiesce 2023-06-27T09:46:00.394Z|00002|ovs_rcu(urcu5)|WARN|blocked 2001 ms waiting for revalidator4 to quiesce 2023-06-27T09:46:01.669Z|00116|ovs_rcu|WARN|blocked 1001 ms waiting for handler3 to quiesce 2023-06-27T09:46:02.394Z|00003|ovs_rcu(urcu5)|WARN|blocked 4002 ms waiting for revalidator4 to quiesce 2023-06-27T09:46:02.669Z|00117|ovs_rcu|WARN|blocked 2001 ms waiting for handler3 to quiesce 2023-06-27T09:46:03.647Z|00048|timeval(handler3)|WARN|Unreasonably long 5250ms poll interval (0ms user, 3ms system) 2023-06-27T09:46:03.647Z|00049|timeval(handler3)|WARN|faults: 14 minor, 0 major 2023-06-27T09:46:03.647Z|00050|timeval(handler3)|WARN|context switches: 16 voluntary, 54 involuntary 2023-06-27T09:46:03.647Z|00051|coverage(handler3)|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=3b0e3a51: 2023-06-27T09:46:03.647Z|00052|coverage(handler3)|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0058/sec total: 21 2023-06-27T09:46:03.647Z|00053|coverage(handler3)|INFO|ofproto_flush 
0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:03.647Z|00054|coverage(handler3)|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:03.647Z|00055|coverage(handler3)|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:03.647Z|00056|coverage(handler3)|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:03.647Z|00057|coverage(handler3)|INFO|rev_mac_learning 0.0/sec 0.000/sec 0.0342/sec total: 125 2023-06-27T09:46:03.647Z|00058|coverage(handler3)|INFO|dumped_duplicate_flow 0.0/sec 0.000/sec 0.0019/sec total: 7 2023-06-27T09:46:03.647Z|00059|coverage(handler3)|INFO|handler_duplicate_upcall 1075.0/sec 103.833/sec 3.1158/sec total: 11667 2023-06-27T09:46:03.647Z|00060|coverage(handler3)|INFO|revalidate_missed_dp_flow 0.0/sec 0.000/sec 0.1072/sec total: 386 2023-06-27T09:46:03.647Z|00061|coverage(handler3)|INFO|upcall_flow_limit_hit 0.0/sec 0.000/sec 158.8639/sec total: 571910 2023-06-27T09:46:03.647Z|00062|coverage(handler3)|INFO|upcall_ukey_replace 1.0/sec 0.267/sec 0.2253/sec total: 811 2023-06-27T09:46:03.647Z|00063|coverage(handler3)|INFO|xlate_actions 8274.6/sec 958.033/sec 249.6086/sec total: 899615 2023-06-27T09:46:03.647Z|00064|coverage(handler3)|INFO|cmap_expand 180.8/sec 42.667/sec 1.4558/sec total: 5241 2023-06-27T09:46:03.647Z|00065|coverage(handler3)|INFO|cmap_shrink 0.0/sec 0.000/sec 0.7383/sec total: 2658 2023-06-27T09:46:03.647Z|00066|coverage(handler3)|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:03.647Z|00067|coverage(handler3)|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0828/sec total: 298 2023-06-27T09:46:03.647Z|00068|coverage(handler3)|INFO|dpif_flow_put 7199.6/sec 854.200/sec 35.8689/sec total: 129702 2023-06-27T09:46:03.647Z|00069|coverage(handler3)|INFO|dpif_flow_del 0.0/sec 0.000/sec 11.8478/sec total: 42652 2023-06-27T09:46:03.647Z|00070|coverage(handler3)|INFO|dpif_execute 8274.6/sec 958.033/sec 187.6797/sec total: 676671 2023-06-27T09:46:03.647Z|00071|coverage(handler3)|INFO|flow_extract 8274.6/sec 958.033/sec 187.6792/sec total: 676669 2023-06-27T09:46:03.647Z|00072|coverage(handler3)|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.1056/sec total: 380 2023-06-27T09:46:03.647Z|00073|coverage(handler3)|INFO|hmap_pathological 0.4/sec 0.033/sec 0.0167/sec total: 60 2023-06-27T09:46:03.647Z|00074|coverage(handler3)|INFO|hmap_expand 4.2/sec 1.450/sec 2.3581/sec total: 8541 2023-06-27T09:46:03.647Z|00075|coverage(handler3)|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0011/sec total: 4 2023-06-27T09:46:03.647Z|00076|coverage(handler3)|INFO|mac_learning_learned 7195.2/sec 853.550/sec 34.9944/sec total: 126554 2023-06-27T09:46:03.647Z|00077|coverage(handler3)|INFO|netdev_get_stats 1.2/sec 1.100/sec 0.7017/sec total: 2532 2023-06-27T09:46:03.647Z|00078|coverage(handler3)|INFO|txn_unchanged 0.0/sec 0.167/sec 0.1192/sec total: 429 2023-06-27T09:46:03.647Z|00079|coverage(handler3)|INFO|txn_incomplete 0.2/sec 0.017/sec 0.0125/sec total: 46 2023-06-27T09:46:03.647Z|00080|coverage(handler3)|INFO|txn_success 0.0/sec 0.000/sec 0.0103/sec total: 38 2023-06-27T09:46:03.647Z|00081|coverage(handler3)|INFO|poll_create_node 502.4/sec 61.717/sec 21.2772/sec total: 76702 2023-06-27T09:46:03.647Z|00082|coverage(handler3)|INFO|poll_zero_timeout 129.8/sec 15.183/sec 3.1792/sec total: 11462 2023-06-27T09:46:03.647Z|00083|coverage(handler3)|INFO|seq_change 308.2/sec 38.433/sec 26.0342/sec total: 93746 2023-06-27T09:46:03.647Z|00084|coverage(handler3)|INFO|pstream_open 0.0/sec 0.000/sec 0.0014/sec total: 
5 2023-06-27T09:46:03.647Z|00085|coverage(handler3)|INFO|stream_open 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:03.647Z|00086|coverage(handler3)|INFO|unixctl_received 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:03.647Z|00087|coverage(handler3)|INFO|unixctl_replied 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:03.647Z|00088|coverage(handler3)|INFO|util_xalloc 62474.2/sec 7427.300/sec 1018.3594/sec total: 3673805 2023-06-27T09:46:03.647Z|00089|coverage(handler3)|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0350/sec total: 126 2023-06-27T09:46:03.647Z|00090|coverage(handler3)|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:03.647Z|00091|coverage(handler3)|INFO|netlink_overflow 1.2/sec 0.133/sec 0.0053/sec total: 19 2023-06-27T09:46:03.647Z|00092|coverage(handler3)|INFO|netlink_received 15500.2/sec 1816.600/sec 279.7611/sec total: 1008748 2023-06-27T09:46:03.647Z|00093|coverage(handler3)|INFO|netlink_recv_jumbo 0.8/sec 15.267/sec 1.6794/sec total: 6050 2023-06-27T09:46:03.647Z|00094|coverage(handler3)|INFO|netlink_sent 15492.2/sec 1815.750/sec 278.7436/sec total: 1005085 2023-06-27T09:46:03.647Z|00095|coverage(handler3)|INFO|nln_changed 0.0/sec 0.000/sec 0.0083/sec total: 30 2023-06-27T09:46:03.647Z|00096|coverage(handler3)|INFO|109 events never hit 2023-06-27T09:46:05.275Z|00097|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:06.330Z|00098|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:07.830Z|00335|ofproto_dpif_upcall(revalidator4)|INFO|Spent an unreasonably long 12236ms dumping flows 2023-06-27T09:46:07.830Z|00336|timeval(revalidator4)|WARN|Unreasonably long 12236ms poll interval (1119ms user, 1670ms system) 2023-06-27T09:46:07.830Z|00337|timeval(revalidator4)|WARN|faults: 981 minor, 0 major 2023-06-27T09:46:07.830Z|00338|timeval(revalidator4)|WARN|context switches: 1140 voluntary, 23446 involuntary 2023-06-27T09:46:07.830Z|00339|coverage(revalidator4)|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=d57648a6: 2023-06-27T09:46:07.830Z|00340|coverage(revalidator4)|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0058/sec total: 21 2023-06-27T09:46:07.830Z|00341|coverage(revalidator4)|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:07.830Z|00342|coverage(revalidator4)|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:07.830Z|00343|coverage(revalidator4)|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:07.831Z|00344|coverage(revalidator4)|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:07.831Z|00345|coverage(revalidator4)|INFO|rev_mac_learning 0.4/sec 0.033/sec 0.0347/sec total: 125 2023-06-27T09:46:07.831Z|00346|coverage(revalidator4)|INFO|dumped_duplicate_flow 0.0/sec 0.000/sec 0.0019/sec total: 155 2023-06-27T09:46:07.831Z|00347|coverage(revalidator4)|INFO|handler_duplicate_upcall 90.0/sec 111.333/sec 3.2408/sec total: 13202 2023-06-27T09:46:07.831Z|00348|coverage(revalidator4)|INFO|revalidate_missed_dp_flow 0.0/sec 0.000/sec 0.1072/sec total: 2067 2023-06-27T09:46:07.831Z|00349|coverage(revalidator4)|INFO|upcall_flow_limit_hit 0.0/sec 0.000/sec 158.8639/sec total: 571910 2023-06-27T09:46:07.831Z|00350|coverage(revalidator4)|INFO|upcall_ukey_replace 0.0/sec 0.267/sec 0.2253/sec total: 811 2023-06-27T09:46:07.831Z|00351|coverage(revalidator4)|INFO|xlate_actions 204.8/sec 975.100/sec 249.8931/sec total: 978953 
2023-06-27T09:46:07.831Z|00352|coverage(revalidator4)|INFO|cmap_expand 0.0/sec 42.667/sec 1.4558/sec total: 5383 2023-06-27T09:46:07.831Z|00353|coverage(revalidator4)|INFO|cmap_shrink 0.0/sec 0.000/sec 0.7383/sec total: 2658 2023-06-27T09:46:07.831Z|00354|coverage(revalidator4)|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:07.831Z|00355|coverage(revalidator4)|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0828/sec total: 628 2023-06-27T09:46:07.831Z|00356|coverage(revalidator4)|INFO|dpif_flow_put 114.8/sec 863.767/sec 36.0283/sec total: 167237 2023-06-27T09:46:07.831Z|00357|coverage(revalidator4)|INFO|dpif_flow_del 0.0/sec 0.000/sec 11.8478/sec total: 42676 2023-06-27T09:46:07.831Z|00358|coverage(revalidator4)|INFO|dpif_execute 204.8/sec 975.100/sec 187.9642/sec total: 692348 2023-06-27T09:46:07.831Z|00359|coverage(revalidator4)|INFO|flow_extract 204.8/sec 975.100/sec 187.9636/sec total: 692346 2023-06-27T09:46:07.831Z|00360|coverage(revalidator4)|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.1056/sec total: 380 2023-06-27T09:46:07.831Z|00361|coverage(revalidator4)|INFO|hmap_pathological 0.0/sec 0.033/sec 0.0167/sec total: 62 2023-06-27T09:46:07.831Z|00362|coverage(revalidator4)|INFO|hmap_expand 10.4/sec 2.217/sec 2.3725/sec total: 8552 2023-06-27T09:46:07.831Z|00363|coverage(revalidator4)|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0011/sec total: 4 2023-06-27T09:46:07.831Z|00364|coverage(revalidator4)|INFO|mac_learning_learned 114.8/sec 863.117/sec 35.1539/sec total: 140696 2023-06-27T09:46:07.831Z|00365|coverage(revalidator4)|INFO|mac_learning_evicted 0.0/sec 0.000/sec 0.0000/sec total: 4305 2023-06-27T09:46:07.831Z|00366|coverage(revalidator4)|INFO|netdev_get_stats 1.2/sec 1.100/sec 0.7033/sec total: 2532 2023-06-27T09:46:07.831Z|00367|coverage(revalidator4)|INFO|txn_unchanged 0.0/sec 0.150/sec 0.1192/sec total: 429 2023-06-27T09:46:07.831Z|00368|coverage(revalidator4)|INFO|txn_incomplete 0.2/sec 0.033/sec 0.0128/sec total: 46 2023-06-27T09:46:07.832Z|00369|coverage(revalidator4)|INFO|txn_success 0.2/sec 0.017/sec 0.0106/sec total: 39 2023-06-27T09:46:07.832Z|00370|coverage(revalidator4)|INFO|poll_create_node 20.8/sec 62.850/sec 21.3061/sec total: 78112 2023-06-27T09:46:07.832Z|00371|coverage(revalidator4)|INFO|poll_zero_timeout 3.4/sec 15.467/sec 3.1839/sec total: 11714 2023-06-27T09:46:07.832Z|00372|coverage(revalidator4)|INFO|seq_change 4.6/sec 38.800/sec 26.0406/sec total: 97597 2023-06-27T09:46:07.832Z|00373|coverage(revalidator4)|INFO|pstream_open 0.0/sec 0.000/sec 0.0014/sec total: 5 2023-06-27T09:46:07.832Z|00374|coverage(revalidator4)|INFO|stream_open 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:07.832Z|00375|coverage(revalidator4)|INFO|unixctl_received 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:07.832Z|00376|coverage(revalidator4)|INFO|unixctl_replied 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:07.832Z|00377|coverage(revalidator4)|INFO|util_xalloc 1542.2/sec 7551.050/sec 1020.5014/sec total: 4142565 2023-06-27T09:46:07.832Z|00378|coverage(revalidator4)|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0350/sec total: 126 2023-06-27T09:46:07.832Z|00379|coverage(revalidator4)|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:07.833Z|00380|coverage(revalidator4)|INFO|netlink_overflow 0.0/sec 0.133/sec 0.0053/sec total: 21 2023-06-27T09:46:07.834Z|00381|coverage(revalidator4)|INFO|netlink_received 321.6/sec 1843.233/sec 280.2078/sec total: 1133154 
2023-06-27T09:46:07.834Z|00382|coverage(revalidator4)|INFO|netlink_recv_jumbo 0.8/sec 15.267/sec 1.6806/sec total: 6055 2023-06-27T09:46:07.834Z|00383|coverage(revalidator4)|INFO|netlink_sent 321.6/sec 1842.383/sec 279.1903/sec total: 1128952 2023-06-27T09:46:07.834Z|00384|coverage(revalidator4)|INFO|nln_changed 0.0/sec 0.000/sec 0.0083/sec total: 30 2023-06-27T09:46:07.834Z|00385|coverage(revalidator4)|INFO|108 events never hit 2023-06-27T09:46:07.878Z|00386|ofproto_dpif_upcall(revalidator4)|WARN|Number of datapath flows (68854) twice as high as current dynamic flow limit (14698). Starting to delete flows unconditionally as an emergency measure. 2023-06-27T09:46:08.010Z|00099|ofproto_dpif_upcall(handler3)|WARN|Dropped 570375 log messages in last 1624 seconds (most recently, 1611 seconds ago) due to excessive rate 2023-06-27T09:46:08.010Z|00100|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 7349 flows. 2023-06-27T09:46:08.011Z|00101|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 7349 flows. 2023-06-27T09:46:08.012Z|00102|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 7349 flows. 2023-06-27T09:46:08.013Z|00103|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 7349 flows. 2023-06-27T09:46:08.014Z|00104|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 7349 flows. 2023-06-27T09:46:08.203Z|00105|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:09.953Z|00106|timeval(handler3)|WARN|Unreasonably long 1731ms poll interval (0ms user, 0ms system) 2023-06-27T09:46:09.953Z|00107|timeval(handler3)|WARN|context switches: 0 voluntary, 18 involuntary 2023-06-27T09:46:09.954Z|00108|coverage(handler3)|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=2bdec87d: 2023-06-27T09:46:09.954Z|00109|coverage(handler3)|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0058/sec total: 21 2023-06-27T09:46:09.954Z|00110|coverage(handler3)|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:09.954Z|00111|coverage(handler3)|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:09.954Z|00112|coverage(handler3)|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:09.954Z|00113|coverage(handler3)|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:09.954Z|00114|coverage(handler3)|INFO|rev_mac_learning 0.0/sec 0.033/sec 0.0347/sec total: 125 2023-06-27T09:46:09.954Z|00115|coverage(handler3)|INFO|dumped_duplicate_flow 29.6/sec 2.467/sec 0.0431/sec total: 155 2023-06-27T09:46:09.954Z|00116|coverage(handler3)|INFO|handler_duplicate_upcall 672.8/sec 167.400/sec 4.1753/sec total: 15031 2023-06-27T09:46:09.954Z|00117|coverage(handler3)|INFO|revalidate_missed_dp_flow 336.2/sec 28.017/sec 0.5742/sec total: 2067 2023-06-27T09:46:09.954Z|00118|coverage(handler3)|INFO|upcall_flow_limit_hit 492.4/sec 41.033/sec 159.5478/sec total: 574372 2023-06-27T09:46:09.954Z|00119|coverage(handler3)|INFO|upcall_ukey_replace 0.0/sec 0.267/sec 0.2253/sec total: 811 2023-06-27T09:46:09.954Z|00120|coverage(handler3)|INFO|xlate_actions 17310.0/sec 2417.600/sec 273.9347/sec total: 986165 2023-06-27T09:46:09.954Z|00121|coverage(handler3)|INFO|cmap_expand 48.4/sec 46.700/sec 1.5231/sec total: 5483 2023-06-27T09:46:09.954Z|00122|coverage(handler3)|INFO|cmap_shrink 0.0/sec 0.000/sec 0.7383/sec total: 2658 
2023-06-27T09:46:09.954Z|00123|coverage(handler3)|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:09.954Z|00124|coverage(handler3)|INFO|dpif_flow_get 66.0/sec 5.500/sec 0.1744/sec total: 628 2023-06-27T09:46:09.954Z|00125|coverage(handler3)|INFO|dpif_flow_put 8091.2/sec 1538.033/sec 47.2661/sec total: 170158 2023-06-27T09:46:09.954Z|00126|coverage(handler3)|INFO|dpif_flow_del 4.8/sec 0.400/sec 11.8544/sec total: 42676 2023-06-27T09:46:09.954Z|00127|coverage(handler3)|INFO|dpif_execute 4577.8/sec 1356.583/sec 194.3222/sec total: 699560 2023-06-27T09:46:09.954Z|00128|coverage(handler3)|INFO|flow_extract 4577.8/sec 1356.583/sec 194.3217/sec total: 699558 2023-06-27T09:46:09.954Z|00129|coverage(handler3)|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.1056/sec total: 380 2023-06-27T09:46:09.954Z|00130|coverage(handler3)|INFO|hmap_pathological 0.4/sec 0.067/sec 0.0172/sec total: 62 2023-06-27T09:46:09.954Z|00131|coverage(handler3)|INFO|hmap_expand 2.2/sec 2.300/sec 2.3756/sec total: 8552 2023-06-27T09:46:09.954Z|00132|coverage(handler3)|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0011/sec total: 4 2023-06-27T09:46:09.954Z|00133|coverage(handler3)|INFO|mac_learning_learned 3412.6/sec 1147.500/sec 39.8936/sec total: 143617 2023-06-27T09:46:09.954Z|00134|coverage(handler3)|INFO|mac_learning_evicted 1445.2/sec 120.433/sec 2.0072/sec total: 7226 2023-06-27T09:46:09.954Z|00135|coverage(handler3)|INFO|netdev_get_stats 0.0/sec 1.000/sec 0.7033/sec total: 2532 2023-06-27T09:46:09.954Z|00136|coverage(handler3)|INFO|txn_unchanged 0.0/sec 0.133/sec 0.1192/sec total: 429 2023-06-27T09:46:09.954Z|00137|coverage(handler3)|INFO|txn_incomplete 0.0/sec 0.033/sec 0.0128/sec total: 46 2023-06-27T09:46:09.954Z|00138|coverage(handler3)|INFO|txn_success 0.2/sec 0.033/sec 0.0108/sec total: 39 2023-06-27T09:46:09.954Z|00139|coverage(handler3)|INFO|poll_create_node 342.0/sec 90.750/sec 21.7811/sec total: 78412 2023-06-27T09:46:09.954Z|00140|coverage(handler3)|INFO|poll_zero_timeout 82.8/sec 22.367/sec 3.2989/sec total: 11876 2023-06-27T09:46:09.954Z|00141|coverage(handler3)|INFO|seq_change 820.0/sec 107.117/sec 27.1794/sec total: 97846 2023-06-27T09:46:09.954Z|00142|coverage(handler3)|INFO|pstream_open 0.0/sec 0.000/sec 0.0014/sec total: 5 2023-06-27T09:46:09.954Z|00143|coverage(handler3)|INFO|stream_open 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:09.954Z|00144|coverage(handler3)|INFO|unixctl_received 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:09.954Z|00145|coverage(handler3)|INFO|unixctl_replied 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:09.954Z|00146|coverage(handler3)|INFO|util_xalloc 101159.6/sec 15976.250/sec 1161.0008/sec total: 4179603 2023-06-27T09:46:09.954Z|00147|coverage(handler3)|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0350/sec total: 126 2023-06-27T09:46:09.954Z|00148|coverage(handler3)|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:09.954Z|00149|coverage(handler3)|INFO|netlink_overflow 0.6/sec 0.183/sec 0.0061/sec total: 22 2023-06-27T09:46:09.954Z|00150|coverage(handler3)|INFO|netlink_received 26912.0/sec 4085.733/sec 317.5856/sec total: 1143308 2023-06-27T09:46:09.954Z|00151|coverage(handler3)|INFO|netlink_recv_jumbo 1.0/sec 15.283/sec 1.6819/sec total: 6055 2023-06-27T09:46:09.954Z|00152|coverage(handler3)|INFO|netlink_sent 26802.8/sec 4075.783/sec 316.4164/sec total: 1139099 2023-06-27T09:46:09.954Z|00153|coverage(handler3)|INFO|nln_changed 0.0/sec 0.000/sec 0.0083/sec total: 30 
2023-06-27T09:46:09.954Z|00154|coverage(handler3)|INFO|108 events never hit 2023-06-27T09:46:14.654Z|00118|ovs_rcu|WARN|blocked 1000 ms waiting for revalidator4 to quiesce 2023-06-27T09:46:15.654Z|00119|ovs_rcu|WARN|blocked 2000 ms waiting for revalidator4 to quiesce 2023-06-27T09:46:15.954Z|00155|poll_loop(handler3)|INFO|Dropped 207 log messages in last 20 seconds (most recently, 17 seconds ago) due to excessive rate 2023-06-27T09:46:15.955Z|00156|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:46:15.955Z|00157|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:46:15.956Z|00158|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:46:16.057Z|00387|ofproto_dpif_upcall(revalidator4)|INFO|Spent an unreasonably long 8181ms dumping flows 2023-06-27T09:46:16.057Z|00388|timeval(revalidator4)|WARN|Unreasonably long 8223ms poll interval (915ms user, 1513ms system) 2023-06-27T09:46:16.057Z|00389|timeval(revalidator4)|WARN|faults: 241 minor, 0 major 2023-06-27T09:46:16.057Z|00390|timeval(revalidator4)|WARN|context switches: 426 voluntary, 59285 involuntary 2023-06-27T09:46:16.058Z|00391|coverage(revalidator4)|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=e4598588: 2023-06-27T09:46:16.058Z|00392|coverage(revalidator4)|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0058/sec total: 21 2023-06-27T09:46:16.058Z|00393|coverage(revalidator4)|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:16.058Z|00394|coverage(revalidator4)|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:16.058Z|00395|coverage(revalidator4)|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-06-27T09:46:16.058Z|00396|coverage(revalidator4)|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:16.058Z|00397|coverage(revalidator4)|INFO|rev_mac_learning 0.4/sec 0.067/sec 0.0353/sec total: 127 2023-06-27T09:46:16.058Z|00398|coverage(revalidator4)|INFO|dumped_duplicate_flow 0.0/sec 2.467/sec 0.0431/sec total: 155 2023-06-27T09:46:16.058Z|00399|coverage(revalidator4)|INFO|handler_duplicate_upcall 0.0/sec 167.400/sec 4.1753/sec total: 15031 2023-06-27T09:46:16.058Z|00400|coverage(revalidator4)|INFO|revalidate_missed_dp_flow 0.0/sec 28.017/sec 0.5742/sec total: 2067 2023-06-27T09:46:16.058Z|00401|coverage(revalidator4)|INFO|upcall_flow_limit_hit 33709.2/sec 2850.133/sec 206.3661/sec total: 904421 2023-06-27T09:46:16.058Z|00402|coverage(revalidator4)|INFO|upcall_flow_limit_kill 0.0/sec 0.000/sec 0.0000/sec total: 1091 2023-06-27T09:46:16.058Z|00403|coverage(revalidator4)|INFO|upcall_ukey_replace 0.0/sec 0.267/sec 0.2253/sec total: 811 2023-06-27T09:46:16.058Z|00404|coverage(revalidator4)|INFO|xlate_actions 33709.2/sec 5226.700/sec 320.7531/sec total: 1335513 2023-06-27T09:46:16.058Z|00405|coverage(revalidator4)|INFO|cmap_expand 0.0/sec 46.700/sec 1.5231/sec total: 5483 2023-06-27T09:46:16.058Z|00406|coverage(revalidator4)|INFO|cmap_shrink 0.0/sec 0.000/sec 0.7383/sec total: 3355 2023-06-27T09:46:16.058Z|00407|coverage(revalidator4)|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:16.058Z|00408|coverage(revalidator4)|INFO|dpif_flow_get 0.0/sec 5.500/sec 0.1744/sec total: 628 2023-06-27T09:46:16.058Z|00409|coverage(revalidator4)|INFO|dpif_flow_put 0.0/sec 1538.033/sec 47.2661/sec total: 
170158 2023-06-27T09:46:16.058Z|00410|coverage(revalidator4)|INFO|dpif_flow_del 0.0/sec 0.400/sec 11.8544/sec total: 98972 2023-06-27T09:46:16.058Z|00411|coverage(revalidator4)|INFO|dpif_execute 33709.2/sec 4165.683/sec 241.1406/sec total: 1028231 2023-06-27T09:46:16.058Z|00412|coverage(revalidator4)|INFO|flow_extract 33709.2/sec 4165.683/sec 241.1400/sec total: 1028229 2023-06-27T09:46:16.059Z|00413|coverage(revalidator4)|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.1056/sec total: 380 2023-06-27T09:46:16.059Z|00414|coverage(revalidator4)|INFO|hmap_pathological 0.0/sec 0.067/sec 0.0172/sec total: 62 2023-06-27T09:46:16.059Z|00415|coverage(revalidator4)|INFO|hmap_expand 15.0/sec 3.450/sec 2.3964/sec total: 8627 2023-06-27T09:46:16.059Z|00416|coverage(revalidator4)|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0011/sec total: 4 2023-06-27T09:46:16.059Z|00417|coverage(revalidator4)|INFO|mac_learning_learned 60.6/sec 1152.550/sec 39.9778/sec total: 144023 2023-06-27T09:46:16.059Z|00418|coverage(revalidator4)|INFO|mac_learning_evicted 60.6/sec 125.483/sec 2.0914/sec total: 7632 2023-06-27T09:46:16.059Z|00419|coverage(revalidator4)|INFO|netdev_get_stats 2.4/sec 1.100/sec 0.7067/sec total: 2544 2023-06-27T09:46:16.059Z|00420|coverage(revalidator4)|INFO|txn_unchanged 0.0/sec 0.117/sec 0.1192/sec total: 429 2023-06-27T09:46:16.059Z|00421|coverage(revalidator4)|INFO|txn_incomplete 0.4/sec 0.067/sec 0.0133/sec total: 48 2023-06-27T09:46:16.059Z|00422|coverage(revalidator4)|INFO|txn_success 0.2/sec 0.050/sec 0.0111/sec total: 40 2023-06-27T09:46:16.059Z|00423|coverage(revalidator4)|INFO|poll_create_node 1057.4/sec 178.267/sec 23.2497/sec total: 88995 2023-06-27T09:46:16.059Z|00424|coverage(revalidator4)|INFO|poll_zero_timeout 527.6/sec 66.333/sec 4.0317/sec total: 17018 2023-06-27T09:46:16.059Z|00425|coverage(revalidator4)|INFO|seq_change 529.0/sec 151.183/sec 27.9142/sec total: 105443 2023-06-27T09:46:16.059Z|00426|coverage(revalidator4)|INFO|pstream_open 0.0/sec 0.000/sec 0.0014/sec total: 5 2023-06-27T09:46:16.059Z|00427|coverage(revalidator4)|INFO|stream_open 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-06-27T09:46:16.059Z|00428|coverage(revalidator4)|INFO|unixctl_received 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:16.059Z|00429|coverage(revalidator4)|INFO|unixctl_replied 0.0/sec 0.000/sec 0.0103/sec total: 37 2023-06-27T09:46:16.059Z|00430|coverage(revalidator4)|INFO|util_xalloc 103784.6/sec 24620.200/sec 1305.1461/sec total: 5418626 2023-06-27T09:46:16.059Z|00431|coverage(revalidator4)|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0350/sec total: 126 2023-06-27T09:46:16.059Z|00432|coverage(revalidator4)|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-06-27T09:46:16.059Z|00433|coverage(revalidator4)|INFO|netlink_overflow 4.4/sec 0.550/sec 0.0122/sec total: 63 2023-06-27T09:46:16.059Z|00434|coverage(revalidator4)|INFO|netlink_received 33731.8/sec 6896.550/sec 364.4353/sec total: 1585212 2023-06-27T09:46:16.059Z|00435|coverage(revalidator4)|INFO|netlink_recv_jumbo 1.6/sec 15.350/sec 1.6842/sec total: 6064 2023-06-27T09:46:16.059Z|00436|coverage(revalidator4)|INFO|netlink_sent 33725.6/sec 6886.083/sec 363.2575/sec total: 1580548 2023-06-27T09:46:16.059Z|00437|coverage(revalidator4)|INFO|nln_changed 0.0/sec 0.000/sec 0.0083/sec total: 30 2023-06-27T09:46:16.060Z|00438|coverage(revalidator4)|INFO|107 events never hit 2023-06-27T09:46:19.127Z|00159|dpif_netlink(handler3)|WARN|Dropped 46 log messages in last 9 seconds (most recently, 0 seconds ago) due to excessive rate 
2023-06-27T09:46:19.127Z|00160|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:21.955Z|00161|poll_loop(handler3)|INFO|Dropped 5876 log messages in last 6 seconds (most recently, 3 seconds ago) due to excessive rate 2023-06-27T09:46:21.955Z|00162|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:46:25.216Z|00163|poll_loop(handler3)|INFO|Dropped 4365 log messages in last 4 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:46:25.216Z|00164|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (76% CPU usage) 2023-06-27T09:46:31.071Z|00165|dpif_netlink(handler3)|WARN|Dropped 98 log messages in last 12 seconds (most recently, 1 seconds ago) due to excessive rate 2023-06-27T09:46:31.071Z|00166|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:31.215Z|00167|poll_loop(handler3)|INFO|Dropped 20068 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:46:31.215Z|00168|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (53% CPU usage) 2023-06-27T09:46:37.216Z|00169|poll_loop(handler3)|INFO|Dropped 8052 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:46:37.216Z|00170|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (76% CPU usage) 2023-06-27T09:46:43.684Z|00171|dpif_netlink(handler3)|WARN|Dropped 99 log messages in last 12 seconds (most recently, 2 seconds ago) due to excessive rate 2023-06-27T09:46:43.684Z|00172|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:46.153Z|00173|poll_loop(handler3)|INFO|Dropped 20353 log messages in last 9 seconds (most recently, 4 seconds ago) due to excessive rate 2023-06-27T09:46:46.154Z|00174|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (61% CPU usage) 2023-06-27T09:46:49.216Z|00175|poll_loop(handler3)|INFO|Dropped 3997 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:46:49.219Z|00176|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (74% CPU usage) 2023-06-27T09:46:55.558Z|00177|dpif_netlink(handler3)|WARN|Dropped 85 log messages in last 12 seconds (most recently, 2 seconds ago) due to excessive rate 2023-06-27T09:46:55.558Z|00178|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:46:58.524Z|00179|poll_loop(handler3)|INFO|Dropped 2181 log messages in last 9 seconds (most recently, 6 seconds ago) due to excessive rate 2023-06-27T09:46:58.525Z|00180|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:47:01.219Z|00181|poll_loop(handler3)|INFO|Dropped 3472 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:01.219Z|00182|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:47:07.216Z|00183|poll_loop(handler3)|INFO|Dropped 13128 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:07.216Z|00184|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (74% CPU usage) 
2023-06-27T09:47:07.935Z|00185|dpif_netlink(handler3)|WARN|Dropped 98 log messages in last 12 seconds (most recently, 2 seconds ago) due to excessive rate 2023-06-27T09:47:07.935Z|00186|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:47:08.011Z|00187|ofproto_dpif_upcall(handler3)|WARN|Dropped 3925670 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:08.011Z|00188|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 4000 flows. 2023-06-27T09:47:13.216Z|00189|poll_loop(handler3)|INFO|Dropped 9026 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:13.216Z|00190|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (65% CPU usage) 2023-06-27T09:47:19.215Z|00191|poll_loop(handler3)|INFO|Dropped 9501 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:19.216Z|00192|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (75% CPU usage) 2023-06-27T09:47:20.360Z|00193|dpif_netlink(handler3)|WARN|Dropped 73 log messages in last 12 seconds (most recently, 3 seconds ago) due to excessive rate 2023-06-27T09:47:20.360Z|00194|dpif_netlink(handler3)|WARN|system@ovs-system: lost packet on handler 0 2023-06-27T09:47:25.216Z|00195|poll_loop(handler3)|INFO|Dropped 10557 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:25.216Z|00196|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (92% CPU usage) 2023-06-27T09:47:31.239Z|00197|poll_loop(handler3)|INFO|Dropped 8927 log messages in last 6 seconds (most recently, 1 seconds ago) due to excessive rate 2023-06-27T09:47:31.239Z|00198|poll_loop(handler3)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (99% CPU usage) 2023-06-27T09:47:37.216Z|00199|poll_loop(handler3)|INFO|Dropped 9746 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:37.216Z|00200|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:47:43.604Z|00201|poll_loop(handler3)|INFO|Dropped 6880 log messages in last 6 seconds (most recently, 1 seconds ago) due to excessive rate 2023-06-27T09:47:43.604Z|00202|poll_loop(handler3)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (71% CPU usage) 2023-06-27T09:47:49.216Z|00203|poll_loop(handler3)|INFO|Dropped 11062 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:47:49.216Z|00204|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (70% CPU usage) 2023-06-27T09:47:55.962Z|00205|poll_loop(handler3)|INFO|Dropped 7519 log messages in last 7 seconds (most recently, 1 seconds ago) due to excessive rate 2023-06-27T09:47:55.962Z|00206|poll_loop(handler3)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (72% CPU usage) 2023-06-27T09:48:01.216Z|00207|poll_loop(handler3)|INFO|Dropped 11401 log messages in last 5 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:01.217Z|00208|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 
2023-06-27T09:48:07.216Z|00209|poll_loop(handler3)|INFO|Dropped 8484 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:07.216Z|00210|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:48:08.315Z|00211|ofproto_dpif_upcall(handler3)|WARN|Dropped 4300776 log messages in last 60 seconds (most recently, 1 seconds ago) due to excessive rate 2023-06-27T09:48:08.315Z|00212|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 4000 flows. 2023-06-27T09:48:13.216Z|00213|poll_loop(handler3)|INFO|Dropped 10109 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:13.216Z|00214|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:48:19.216Z|00215|poll_loop(handler3)|INFO|Dropped 8278 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:19.216Z|00216|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:48:25.216Z|00217|poll_loop(handler3)|INFO|Dropped 8677 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:25.216Z|00218|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:48:31.216Z|00219|poll_loop(handler3)|INFO|Dropped 8537 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:31.216Z|00220|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (72% CPU usage) 2023-06-27T09:48:37.363Z|00221|poll_loop(handler3)|INFO|Dropped 10234 log messages in last 6 seconds (most recently, 3 seconds ago) due to excessive rate 2023-06-27T09:48:37.363Z|00222|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:48:43.216Z|00223|poll_loop(handler3)|INFO|Dropped 7198 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:43.218Z|00224|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (71% CPU usage) 2023-06-27T09:48:52.362Z|00225|poll_loop(handler3)|INFO|Dropped 10608 log messages in last 9 seconds (most recently, 6 seconds ago) due to excessive rate 2023-06-27T09:48:52.363Z|00226|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:48:55.215Z|00227|poll_loop(handler3)|INFO|Dropped 29364 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:48:55.216Z|00228|poll_loop(handler3)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (54% CPU usage) 2023-06-27T09:49:04.362Z|00229|poll_loop(handler3)|INFO|Dropped 22998 log messages in last 9 seconds (most recently, 6 seconds ago) due to excessive rate 2023-06-27T09:49:04.363Z|00230|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:49:07.215Z|00231|poll_loop(handler3)|INFO|Dropped 29388 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:49:07.216Z|00232|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 
2023-06-27T09:49:08.009Z|00233|ofproto_dpif_upcall(handler3)|WARN|Dropped 3202483 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:49:08.009Z|00234|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 4000 flows. 2023-06-27T09:49:16.362Z|00235|poll_loop(handler3)|INFO|Dropped 22186 log messages in last 9 seconds (most recently, 6 seconds ago) due to excessive rate 2023-06-27T09:49:16.363Z|00236|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:49:19.215Z|00237|poll_loop(handler3)|INFO|Dropped 28729 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:49:19.216Z|00238|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:49:28.503Z|00239|poll_loop(handler3)|INFO|Dropped 23634 log messages in last 9 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:49:28.503Z|00240|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:49:31.215Z|00241|poll_loop(handler3)|INFO|Dropped 30119 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:49:31.215Z|00242|poll_loop(handler3)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (54% CPU usage) 2023-06-27T09:49:40.869Z|00243|poll_loop(handler3)|INFO|Dropped 25783 log messages in last 9 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:49:40.869Z|00244|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:49:43.215Z|00245|poll_loop(handler3)|INFO|Dropped 23382 log messages in last 3 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:49:43.216Z|00246|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:49:53.210Z|00247|poll_loop(handler3)|INFO|Dropped 32457 log messages in last 10 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:49:53.211Z|00248|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:49:55.215Z|00249|poll_loop(handler3)|INFO|Dropped 20361 log messages in last 2 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:49:55.215Z|00250|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:50:05.558Z|00251|poll_loop(handler3)|INFO|Dropped 35900 log messages in last 10 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:50:05.558Z|00252|poll_loop(handler3)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (55% CPU usage) 2023-06-27T09:50:07.215Z|00253|poll_loop(handler3)|INFO|Dropped 17728 log messages in last 2 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:50:07.215Z|00254|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:50:08.009Z|00255|ofproto_dpif_upcall(handler3)|WARN|Dropped 2167622 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:50:08.009Z|00256|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic 
limit of 4000 flows. 2023-06-27T09:50:17.914Z|00257|poll_loop(handler3)|INFO|Dropped 39355 log messages in last 10 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:50:17.914Z|00258|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:50:19.216Z|00259|poll_loop(handler3)|INFO|Dropped 13703 log messages in last 2 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:50:19.216Z|00260|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:50:30.325Z|00261|poll_loop(handler3)|INFO|Dropped 32619 log messages in last 11 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:50:30.326Z|00262|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (53% CPU usage) 2023-06-27T09:50:31.215Z|00263|poll_loop(handler3)|INFO|Dropped 9403 log messages in last 1 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:50:31.215Z|00264|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (53% CPU usage) 2023-06-27T09:50:42.692Z|00265|poll_loop(handler3)|INFO|Dropped 46965 log messages in last 11 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:50:42.693Z|00266|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:50:43.215Z|00267|poll_loop(handler3)|INFO|Dropped 6036 log messages in last 1 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:50:43.215Z|00268|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (55% CPU usage) 2023-06-27T09:50:55.028Z|00269|poll_loop(handler3)|INFO|Dropped 50795 log messages in last 12 seconds (most recently, 7 seconds ago) due to excessive rate 2023-06-27T09:50:55.028Z|00270|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:50:55.215Z|00271|poll_loop(handler3)|INFO|Dropped 2143 log messages in last 0 seconds (most recently, 0 seconds ago) due to excessive rate 2023-06-27T09:50:55.215Z|00272|poll_loop(handler3)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (54% CPU usage) 2023-06-27T09:56:44.990Z|00273|ofproto_dpif_upcall(handler3)|WARN|Dropped 1913123 log messages in last 397 seconds (most recently, 345 seconds ago) due to excessive rate 2023-06-27T09:56:44.990Z|00274|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 4000 flows. 2023-06-27T09:56:44.990Z|00275|ofproto_dpif_upcall(handler3)|WARN|upcall: datapath reached the dynamic limit of 4000 flows.
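
(For anyone re-running this: a simple watch loop like the one below is enough to see the dynamic limit collapse from 88192 towards 4000 and to correlate it with the dump duration and the upcall_flow_limit_hit / upcall_flow_limit_kill counters that appear in the coverage output above. This is only a sketch, not part of the test scripts.)

# watch the dynamic flow limit, dump duration, and flow-limit coverage counters
while true; do
    date
    ovs-appctl upcall/show
    ovs-appctl coverage/show | grep -E 'upcall_flow_limit|dpif_flow_put|dpif_flow_del'
    sleep 5
done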
So when looking at the system behavior during the times OVS reports long poll intervals, it looks like the kernel is oversubscribed: there is more work than the CPU threads can handle. The system only has two CPU threads (one on each NUMA node) assigned to the kernel and the OVS threads; all the other threads are isolated with tuned's realtime-virtual-host profile. I am not sure why this is configured, so when testing I added more threads to the kernel. I removed threads 48 through 55 from the isolated list and assigned them to the kernel. After a reboot I no longer see any problems with the test cases, and all revalidator runs are within 500ms. However, with default parameters this results in 3 revalidator threads. To make the test more equal, I also gave it only 1 revalidator, and it still passed. This is done with 'ovs-vsctl set Open_vSwitch . other-config:n-revalidator-threads=1' after the script reconfigures OVS (see the sketch following the statistics below). So the conclusion is that we need to increase the number of threads available for the kernel and OVS. I think even MicroShift has 4 threads for the kernel and OVS.

Here are some of the debug outputs showing OVS fighting for resources. In this sample a revalidator run takes 1 second, and OVS threads are waiting to be put on the CPU for more than 1.4 seconds within a 1-second window (see the THREAD READY STATISTICS). There might be other (kernel) threads also fighting for the CPU time:

# Start sampling (trigger@76250485999291) @2023-07-07T09:01:29.411088 (13:01:29 UTC)
# Stop sampling (trigger@76251507372624) @2023-07-07T09:01:30.432529 (13:01:30 UTC)
# Triggered sample dump, stop-start delta 1,021,373,333 ns @2023-07-07T09:01:30.432619 (13:01:30 UTC)

TID THREAD <RESOURCE SPECIFIC>
---------- ---------------- ----------------------------------------------------------------------------
4160013 <unknown:4157952/4160013>
[SYSCALL STATISTICS]
NAME NUMBER COUNT TOTAL ns MAX ns
madvise 28 1 3,937 3,937
futex 202 1 1,382 1,382
TOTAL( - poll): 2 5,319
[THREAD RUN STATISTICS]
SCHED_CNT TOTAL ns MIN ns MAX ns
2 56,270 24,714 31,556
[THREAD READY STATISTICS]
SCHED_CNT TOTAL ns MAX ns
2 118,885 116,490

4157954 handler1
[SYSCALL STATISTICS]
NAME NUMBER COUNT TOTAL ns MAX ns
sendmsg 46 2955 643,299,950 134,692,681
futex 202 515 292,299,879 157,601,468
recvmsg 47 5771 14,259,408 3,503,784
mprotect 10 1338 5,278,404 50,088
write 1 60 1,157,917 60,486
poll 7 88 55,768 1,282
getrusage 98 44 31,256 1,197
TOTAL( - poll): 10683 956,326,814
[THREAD RUN STATISTICS]
SCHED_CNT TOTAL ns MIN ns MAX ns
3419 289,917,455 2,791 9,979,606
[THREAD READY STATISTICS]
SCHED_CNT TOTAL ns MAX ns
3418 668,495,620 157,596,608

4157952 ovs-vswitchd
[SYSCALL STATISTICS]
NAME NUMBER COUNT TOTAL ns MAX ns
poll 7 416 712,171,369 239,808,423
futex 202 805 102,034,906 16,712,043
sendto 44 3 13,529,652 13,024,428
getdents64 217 50 4,794,491 4,187,097
read 0 417 3,869,498 3,486,836
accept 43 126 1,403,232 819,416
getrusage 98 417 321,120 32,248
write 1 26 303,700 22,986
sendmsg 46 10 248,124 44,550
openat 257 25 214,149 12,859
recvmsg 47 212 204,377 4,867
recvfrom 45 29 40,492 4,703
fstat 5 25 31,554 1,495
close 3 26 24,193 1,635
fcntl 72 2 2,177 1,536
TOTAL( - poll): 2173 127,021,665
[THREAD RUN STATISTICS]
SCHED_CNT TOTAL ns MIN ns MAX ns
1276 137,654,258 2,793 10,514,708
[THREAD READY STATISTICS]
SCHED_CNT TOTAL ns MAX ns
1276 106,112,824 16,695,005

4157957 revalidator5
[SYSCALL STATISTICS]
NAME NUMBER COUNT TOTAL ns MAX ns
write 1 1072 257,549,046 11,645,365
recvmsg 47 240 147,566,491 11,370,868
futex 202 1261 108,237,237 13,309,978
sendto 44 9 3,500,185 1,789,136
mprotect 10 128 567,083 23,619
sendmsg 46 9 86,170 11,205
TOTAL( - poll): 2719 517,506,212
[THREAD RUN STATISTICS]
SCHED_CNT TOTAL ns MIN ns MAX ns
3376 290,156,705 3,088 5,042,013
[THREAD READY STATISTICS]
SCHED_CNT TOTAL ns MAX ns
3377 730,563,978 166,652,188

4157958 urcu4
[SYSCALL STATISTICS]
NAME NUMBER COUNT TOTAL ns MAX ns
poll 7 807 943,998,370 179,496,065
futex 202 1512 70,932,980 7,966,602
read 0 768 911,434 2,954
getrusage 98 768 641,360 30,354
mprotect 10 6 47,966 10,180
write 1 3 7,226 2,543
TOTAL( - poll): 3057 72,540,966
[THREAD RUN STATISTICS]
SCHED_CNT TOTAL ns MIN ns MAX ns
2284 12,374,419 2,208 19,872
[THREAD READY STATISTICS]
SCHED_CNT TOTAL ns MAX ns
2284 4,317,868 188,755
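For reference, a minimal sketch of the reconfiguration described above; the isolated_cores value and the exact commands are illustrative assumptions based on the description, not the verified commands from this run:

# vi /etc/tuned/realtime-virtual-host-variables.conf    (drop 48-55 from the isolated set, e.g. isolated_cores=2-47)
# tuned-adm profile realtime-virtual-host
# reboot
After the test script has reconfigured OVS, force a single revalidator:
# ovs-vsctl set Open_vSwitch . other_config:n-revalidator-threads=1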
Hello Eelco,

Many thanks for your debugging. I need to explain why the CPU isolation is configured as seen.

> The system only has two CPU threads (one on each NUMA node) assigned to the kernel and the OVS threads; all the other threads are isolated with tuned's realtime-virtual-host profile.

In the realtime-virtual-host profile, the default isolation strategy is to reserve 1 core per socket for housekeeping and isolate the rest.

# cat /etc/tuned/realtime-virtual-host-variables.conf
#
# Variable settings below override the definitions from the
# /etc/tuned/realtime-variables.conf file.
#
# Examples:
# isolated_cores=2,4-7
# isolated_cores=2-23
#
# Reserve 1 core per socket for housekeeping, isolate the rest.
isolated_cores=${f:calc_isolated_cores:1}

> So the conclusion is that we need to increase the number of threads available for the kernel and OVS. I think even MicroShift has 4 threads for the kernel and OVS.

If I'm misunderstanding anything, please let me know. Current Open vSwitch hardware offload needs more CPU resources to handle the 10k-flow (or higher) test scenarios, because OVS has to offload 10k or more flows to hardware, and the handler and revalidator threads responsible for the offload are now running on the housekeeping CPUs. From the kernel-rt point of view, I would not recommend mixing the OVS threads (handler + revalidator) with the kernel threads: in the kernel-rt CPU isolation strategy, the housekeeping CPUs are only meant to provide the baseline CPU capacity the kernel needs to run.

Is there a configuration parameter for the Open vSwitch handler or revalidator threads so that they can run on one or more isolated CPUs? I know that Open vSwitch can adjust the number of threads through n-revalidator-threads and n-handler-threads. If there are other reasons why the OVS threads have to stay on the housekeeping CPUs, I will change the default isolation strategy.
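For reference, the two knobs mentioned above live in other_config; a minimal sketch (the values are only examples):

# ovs-vsctl set Open_vSwitch . other_config:n-handler-threads=4
# ovs-vsctl set Open_vSwitch . other_config:n-revalidator-threads=2
# ovs-vsctl remove Open_vSwitch . other_config n-handler-threads    (revert to the default)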
(In reply to mhou from comment #4)

> > The system only has two CPU threads (one on each NUMA node) assigned to the kernel and the OVS threads; all the other threads are isolated with tuned's realtime-virtual-host profile.
> In the realtime-virtual-host profile, the default isolation strategy is to reserve 1 core per socket for housekeeping and isolate the rest.

I guess OVS needs more CPU during flow setup because of the 40K upcalls that arrive quickly. This results in the revalidator scaling down the dynamic datapath flow limit, which in turn causes more packets to miss in the datapath and hit the upcall path.

> > So the conclusion is that we need to increase the number of threads available for the kernel and OVS. I think even MicroShift has 4 threads for the kernel and OVS.
> If I'm misunderstanding anything, please let me know. Current Open vSwitch hardware offload needs more CPU resources to handle the 10k-flow (or higher) test scenarios, because OVS has to offload 10k or more flows to hardware, and the handler and revalidator threads responsible for the offload are now running on the housekeeping CPUs.

OVS does not run on the isolated CPUs, only on the CPUs available for general-purpose processing. Because this test quickly generates upcalls for 40K flows, it overloads the system.

> From the kernel-rt point of view, I would not recommend mixing the OVS threads (handler + revalidator) with the kernel threads: in the kernel-rt CPU isolation strategy, the housekeeping CPUs are only meant to provide the baseline CPU capacity the kernel needs to run.
>
> Is there a configuration parameter for the Open vSwitch handler or revalidator threads so that they can run on one or more isolated CPUs?

There is no manual placement configuration for these threads; the kernel scheduler is responsible for where they run.

> I know that Open vSwitch can adjust the number of threads through n-revalidator-threads and n-handler-threads.
>
> If there are other reasons why the OVS threads have to stay on the housekeeping CPUs, I will change the default isolation strategy.

OVS depends on the kernel scheduler to be placed on a CPU, so when CPUs are isolated the kernel will not schedule these threads on them. I guess you could override this with taskset.
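A minimal sketch of the taskset override mentioned above; the CPU list is only an example, and, as seen later in this report, OVS may spawn new handler threads that do not keep this placement:

# taskset -a -c -p 2,4,6,8 $(pidof ovs-vswitchd)    (move all current ovs-vswitchd threads to CPUs 2,4,6,8)
# taskset -c -p $(pidof ovs-vswitchd)               (verify the resulting affinity)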
Tested on openvswitch2.17-2.17.0-105.el8fdp; the analysis is below. If I keep only the NUMA 0 CPUs (except core 0) as isolated CPUs, and use all the NUMA 1 CPUs plus core 0 of NUMA 0 as housekeeping, there are 29 housekeeping CPUs.

# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-499.rt7.288.el8.x86_64 root=/dev/mapper/rhel_dell--per750--03-root ro ksdevice=bootif pci=realloc resume=/dev/mapper/rhel_dell--per750--03-swap rd.lvm.lv=rhel_dell-per750-03/root rd.lvm.lv=rhel_dell-per750-03/swap console=ttyS0,115200n81 skew_tick=1 isolcpus=managed_irq,domain,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54 intel_pstate=disable nosoftlockup tsc=reliable nohz=on nohz_full=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54 rcu_nocbs=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54 irqaffinity=0,1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55 crashkernel=1G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G rcutree.kthread_prio=51 default_hugepagesz=1G hugepagesz=1G hugepages=48 intel_iommu=on iommu=pt intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
node 0 size: 31257 MB
node 0 free: 951 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55
node 1 size: 31178 MB
node 1 free: 2283 MB
node distances:
node 0 1
0: 10 20
1: 20 10

The OVS handler and revalidator threads then add up to 39 (31 handlers and 8 revalidators). The 8 revalidators follow the default rule: the number of CPU cores divided by four, plus one (29/4 + 1 = 8). For the handlers, the default is the number of online CPU cores minus the number of revalidators: if "online" includes the isolated CPUs the total is 56, which would give 56 - 8 = 48 handlers, while counting only the housekeeping CPUs would give 29 - 8 = 21. So why does OVS end up with 31 handlers? Either way, this allocation causes the handler threads and the revalidator threads to share the same CPUs.
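As a worked example of the default rule described above (integer arithmetic; note that the observed 31 handlers matches neither result, which is exactly the open question):

n-revalidator-threads = cores/4 + 1 -> 29/4 + 1 = 8 (matches the 8 revalidators seen)
n-handler-threads = cores - revalidators -> 29 - 8 = 21 with only the housekeeping CPUs, or 56 - 8 = 48 with all online CPUs (observed: 31)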
This allocation also triggers the dump warnings below:

2023-07-10T15:11:46.699Z|00025|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0
2023-07-10T15:11:50.420Z|00064|ovs_rcu|WARN|blocked 1000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:50.460Z|00001|ovs_rcu(urcu5)|WARN|blocked 1000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:51.420Z|00065|ovs_rcu|WARN|blocked 2000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:51.460Z|00002|ovs_rcu(urcu5)|WARN|blocked 2000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:53.419Z|00066|ovs_rcu|WARN|blocked 4000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:53.460Z|00003|ovs_rcu(urcu5)|WARN|blocked 4000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:57.420Z|00067|ovs_rcu|WARN|blocked 8000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:57.459Z|00004|ovs_rcu(urcu5)|WARN|blocked 8000 ms waiting for revalidator40 to quiesce
2023-07-10T15:11:58.172Z|00004|timeval(revalidator36)|WARN|Unreasonably long 8872ms poll interval (4ms user, 46ms system)
2023-07-10T15:11:58.172Z|00005|timeval(revalidator36)|WARN|context switches: 2052 voluntary, 13 involuntary
2023-07-10T15:11:58.173Z|00050|timeval(revalidator39)|WARN|Unreasonably long 8873ms poll interval (7ms user, 42ms system)
2023-07-10T15:11:58.173Z|00051|timeval(revalidator39)|WARN|context switches: 2056 voluntary, 5 involuntary
2023-07-10T15:11:58.181Z|00004|timeval(revalidator40)|WARN|Unreasonably long 8881ms poll interval (17ms user, 41ms system)
2023-07-10T15:11:58.181Z|00005|timeval(revalidator40)|WARN|context switches: 2078 voluntary, 10 involuntary
2023-07-10T15:11:58.182Z|00050|ofproto_dpif_upcall(revalidator33)|INFO|Spent an unreasonably long 8883ms dumping flows

If I still keep 29 CPUs as housekeeping and configure other_config:n-revalidator-threads=6 and other_config:n-handler-threads=24, OVS still sets the handler count to 31:

2023-07-11T02:07:13.277Z|00049|ofproto_dpif_upcall|INFO|Overriding n-handler-threads to 31, setting n-revalidator-threads to 6
2023-07-11T02:07:13.277Z|00050|ofproto_dpif_upcall|INFO|Starting 37 threads ...
2023-07-11T02:13:10.278Z|00066|ovs_rcu|WARN|blocked 1000 ms waiting for revalidator33 to quiesce
2023-07-11T02:13:10.280Z|00002|ovs_rcu(urcu5)|WARN|blocked 1000 ms waiting for revalidator33 to quiesce
2023-07-11T02:13:11.279Z|00067|ovs_rcu|WARN|blocked 2001 ms waiting for revalidator33 to quiesce
2023-07-11T02:13:11.282Z|00003|ovs_rcu(urcu5)|WARN|blocked 2001 ms waiting for revalidator33 to quiesce
2023-07-11T02:13:13.278Z|00068|ovs_rcu|WARN|blocked 4000 ms waiting for revalidator33 to quiesce
2023-07-11T02:13:13.281Z|00004|ovs_rcu(urcu5)|WARN|blocked 4000 ms waiting for revalidator33 to quiesce
2023-07-11T02:13:16.805Z|00005|timeval(revalidator34)|WARN|Unreasonably long 7577ms poll interval (18ms user, 37ms system)
2023-07-11T02:13:16.805Z|00006|timeval(revalidator34)|WARN|context switches: 392 voluntary, 7 involuntary
2023-07-11T02:13:16.805Z|00050|timeval(revalidator38)|WARN|Unreasonably long 7577ms poll interval (33ms user, 22ms system)
2023-07-11T02:13:16.805Z|00051|timeval(revalidator38)|WARN|context switches: 370 voluntary, 9 involuntary
2023-07-11T02:13:16.805Z|00004|timeval(revalidator35)|WARN|Unreasonably long 7577ms poll interval (21ms user, 34ms system)
2023-07-11T02:13:16.805Z|00005|timeval(revalidator35)|WARN|context switches: 399 voluntary, 3 involuntary
2023-07-11T02:13:16.805Z|00049|timeval(revalidator33)|WARN|Unreasonably long 7579ms poll interval (27ms user, 27ms system)
2023-07-11T02:13:16.805Z|00050|timeval(revalidator33)|WARN|context switches: 407 voluntary, 7 involuntary
2023-07-11T02:13:16.805Z|00004|timeval(revalidator36)|WARN|Unreasonably long 7577ms poll interval (31ms user, 24ms system)
2023-07-11T02:13:16.805Z|00005|timeval(revalidator36)|WARN|context switches: 392 voluntary, 8 involuntary
2023-07-11T02:13:16.810Z|00004|timeval(revalidator37)|WARN|Unreasonably long 7582ms poll interval (37ms user, 35ms system)
2023-07-11T02:13:16.810Z|00005|timeval(revalidator37)|WARN|context switches: 425 voluntary, 5 involuntary
2023-07-11T02:13:16.810Z|00051|ofproto_dpif_upcall(revalidator33)|INFO|Spent an unreasonably long 7584ms dumping flows
2023-07-11T02:13:16.811Z|00020|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 11870 flows.
2023-07-11T02:13:16.811Z|00021|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 11870 flows.
2023-07-11T02:13:16.811Z|00022|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 11870 flows.
2023-07-11T02:13:16.811Z|00023|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 11870 flows.
2023-07-11T02:13:16.811Z|00024|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 11870 flows.
2023-07-11T02:13:16.813Z|00052|ofproto_dpif_upcall(revalidator33)|WARN|Number of datapath flows (36560) twice as high as current dynamic flow limit (23740). Starting to delete flows unconditionally as an emergency measure.
2023-07-11T02:13:16.938Z|00025|dpif_netlink(handler1)|WARN|Dropped 4 log messages in last 9 seconds (most recently, 1 seconds ago) due to excessive rate
2023-07-11T02:13:16.938Z|00026|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0

Then I reallocated the housekeeping CPUs down to 4 (cpu0, cpu1, cpu2 and cpu3 as housekeeping) and tried to assign n-revalidator-threads=1 and n-handler-threads=4, but still no luck.
# pidstat -t -p $(pidof ovs-vswitchd) Linux 4.18.0-499.rt7.288.el8.x86_64 (dell-per750-03.rhts.eng.pek2.redhat.com) 07/10/2023 _x86_64_ (56 CPU) 10:49:41 PM UID TGID TID %usr %system %guest %wait %CPU CPU Command 10:49:41 PM 994 3939 - 6.54 9.84 0.00 0.17 16.38 3 ovs-vswitchd 10:49:41 PM 994 - 3939 0.05 0.07 0.00 0.17 0.12 3 |__ovs-vswitchd 10:49:41 PM 994 - 3944 0.08 0.08 0.00 0.41 0.16 1 |__urcu4 10:49:41 PM 994 - 4766 3.30 4.64 0.00 1.11 7.94 2 |__handler13 10:49:41 PM 994 - 4767 0.00 0.00 0.00 0.00 0.00 2 |__handler14 10:49:41 PM 994 - 4768 0.00 0.00 0.00 0.00 0.00 3 |__handler18 10:49:41 PM 994 - 4769 0.00 0.00 0.00 0.00 0.00 2 |__handler15 10:49:41 PM 994 - 4770 0.00 0.00 0.00 0.00 0.00 0 |__handler16 10:49:41 PM 994 - 4771 3.10 5.03 0.00 0.49 8.13 3 |__revalidator17 2023-07-11T02:43:16.832Z|00064|ofproto_dpif_upcall|INFO|Overriding n-handler-threads to 5, setting n-revalidator-threads to 1 2023-07-11T02:43:16.833Z|00065|ofproto_dpif_upcall|INFO|Starting 6 threads ..... 2023-07-11T02:49:15.036Z|00256|ofproto_dpif_upcall(revalidator17)|INFO|Spent an unreasonably long 3473ms dumping flows 2023-07-11T02:49:15.036Z|00257|timeval(revalidator17)|WARN|Unreasonably long 3473ms poll interval (765ms user, 2333ms system) 2023-07-11T02:49:15.036Z|00258|timeval(revalidator17)|WARN|context switches: 844 voluntary, 2481 involuntary 2023-07-11T02:49:15.036Z|00259|coverage(revalidator17)|INFO|Dropped 12 log messages in last 58 seconds (most recently, 4 seconds ago) due to excessive rate 2023-07-11T02:49:15.037Z|00260|coverage(revalidator17)|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=ff3c5a2e: 2023-07-11T02:49:15.037Z|00261|coverage(revalidator17)|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0064/sec total: 23 2023-07-11T02:49:15.037Z|00262|coverage(revalidator17)|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-07-11T02:49:15.037Z|00263|coverage(revalidator17)|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-07-11T02:49:15.037Z|00264|coverage(revalidator17)|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0006/sec total: 2 2023-07-11T02:49:15.037Z|00265|coverage(revalidator17)|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-07-11T02:49:15.037Z|00266|coverage(revalidator17)|INFO|rev_mac_learning 6.0/sec 24.450/sec 0.6047/sec total: 2178 2023-07-11T02:49:15.037Z|00267|coverage(revalidator17)|INFO|dumped_duplicate_flow 29.4/sec 7.950/sec 0.3242/sec total: 1169 2023-07-11T02:49:15.037Z|00268|coverage(revalidator17)|INFO|handler_duplicate_upcall 0.0/sec 246.567/sec 14.6642/sec total: 52791 2023-07-11T02:49:15.037Z|00269|coverage(revalidator17)|INFO|revalidate_missed_dp_flow 28.4/sec 99.050/sec 4.8175/sec total: 17345 2023-07-11T02:49:15.037Z|00270|coverage(revalidator17)|INFO|upcall_flow_limit_hit 17643.2/sec 36686.800/sec 1366.5842/sec total: 4975609 2023-07-11T02:49:15.037Z|00271|coverage(revalidator17)|INFO|upcall_flow_limit_kill 0.0/sec 23.733/sec 0.4039/sec total: 1454 2023-07-11T02:49:15.037Z|00272|coverage(revalidator17)|INFO|upcall_ukey_replace 1139.2/sec 194.733/sec 4.0975/sec total: 14751 2023-07-11T02:49:15.037Z|00273|coverage(revalidator17)|INFO|xlate_actions 77030.4/sec 87645.817/sec 2787.3842/sec total: 10123329 2023-07-11T02:49:15.037Z|00274|coverage(revalidator17)|INFO|cmap_expand 68.4/sec 39.833/sec 1.5558/sec total: 5601 2023-07-11T02:49:15.037Z|00275|coverage(revalidator17)|INFO|cmap_shrink 0.0/sec 42.383/sec 0.8811/sec total: 3172 2023-07-11T02:49:15.037Z|00276|coverage(revalidator17)|INFO|dpif_flow_flush 
0.0/sec 0.000/sec 0.0003/sec total: 1 2023-07-11T02:49:15.037Z|00277|coverage(revalidator17)|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0067/sec total: 24 2023-07-11T02:49:15.037Z|00278|coverage(revalidator17)|INFO|dpif_flow_put 7192.6/sec 6503.000/sec 176.2228/sec total: 661069 2023-07-11T02:49:15.037Z|00279|coverage(revalidator17)|INFO|dpif_flow_del 3754.6/sec 4278.800/sec 107.4194/sec total: 387803 2023-07-11T02:49:15.037Z|00280|coverage(revalidator17)|INFO|dpif_execute 23175.0/sec 40768.233/sec 1491.0572/sec total: 5423024 2023-07-11T02:49:15.037Z|00281|coverage(revalidator17)|INFO|flow_extract 23175.0/sec 40768.233/sec 1491.0567/sec total: 5423022 2023-07-11T02:49:15.037Z|00282|coverage(revalidator17)|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.0908/sec total: 327 2023-07-11T02:49:15.037Z|00283|coverage(revalidator17)|INFO|hmap_pathological 0.0/sec 0.000/sec 0.0094/sec total: 34 2023-07-11T02:49:15.037Z|00284|coverage(revalidator17)|INFO|hmap_expand 90.4/sec 313.167/sec 9.6281/sec total: 34692 2023-07-11T02:49:15.037Z|00285|coverage(revalidator17)|INFO|mac_learning_learned 2488.8/sec 6419.517/sec 224.7356/sec total: 873384 2023-07-11T02:49:15.037Z|00286|coverage(revalidator17)|INFO|mac_learning_expired 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-07-11T02:49:15.037Z|00287|coverage(revalidator17)|INFO|mac_learning_evicted 2488.8/sec 6419.517/sec 209.0731/sec total: 816999 2023-07-11T02:49:15.037Z|00288|coverage(revalidator17)|INFO|netdev_get_stats 1.2/sec 1.200/sec 0.1683/sec total: 612 2023-07-11T02:49:15.037Z|00289|coverage(revalidator17)|INFO|txn_unchanged 0.0/sec 0.000/sec 0.0297/sec total: 107 2023-07-11T02:49:15.037Z|00290|coverage(revalidator17)|INFO|txn_incomplete 0.2/sec 0.233/sec 0.0153/sec total: 56 2023-07-11T02:49:15.037Z|00291|coverage(revalidator17)|INFO|txn_success 0.2/sec 0.200/sec 0.0111/sec total: 40 2023-07-11T02:49:15.037Z|00292|coverage(revalidator17)|INFO|poll_create_node 5854.0/sec 6309.383/sec 173.3708/sec total: 666896 2023-07-11T02:49:15.037Z|00293|coverage(revalidator17)|INFO|poll_zero_timeout 906.0/sec 1270.933/sec 39.9678/sec total: 155473 2023-07-11T02:49:15.037Z|00294|coverage(revalidator17)|INFO|seq_change 4398.6/sec 4638.800/sec 227.4747/sec total: 838586 2023-07-11T02:49:15.037Z|00295|coverage(revalidator17)|INFO|pstream_open 0.0/sec 0.000/sec 0.0014/sec total: 5 2023-07-11T02:49:15.037Z|00296|coverage(revalidator17)|INFO|stream_open 0.0/sec 0.000/sec 0.0003/sec total: 1 2023-07-11T02:49:15.037Z|00297|coverage(revalidator17)|INFO|unixctl_received 0.0/sec 0.050/sec 0.0047/sec total: 17 2023-07-11T02:49:15.037Z|00298|coverage(revalidator17)|INFO|unixctl_replied 0.0/sec 0.050/sec 0.0047/sec total: 17 2023-07-11T02:49:15.037Z|00299|coverage(revalidator17)|INFO|util_xalloc 147822.0/sec 208055.067/sec 6811.8719/sec total: 25118870 2023-07-11T02:49:15.037Z|00300|coverage(revalidator17)|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0383/sec total: 138 2023-07-11T02:49:15.037Z|00301|coverage(revalidator17)|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0028/sec total: 10 2023-07-11T02:49:15.037Z|00302|coverage(revalidator17)|INFO|netlink_overflow 1.0/sec 2.167/sec 0.1000/sec total: 360 2023-07-11T02:49:15.037Z|00303|coverage(revalidator17)|INFO|netlink_received 42786.2/sec 63483.933/sec 2074.7633/sec total: 7633708 2023-07-11T02:49:15.037Z|00304|coverage(revalidator17)|INFO|netlink_recv_jumbo 3.0/sec 3.950/sec 0.2350/sec total: 851 2023-07-11T02:49:15.037Z|00305|coverage(revalidator17)|INFO|netlink_sent 42308.2/sec 63052.800/sec 2062.8081/sec total: 7590265 
2023-07-11T02:49:15.037Z|00306|coverage(revalidator17)|INFO|nln_changed 0.0/sec 0.000/sec 0.0083/sec total: 30 2023-07-11T02:49:15.037Z|00307|coverage(revalidator17)|INFO|107 events never hit 2023-07-11T02:49:16.439Z|00308|timeval(revalidator17)|WARN|Unreasonably long 1403ms poll interval (270ms user, 967ms system) 2023-07-11T02:49:16.439Z|00309|timeval(revalidator17)|WARN|context switches: 846 voluntary, 939 involuntary 2023-07-11T02:49:17.253Z|00081|dpif_netlink(handler13)|WARN|Dropped 7 log messages in last 11 seconds (most recently, 6 seconds ago) due to excessive rate 2023-07-11T02:49:17.253Z|00082|dpif_netlink(handler13)|WARN|system@ovs-system: lost packet on handler 0 2023-07-11T02:49:18.814Z|00310|poll_loop(revalidator17)|INFO|Dropped 7848 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-07-11T02:49:18.814Z|00311|poll_loop(revalidator17)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:1044 (91% CPU usage) 2023-07-11T02:49:20.673Z|00312|timeval(revalidator17)|WARN|Unreasonably long 1117ms poll interval (268ms user, 750ms system) 2023-07-11T02:49:20.673Z|00313|timeval(revalidator17)|WARN|context switches: 297 voluntary, 757 involuntary 2023-07-11T02:49:21.706Z|00314|timeval(revalidator17)|WARN|Unreasonably long 1033ms poll interval (252ms user, 652ms system) 2023-07-11T02:49:21.706Z|00315|timeval(revalidator17)|WARN|context switches: 276 voluntary, 741 involuntary 2023-07-11T02:49:24.315Z|00316|poll_loop(revalidator17)|INFO|Dropped 16686 log messages in last 5 seconds (most recently, 0 seconds ago) due to excessive rate 2023-07-11T02:49:24.315Z|00317|poll_loop(revalidator17)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:1044 (91% CPU usage) 2023-07-11T02:49:28.496Z|00318|ofproto_dpif_upcall(revalidator17)|INFO|Spent an unreasonably long 3323ms dumping flows 2023-07-11T02:49:28.496Z|00319|timeval(revalidator17)|WARN|Unreasonably long 3323ms poll interval (671ms user, 2281ms system) 2023-07-11T02:49:28.496Z|00320|timeval(revalidator17)|WARN|context switches: 770 voluntary, 2059 involuntary 2023-07-11T02:49:30.202Z|00321|timeval(revalidator17)|WARN|Unreasonably long 1706ms poll interval (365ms user, 1093ms system) 2023-07-11T02:49:30.202Z|00322|timeval(revalidator17)|WARN|context switches: 539 voluntary, 1178 involuntary 2023-07-11T02:49:30.296Z|00083|poll_loop(handler13)|INFO|Dropped 33895 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-07-11T02:49:30.296Z|00084|poll_loop(handler13)|INFO|wakeup due to [POLLIN] on fd 27 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (56% CPU usage) 2023-07-11T02:49:31.728Z|00085|dpif_netlink(handler13)|WARN|Dropped 4 log messages in last 13 seconds (most recently, 6 seconds ago) due to excessive rate 2023-07-11T02:49:31.728Z|00086|dpif_netlink(handler13)|WARN|system@ovs-system: lost packet on handler 0 2023-07-11T02:49:36.297Z|00087|poll_loop(handler13)|INFO|Dropped 13725 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate 2023-07-11T02:49:36.297Z|00088|poll_loop(handler13)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:816 (87% CPU usage)
With openvswitch3.1-3.1.0-39.el8fdp I can also see multiple handlers on the housekeeping CPUs. Currently cpu0,1,2,3 are the housekeeping CPUs (as configured above), and I set n-revalidator-threads to 1. Here is the OVS other_config:

# ovs-vsctl list Open_vSwitch .
_uuid : 3136b9a9-8ba9-4e41-9100-fa4357d1f619
bridges : [736b9ace-b662-4ed1-92b3-db3e9ec534c8, 884bc551-07b3-4505-bf53-16c5c2b6ae1a]
cur_cfg : 40
datapath_types : [netdev, system]
datapaths : {}
db_version : "8.3.1"
dpdk_initialized : false
dpdk_version : "DPDK 22.11.1"
external_ids : {hostname=dell-per750-03.rhts.eng.pek2.redhat.com, rundir="/var/run/openvswitch", system-id="fd5a0088-c85a-4746-9940-5c29a820020b"}
iface_types : [bareudp, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options : []
next_cfg : 40
other_config : {hw-offload="true", max-idle="3600000", max-revalidator="3600000", n-revalidator-threads="1", tc-policy=none}
ovs_version : "3.1.2"
ssl : []
statistics : {}
system_type : rhel
system_version : "8.9"

I can see OVS start 5 handlers, and most of the handlers are hosted on cpu1:

2023-07-11T06:48:15.005Z|00049|ofproto_dpif_upcall|INFO|Overriding n-handler-threads to 5, setting n-revalidator-threads to 1
2023-07-11T06:48:15.005Z|00050|ofproto_dpif_upcall|INFO|Starting 6 threads

# pidstat -t -p $(pidof ovs-vswitchd)
Linux 4.18.0-499.rt7.288.el8.x86_64 (dell-per750-03.rhts.eng.pek2.redhat.com) 07/11/2023 _x86_64_ (56 CPU)
02:48:31 AM UID TGID TID %usr %system %guest %wait %CPU CPU Command
02:48:31 AM 994 10030 - 0.00 0.00 0.00 0.00 0.00 2 ovs-vswitchd
02:48:31 AM 994 - 10030 0.00 0.00 0.00 0.00 0.00 2 |__ovs-vswitchd
02:48:31 AM 994 - 10032 0.00 0.00 0.00 0.00 0.00 1 |__handler1
02:48:31 AM 994 - 10033 0.00 0.00 0.00 0.00 0.00 1 |__handler2
02:48:31 AM 994 - 10034 0.00 0.00 0.00 0.00 0.00 1 |__handler3
02:48:31 AM 994 - 10035 0.00 0.00 0.00 0.00 0.00 2 |__handler4
02:48:31 AM 994 - 10036 0.00 0.00 0.00 0.00 0.00 3 |__urcu5
02:48:31 AM 994 - 10037 0.00 0.00 0.00 0.00 0.00 3 |__handler6
02:48:31 AM 994 - 10038 0.00 0.00 0.00 0.00 0.00 3 |__revalidator7

2023-07-11T03:31:02.749Z|00063|memory|INFO|handlers:5 idl-cells-Open_vSwitch:403 ports:6 revalidators:1 rules:10 udpif keys:2
2023-07-11T06:39:35.532Z|00001|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0
2023-07-11T06:39:36.478Z|00002|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0
2023-07-11T06:39:36.990Z|00003|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0
2023-07-11T06:39:37.448Z|00004|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0
2023-07-11T06:39:37.965Z|00005|dpif_netlink(handler1)|WARN|system@ovs-system: lost packet on handler 0
2023-07-11T06:39:38.078Z|00006|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage)
2023-07-11T06:39:38.082Z|00007|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage)
2023-07-11T06:39:38.089Z|00008|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage)
2023-07-11T06:39:38.100Z|00009|poll_loop(handler1)|INFO|wakeup due to 1-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage)
2023-07-11T06:39:38.117Z|00010|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage)
2023-07-11T06:39:38.122Z|00011|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage)
2023-07-11T06:39:38.128Z|00012|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage) 2023-07-11T06:39:38.135Z|00013|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage) 2023-07-11T06:39:38.140Z|00014|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage) 2023-07-11T06:39:38.144Z|00015|poll_loop(handler1)|INFO|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:817 (87% CPU usage) 2023-07-11T06:39:39.790Z|00016|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 48192 flows. 2023-07-11T06:39:39.790Z|00017|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 48192 flows. 2023-07-11T06:39:39.790Z|00018|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 48192 flows. 2023-07-11T06:39:39.790Z|00019|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 48192 flows. 2023-07-11T06:39:39.790Z|00020|ofproto_dpif_upcall(handler1)|WARN|upcall: datapath reached the dynamic limit of 48192 flows. 2023-07-11T06:39:40.692Z|00001|ofproto_dpif_upcall(revalidator6)|INFO|Spent an unreasonably long 2228ms dumping flows 2023-07-11T06:39:40.692Z|00002|timeval(revalidator6)|WARN|Unreasonably long 2228ms poll interval (498ms user, 1287ms system) 2023-07-11T06:39:40.692Z|00003|timeval(revalidator6)|WARN|faults: 2578 minor, 0 major 2023-07-11T06:39:40.692Z|00004|timeval(revalidator6)|WARN|context switches: 529 voluntary, 2149 involuntary 2023-07-11T06:39:40.692Z|00005|coverage(revalidator6)|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=ee7e0b89: 2023-07-11T06:39:40.692Z|00006|coverage(revalidator6)|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0000/sec total: 17 2023-07-11T06:39:40.692Z|00007|coverage(revalidator6)|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0000/sec total: 2 2023-07-11T06:39:40.692Z|00008|coverage(revalidator6)|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0000/sec total: 11 2023-07-11T06:39:40.692Z|00009|coverage(revalidator6)|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0000/sec total: 2 2023-07-11T06:39:40.692Z|00010|coverage(revalidator6)|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0000/sec total: 1 2023-07-11T06:39:40.692Z|00011|coverage(revalidator6)|INFO|rev_mac_learning 0.0/sec 0.017/sec 0.0003/sec total: 6 2023-07-11T06:39:40.692Z|00012|coverage(revalidator6)|INFO|dumped_duplicate_flow 0.0/sec 0.000/sec 0.0000/sec total: 230 2023-07-11T06:39:40.692Z|00013|coverage(revalidator6)|INFO|handler_duplicate_upcall 1396.2/sec 118.417/sec 1.9736/sec total: 9429 2023-07-11T06:39:40.692Z|00014|coverage(revalidator6)|INFO|upcall_flow_limit_hit 0.0/sec 0.000/sec 0.0000/sec total: 9830 2023-07-11T06:39:40.692Z|00015|coverage(revalidator6)|INFO|upcall_ukey_replace 6.2/sec 0.517/sec 0.0086/sec total: 31 2023-07-11T06:39:40.692Z|00016|coverage(revalidator6)|INFO|xlate_actions 8633.6/sec 721.717/sec 12.0286/sec total: 111399 2023-07-11T06:39:40.692Z|00017|coverage(revalidator6)|INFO|cmap_expand 473.2/sec 39.550/sec 0.6592/sec total: 2742 2023-07-11T06:39:40.692Z|00018|coverage(revalidator6)|INFO|cmap_shrink 0.0/sec 0.050/sec 0.0008/sec total: 159 2023-07-11T06:39:40.692Z|00019|coverage(revalidator6)|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0000/sec total: 1 2023-07-11T06:39:40.692Z|00020|coverage(revalidator6)|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0000/sec total: 24 
2023-07-11T06:39:40.692Z|00021|coverage(revalidator6)|INFO|dpif_flow_put 7237.4/sec 603.267/sec 10.0544/sec total: 58612 2023-07-11T06:39:40.692Z|00022|coverage(revalidator6)|INFO|dpif_flow_del 0.0/sec 0.000/sec 0.0000/sec total: 4698 2023-07-11T06:39:40.692Z|00023|coverage(revalidator6)|INFO|dpif_execute 8633.6/sec 721.650/sec 12.0275/sec total: 69206 2023-07-11T06:39:40.692Z|00024|coverage(revalidator6)|INFO|flow_extract 8633.6/sec 721.650/sec 12.0275/sec total: 69204 2023-07-11T06:39:40.692Z|00025|coverage(revalidator6)|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.0000/sec total: 361 2023-07-11T06:39:40.692Z|00026|coverage(revalidator6)|INFO|hmap_pathological 6.8/sec 0.567/sec 0.0094/sec total: 36 2023-07-11T06:39:40.692Z|00027|coverage(revalidator6)|INFO|hmap_expand 13.4/sec 5.467/sec 1.3828/sec total: 17347 2023-07-11T06:39:40.692Z|00028|coverage(revalidator6)|INFO|mac_learning_learned 7229.0/sec 602.483/sec 10.0414/sec total: 54242 2023-07-11T06:39:40.692Z|00029|coverage(revalidator6)|INFO|mac_learning_expired 0.0/sec 0.000/sec 0.0000/sec total: 3 2023-07-11T06:39:40.692Z|00030|coverage(revalidator6)|INFO|netdev_get_stats 1.2/sec 1.200/sec 1.1917/sec total: 13590 2023-07-11T06:39:40.692Z|00031|coverage(revalidator6)|INFO|txn_unchanged 0.0/sec 0.100/sec 0.1969/sec total: 2290 2023-07-11T06:39:40.692Z|00032|coverage(revalidator6)|INFO|txn_incomplete 0.2/sec 0.117/sec 0.0019/sec total: 22 2023-07-11T06:39:40.692Z|00033|coverage(revalidator6)|INFO|txn_success 0.0/sec 0.083/sec 0.0014/sec total: 18 2023-07-11T06:39:40.692Z|00034|coverage(revalidator6)|INFO|poll_create_node 427.8/sec 52.533/sec 8.6267/sec total: 100040 2023-07-11T06:39:40.692Z|00035|coverage(revalidator6)|INFO|poll_zero_timeout 137.0/sec 11.533/sec 0.1922/sec total: 1130 2023-07-11T06:39:40.692Z|00036|coverage(revalidator6)|INFO|seq_change 352.6/sec 38.850/sec 1.0064/sec total: 14204 2023-07-11T06:39:40.692Z|00037|coverage(revalidator6)|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 5 2023-07-11T06:39:40.692Z|00038|coverage(revalidator6)|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1 2023-07-11T06:39:40.692Z|00039|coverage(revalidator6)|INFO|unixctl_received 0.0/sec 0.067/sec 0.0011/sec total: 5 2023-07-11T06:39:40.692Z|00040|coverage(revalidator6)|INFO|unixctl_replied 0.0/sec 0.067/sec 0.0011/sec total: 5 2023-07-11T06:39:40.692Z|00041|coverage(revalidator6)|INFO|util_xalloc 63928.6/sec 5466.950/sec 147.7958/sec total: 1368392 2023-07-11T06:39:40.692Z|00042|coverage(revalidator6)|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0000/sec total: 102 2023-07-11T06:39:40.692Z|00043|coverage(revalidator6)|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0000/sec total: 11 2023-07-11T06:39:40.692Z|00044|coverage(revalidator6)|INFO|netlink_overflow 1.0/sec 0.083/sec 0.0014/sec total: 8 2023-07-11T06:39:40.692Z|00045|coverage(revalidator6)|INFO|netlink_received 15889.8/sec 1329.050/sec 24.1461/sec total: 188091 2023-07-11T06:39:40.692Z|00046|coverage(revalidator6)|INFO|netlink_recv_jumbo 0.8/sec 0.800/sec 0.7944/sec total: 9207 2023-07-11T06:39:40.692Z|00047|coverage(revalidator6)|INFO|netlink_sent 15884.2/sec 1328.350/sec 24.1144/sec total: 187219 2023-07-11T06:39:40.692Z|00048|coverage(revalidator6)|INFO|nln_changed 0.0/sec 0.000/sec 0.0000/sec total: 31 2023-07-11T06:39:40.692Z|00049|coverage(revalidator6)|INFO|108 events never hit 2023-07-11T06:39:40.696Z|00050|ofproto_dpif_upcall(revalidator6)|WARN|Number of datapath flows (48608) twice as high as current dynamic flow limit (48192). 
Starting to delete flows unconditionally as an emergency measure. 2023-07-11T06:39:41.739Z|00051|timeval(revalidator6)|WARN|Unreasonably long 1046ms poll interval (336ms user, 619ms system) 2023-07-11T06:39:41.739Z|00052|timeval(revalidator6)|WARN|faults: 158 minor, 0 major 2023-07-11T06:39:41.739Z|00053|timeval(revalidator6)|WARN|context switches: 234 voluntary, 559 involuntary
Based on comment 6, I changed n-revalidator-threads and n-handler-threads with ovs-vsctl, but only n-revalidator-threads takes effect.

> If I still keep 29 CPUs as housekeeping and configure other_config:n-revalidator-threads=6 and other_config:n-handler-threads=24, OVS still sets the handler count to 31:
> 2023-07-11T02:07:13.277Z|00049|ofproto_dpif_upcall|INFO|Overriding n-handler-threads to 31, setting n-revalidator-threads to 6
> 2023-07-11T02:07:13.277Z|00050|ofproto_dpif_upcall|INFO|Starting 37 threads

Based on comment 7, if I only change n-revalidator-threads to 1, the handler threads still do not keep a 1:1 ratio with the housekeeping CPUs. I can see OVS start 5 handlers, and most of the handlers are hosted on cpu1:

> 2023-07-11T06:48:15.005Z|00049|ofproto_dpif_upcall|INFO|Overriding n-handler-threads to 5, setting n-revalidator-threads to 1
> 2023-07-11T06:48:15.005Z|00050|ofproto_dpif_upcall|INFO|Starting 6 threads
> # pidstat -t -p $(pidof ovs-vswitchd)
> Linux 4.18.0-499.rt7.288.el8.x86_64 (dell-per750-03.rhts.eng.pek2.redhat.com) 07/11/2023 _x86_64_ (56 CPU)
> 02:48:31 AM UID TGID TID %usr %system %guest %wait %CPU CPU Command
> 02:48:31 AM 994 10030 - 0.00 0.00 0.00 0.00 0.00 2 ovs-vswitchd
> 02:48:31 AM 994 - 10030 0.00 0.00 0.00 0.00 0.00 2 |__ovs-vswitchd
> 02:48:31 AM 994 - 10032 0.00 0.00 0.00 0.00 0.00 1 |__handler1
> 02:48:31 AM 994 - 10033 0.00 0.00 0.00 0.00 0.00 1 |__handler2
> 02:48:31 AM 994 - 10034 0.00 0.00 0.00 0.00 0.00 1 |__handler3
> 02:48:31 AM 994 - 10035 0.00 0.00 0.00 0.00 0.00 2 |__handler4
> 02:48:31 AM 994 - 10036 0.00 0.00 0.00 0.00 0.00 3 |__urcu5
> 02:48:31 AM 994 - 10037 0.00 0.00 0.00 0.00 0.00 3 |__handler6
> 02:48:31 AM 994 - 10038 0.00 0.00 0.00 0.00 0.00 3 |__revalidator7

If I use taskset to pin the OVS threads to isolated CPUs, OVS spawns new handler threads:
# pidstat -t -p $(pidof ovs-vswitchd)
Linux 4.18.0-499.rt7.288.el8.x86_64 (dell-per750-03.rhts.eng.pek2.redhat.com) 07/11/2023 _x86_64_ (56 CPU)
03:03:55 AM UID TGID TID %usr %system %guest %wait %CPU CPU Command
03:03:55 AM 994 10805 - 0.00 0.00 0.00 0.00 0.00 1 ovs-vswitchd
03:03:55 AM 994 - 10805 0.00 0.00 0.00 0.00 0.00 1 |__ovs-vswitchd
03:03:55 AM 994 - 10807 0.00 0.00 0.00 0.00 0.00 2 |__handler4
03:03:55 AM 994 - 10808 0.00 0.00 0.00 0.00 0.00 2 |__handler2
03:03:55 AM 994 - 10809 0.00 0.00 0.00 0.00 0.00 1 |__handler1
03:03:55 AM 994 - 10810 0.00 0.00 0.00 0.00 0.00 1 |__handler3
03:03:55 AM 994 - 10811 0.00 0.00 0.00 0.00 0.00 2 |__handler5
03:03:55 AM 994 - 10812 0.00 0.00 0.00 0.00 0.00 3 |__revalidator6
03:03:55 AM 994 - 10813 0.00 0.00 0.00 0.00 0.00 3 |__urcu7

# taskset -a -p 3,5,7,9,11,13 10805

After the taskset, the handler threads increase to 17:

2023-07-11T07:03:48.979Z|00061|bridge|INFO|ovs-vswitchd (Open vSwitch) 3.1.2
2023-07-11T07:03:58.989Z|00062|memory|INFO|174208 kB peak resident set size after 10.3 seconds
2023-07-11T07:03:58.989Z|00063|memory|INFO|handlers:5 idl-cells-Open_vSwitch:403 ports:6 revalidators:1 rules:10
2023-07-11T07:04:49.034Z|00064|ofproto_dpif_upcall|INFO|Overriding n-handler-threads to 17, setting n-revalidator-threads to 1

# pidstat -t -p $(pidof ovs-vswitchd)
Linux 4.18.0-499.rt7.288.el8.x86_64 (dell-per750-03.rhts.eng.pek2.redhat.com) 07/11/2023 _x86_64_ (56 CPU)
03:04:59 AM UID TGID TID %usr %system %guest %wait %CPU CPU Command
03:04:59 AM 994 10805 - 0.00 0.00 0.00 0.00 0.00 1 ovs-vswitchd
03:04:59 AM 994 - 10805 0.00 0.00 0.00 0.00 0.00 1 |__ovs-vswitchd
03:04:59 AM 994 - 10813 0.00 0.00 0.00 0.00 0.00 3 |__urcu7
03:04:59 AM 994 - 10847 0.00 0.00 0.00 0.00 0.00 0 |__handler8
03:04:59 AM 994 - 10848 0.00 0.00 0.00 0.00 0.00 0 |__handler9
03:04:59 AM 994 - 10849 0.00 0.00 0.00 0.00 0.00 0 |__handler10
03:04:59 AM 994 - 10850 0.00 0.00 0.00 0.00 0.00 0 |__handler11
03:04:59 AM 994 - 10851 0.00 0.00 0.00 0.00 0.00 0 |__handler12
03:04:59 AM 994 - 10852 0.00 0.00 0.00 0.00 0.00 0 |__handler13
03:04:59 AM 994 - 10853 0.00 0.00 0.00 0.00 0.00 0 |__handler14
03:04:59 AM 994 - 10854 0.00 0.00 0.00 0.00 0.00 1 |__handler15
03:04:59 AM 994 - 10855 0.00 0.00 0.00 0.00 0.00 1 |__handler16
03:04:59 AM 994 - 10856 0.00 0.00 0.00 0.00 0.00 0 |__handler17
03:04:59 AM 994 - 10857 0.00 0.00 0.00 0.00 0.00 1 |__handler18
03:04:59 AM 994 - 10858 0.00 0.00 0.00 0.00 0.00 1 |__handler22
03:04:59 AM 994 - 10859 0.00 0.00 0.00 0.00 0.00 1 |__handler19
03:04:59 AM 994 - 10860 0.00 0.00 0.00 0.00 0.00 1 |__handler21
03:04:59 AM 994 - 10861 0.00 0.00 0.00 0.00 0.00 0 |__handler20
03:04:59 AM 994 - 10862 0.00 0.00 0.00 0.00 0.00 0 |__handler23
03:04:59 AM 994 - 10863 0.00 0.00 0.00 0.00 0.00 0 |__handler24
03:04:59 AM 994 - 10864 0.00 0.00 0.00 0.00 0.00 1 |__revalidator25
So this problem happens when something in the testbed is not configured correctly and traffic is initially flooded. When the test is failing I see this kind of datapath entries:

recirc_id(0),in_port(2),eth(src=00:00:00:00:19:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:25:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:20:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:27:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:23:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:21:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:1f:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:28:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3
recirc_id(0),in_port(2),eth(src=00:00:00:00:26:01,dst=20:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:1,3

The problem does not happen with the following QA tool settings:

./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=1 --max-loss-pct=0.002 --search-runtime=10 \
--validation-runtime=60 --rate-tolerance=10 --runtime-tolerance=10 --rate=25 --rate-unit=% \
--duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 \
--rate=0.1 --rate-unit=mpps --one-shot=1 --src-macs=00:00:00:00:00:01,00:00:00:00:00:02 \
--dst-macs=10:00:00:00:00:00,20:00:00:00:00:00 --use-device-stats

./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=10000 --max-loss-pct=0.002 \
--search-runtime=10 --validation-runtime=60 --rate-tolerance=10 --runtime-tolerance=10 \
--duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 \
--rate=25 --rate-unit=% --one-shot=0 --src-macs=00:00:00:00:00:01,00:00:00:00:00:02 \
--dst-macs=00:00:00:00:00:02,00:00:00:00:00:01 --use-src-ip-flows=0 --use-dst-ip-flows=0 --use-src-mac-flows=1 \
--use-dst-mac-flows=0 --warmup-trial --warmup-trial-runtime=30 --max-retries=10 --sniff-runtime=10 --warmup-trial \
--warmup-trial-runtime=10 --send-teaching-measurement --send-teaching-warmup --teaching-warmup-packet-type=generic \
--teaching-warmup-packet-rate=10000 --teaching-measurement-packet-type=generic --use-device-stats

But it does happen with the following:

./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=1 --max-loss-pct=0.002 --search-runtime=10 --validation-runtime=10 --rate-tolerance=10 --runtime-tolerance=10 --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 --rate=0.1 --rate-unit=mpps --one-shot=1 --src-macs=00:00:00:00:00:01,00:00:00:00:00:02 --dst-macs=00:00:00:00:00:02,00:00:00:00:00:01 --use-device-stats

./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=10000 --max-loss-pct=0.002 --search-runtime=10 --validation-runtime=10 --rate-tolerance=10 --runtime-tolerance=10 --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 --rate=25 --rate-unit=% --one-shot=0 --src-macs=10:00:00:00:00:00,20:00:00:00:00:00 --dst-macs=00:00:00:00:00:02,00:00:00:00:00:01 --use-src-ip-flows=0 --use-dst-ip-flows=0 --use-src-mac-flows=1 --use-dst-mac-flows=0 --warmup-trial --warmup-trial-runtime=30 --sniff-runtime=10 --warmup-trial --warmup-trial-runtime=10 --send-teaching-measurement --send-teaching-warmup --teaching-warmup-packet-type=generic --max-retries=10 --teaching-warmup-packet-rate=10000 --teaching-measurement-packet-type=generic --use-device-stats

I'll let QA figure out how to ensure the proper FDB entries exist so that no traffic gets flooded.
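One way to verify this before the measurement run (just a sketch; the bridge name follows the setup earlier in this report) is to check that the traffic generator's destination MACs have been learned, so the datapath entries stop showing flood actions:

# ovs-appctl fdb/show ovsbr0                          (the destination MACs used by the generator should be listed)
# ovs-appctl dpctl/dump-flows | grep 'actions:1,3'    (entries flooding to both output ports indicate unlearned destinations)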
Following Eelco's suggestions, the performance test didn't hit this issue.