Description of problem:
[ 3618.220187] mlx5_core 0000:3b:00.2: enabling device (0000 -> 0002)
[ 3618.226534] mlx5_core 0000:3b:00.2: firmware version: 16.31.1014
[ 3618.419782] mlx5_core 0000:3b:00.2: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 3618.439904] mlx5_core 0000:3b:00.2: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 3618.560072] mlx5_core 0000:3b:00.2: Supported tc offload range - chains: 1, prios: 16
[ 3618.567921] mlx5_core 0000:3b:00.2: mlx5_tc_ct_init:2146:(pid 78892): tc ct offload not supported, firmware level support is missing
[ 3618.588194] mlx5_core 0000:3b:00.2 enp59s0f0v0: renamed from eth2
[ 3618.712584] mlx5_core 0000:3b:00.2 enp59s0f0v0: Link up
[ 3618.832076] device eth0 left promiscuous mode
[ 3618.836677] device enp59s0f0np0 left promiscuous mode
[ 3618.841801] device ovsbr0 left promiscuous mode
[ 3618.848475] IPv6: ADDRCONF(NETDEV_CHANGE): enp59s0f0v0: link becomes ready
[ 3618.877200] device ovs-system left promiscuous mode
[ 3620.491395] pci 0000:3b:00.2: Removing from iommu group 150
[ 3620.497130] pci 0000:3b:00.3: Removing from iommu group 151
[ 3621.545238] mlx5_core 0000:3b:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
[ 3622.165863] mlx5_core 0000:3b:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 3622.370685] mlx5_core 0000:3b:00.0: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 3622.837668] mlx5_core 0000:3b:00.0 enp59s0f0np0: Link up
[ 3638.612107] mlx5_core 0000:3b:00.0: E-Switch: Enable: mode(LEGACY), nvfs(2), active vports(3)
[ 3638.727549] pci 0000:3b:00.2: [15b3:1018] type 00 class 0x020000
[ 3638.733641] pci 0000:3b:00.2: enabling Extended Tags
[ 3638.739776] pci 0000:3b:00.2: Adding to iommu group 150
[ 3638.746033] mlx5_core 0000:3b:00.2: enabling device (0000 -> 0002)
[ 3638.752362] mlx5_core 0000:3b:00.2: firmware version: 16.31.1014
[ 3638.946724] mlx5_core 0000:3b:00.2: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 3638.966916] mlx5_core 0000:3b:00.2: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 3639.129545] mlx5_core 0000:3b:00.2: Supported tc offload range - chains: 1, prios: 16
[ 3639.137389] mlx5_core 0000:3b:00.2: mlx5_tc_ct_init:2146:(pid 78892): tc ct offload not supported, firmware level support is missing
[ 3639.157274] mlx5_core 0000:3b:00.2 enp59s0f0v0: renamed from eth0
[ 3639.190393] pci 0000:3b:00.3: [15b3:1018] type 00 class 0x020000
[ 3639.196500] pci 0000:3b:00.3: enabling Extended Tags
[ 3639.202654] pci 0000:3b:00.3: Adding to iommu group 151
[ 3639.208455] mlx5_core 0000:3b:00.3: enabling device (0000 -> 0002)
[ 3639.214787] mlx5_core 0000:3b:00.3: firmware version: 16.31.1014
[ 3639.293321] mlx5_core 0000:3b:00.2 enp59s0f0v0: Link up
[ 3639.300644] IPv6: ADDRCONF(NETDEV_CHANGE): enp59s0f0v0: link becomes ready
[ 3639.420981] mlx5_core 0000:3b:00.3: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 3639.441578] mlx5_core 0000:3b:00.3: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 3639.605954] mlx5_core 0000:3b:00.3: Supported tc offload range - chains: 1, prios: 16
[ 3639.613799] mlx5_core 0000:3b:00.3: mlx5_tc_ct_init:2146:(pid 78892): tc ct offload not supported, firmware level support is missing
[ 3639.631930] mlx5_core 0000:3b:00.3 enp59s0f0v1: renamed from eth0
[ 3639.764667] mlx5_core 0000:3b:00.3 enp59s0f0v1: Link up
[ 3640.359259] IPv6: ADDRCONF(NETDEV_CHANGE): enp59s0f0v1: link becomes ready
[ 3641.786646] mlx5_core 0000:3b:00.0: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
[ 3643.328980] mlx5_core 0000:3b:00.0: E-Switch: Supported tc chains and prios offload
[ 3643.336660] mlx5_core 0000:3b:00.0: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 3643.751797] mlx5_core 0000:3b:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 3652.014707] restraintd[3965]: *** Current Time: Thu Oct 28 10:09:52 2021 Localwatchdog at: Fri Oct 29 09:10:51 2021
[-- MARK -- Thu Oct 28 14:10:00 2021]
[-- MARK -- Thu Oct 28 14:10:01 2021]
[ 3712.014248] restraintd[3965]: *** Current Time: Thu Oct 28 10:10:52 2021 Localwatchdog at: Fri Oct 29 09:10:51 2021
[ 3772.014427] restraintd[3965]: *** Current Time: Thu Oct 28 10:11:52 2021 Localwatchdog at: Fri Oct 29 09:10:51 2021
[ 3811.928855] INFO: task kworker/u96:4:38872 blocked for more than 122 seconds.
[ 3811.936000] Tainted: G I --------- --- 5.14.0-1.6.1.el9.x86_64 #1
[ 3811.943918] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3811.951746] task:kworker/u96:4 state:D stack: 0 pid:38872 ppid: 2 flags:0x00004000
[ 3811.960098] Workqueue: netns cleanup_net
[ 3811.964030] Call Trace:
[ 3811.966485] __schedule+0x206/0x550
[ 3811.969986] schedule+0x3c/0xa0
[ 3811.973139] schedule_preempt_disabled+0xa/0x10
[ 3811.977681] __mutex_lock.constprop.0+0x295/0x450
[ 3811.982394] ? idr_for_each+0x95/0xd0
[ 3811.986069] devlink_pernet_pre_exit+0x2a/0xc0
[ 3811.990525] cleanup_net+0x1d2/0x370
[ 3811.994111] process_one_work+0x1e3/0x380
[ 3811.998131] worker_thread+0x53/0x3d0
[ 3812.001796] ? process_one_work+0x380/0x380
[ 3812.005999] kthread+0x10c/0x130
[ 3812.009233] ? set_kthread_struct+0x40/0x40
[ 3812.013417] ret_from_fork+0x1f/0x30
[ 3812.017014] INFO: task devlink:90062 blocked for more than 122 seconds.
[ 3812.023626] Tainted: G I --------- --- 5.14.0-1.6.1.el9.x86_64 #1
[ 3812.031536] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3812.039363] task:devlink state:D stack: 0 pid:90062 ppid: 16460 flags:0x00004000
[ 3812.047707] Call Trace:
[ 3812.050160] __schedule+0x206/0x550
[ 3812.053654] schedule+0x3c/0xa0
[ 3812.056809] rwsem_down_write_slowpath+0x224/0x470
[ 3812.061610] register_netdevice_notifier+0x1c/0x110
[ 3812.066505] mlx5e_rep_bridge_init+0x111/0x130 [mlx5_core]
[ 3812.072052] mlx5e_uplink_rep_enable+0xd4/0x140 [mlx5_core]
[ 3812.077668] mlx5e_attach_netdev+0x9e/0x140 [mlx5_core]
[ 3812.082927] ? mlx5e_init_ul_rep+0x3e/0x50 [mlx5_core]
[ 3812.088100] mlx5e_netdev_attach_profile+0x93/0xb0 [mlx5_core]
[ 3812.093967] mlx5e_netdev_change_profile+0xa0/0xc0 [mlx5_core]
[ 3812.099835] mlx5e_vport_rep_load+0xa0/0xf0 [mlx5_core]
[ 3812.105095] mlx5_esw_offloads_rep_load+0x86/0xe0 [mlx5_core]
[ 3812.110884] esw_offloads_enable+0x266/0x370 [mlx5_core]
[ 3812.116229] mlx5_eswitch_enable_locked.part.0+0x100/0x310 [mlx5_core]
[ 3812.122792] esw_offloads_start+0x44/0x1f0 [mlx5_core]
[ 3812.127972] ? __nla_validate_parse+0x136/0x180
[ 3812.132504] mlx5_devlink_eswitch_mode_set+0x102/0x180 [mlx5_core]
[ 3812.138718] devlink_nl_cmd_eswitch_set_doit+0xc1/0x150
[ 3812.143952] genl_family_rcv_msg_doit+0xe7/0x150
[ 3812.148574] genl_rcv_msg+0xdc/0x1e0
[ 3812.152160] ? __devlink_port_phys_port_name_get+0x1e0/0x1e0
[ 3812.157817] ? genl_get_cmd+0xd0/0xd0
[ 3812.161483] netlink_rcv_skb+0x4e/0xf0
[ 3812.165236] genl_rcv+0x24/0x40
[ 3812.168381] netlink_unicast+0x1f6/0x2c0
[ 3812.172307] netlink_sendmsg+0x23b/0x480
[ 3812.176231] sock_sendmsg+0x5b/0x60
[ 3812.179726] __sys_sendto+0xf0/0x160
[ 3812.183305] ? handle_mm_fault+0xba/0x280
[ 3812.187324] ? do_user_addr_fault+0x1c7/0x660
[ 3812.191683] __x64_sys_sendto+0x20/0x30
[ 3812.195524] do_syscall_64+0x38/0x90
[ 3812.199101] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3812.204153] RIP: 0033:0x7f718733059a
[ 3812.207734] RSP: 002b:00007ffdef8570b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 3812.215297] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f718733059a
[ 3812.222431] RDX: 0000000000000038 RSI: 000055eedd7ff440 RDI: 0000000000000003
[ 3812.229563] RBP: 0000000000000000 R08: 00007f7187435200 R09: 000000000000000c
[ 3812.236694] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 3812.243828] R13: 000055eedd7ff2a0 R14: 000055eedc986d5c R15: 000055eedd7ff440
[ 3832.014278] restraintd[3965]: *** Current Time: Thu Oct 28 10:12:52 2021 Localwatchdog at: Fri Oct 29 09:10:51 2021
[ 3892.014629] restraintd[3965]: *** Current Time: Thu Oct 28 10:13:52 2021 Localwatchdog at: Fri Oct 29 09:10:51 2021

beaker job: https://beaker.engineering.redhat.com/jobs/5950116
distro: RHEL-9.0.0-20211020.4
kernel-5.14.0-1.6.1.el9.x86_64
openvswitch2.15-2.15.0-20.el9fdp.x86_64

Additional info:
Created attachment 1838183: console.log
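For anyone triaging a console log like the one attached, the hung-task reports can be pulled out with a short script. This is only a minimal sketch: the function name find_hung_tasks is hypothetical, and it assumes the "INFO: task <comm>:<pid> blocked for more than N seconds" header format seen in the log above, with task names that contain no spaces.

```python
import re

# Matches the kernel hung-task header, for example:
# "INFO: task devlink:90062 blocked for more than 122 seconds."
# Assumption: the task name (comm) has no whitespace, so \S+ is enough.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (comm, pid, seconds) for each hung-task report."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(log_text)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python find_hung_tasks.py console.log
    with open(sys.argv[1], errors="replace") as fh:
        for comm, pid, secs in find_hung_tasks(fh.read()):
            print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

The pattern works on the whole text rather than line by line, so it also handles logs whose line breaks were lost in transit.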
This should be fixed in the RHEL 9.2 GA kernel. Please retest with the new kernel, 5.14.0-284.11.1.el9.