Description of problem: After repeatedly logging in and out to iSCSI target using qedi interfaces, call trace was observed on dmesg: [ 7906.133082] scsi 1:0:2:1: Direct-Access LIO-ORG st86backstore 4.0 PQ: 0 ANSI: 6 [ 7906.143294] scsi 1:0:2:1: alua: supports implicit and explicit TPGS [ 7906.150292] scsi 1:0:2:1: alua: device naa.60014056e1c42009a0c47b4a3bfc7365 port group 0 rel port 1 [ 7906.160502] sd 1:0:2:1: Attached scsi generic sg3 type 0 [ 7906.160774] sd 1:0:2:1: [sdd] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [ 7906.175170] sd 1:0:2:1: [sdd] Write Protect is off [ 7906.180522] sd 1:0:2:1: [sdd] Mode Sense: 43 00 00 08 [ 7906.180918] sd 1:0:2:1: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [ 7906.191324] sd 1:0:2:1: [sdd] Preferred minimum I/O size 512 bytes [ 7906.198217] sd 1:0:2:1: [sdd] Optimal transfer size 262144 bytes [ 7906.214771] sd 1:0:2:1: [sdd] Attached SCSI disk [ 7911.550804] [0000:02:00.5]:[qedi_offload_work:888]:2: Offload conn TIMEOUT iscsi_cid=1, qedi_ep=0000000002fa705b [ 7917.139541] scsi 2:0:3:1: Direct-Access LIO-ORG st86backstore 4.0 PQ: 0 ANSI: 6 [ 7917.150408] scsi 2:0:3:1: alua: supports implicit and explicit TPGS [ 7917.157408] scsi 2:0:3:1: alua: device naa.60014056e1c42009a0c47b4a3bfc7365 port group 0 rel port 1 [ 7917.167612] sd 2:0:3:1: Attached scsi generic sg4 type 0 [ 7917.167971] sd 2:0:3:1: [sde] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [ 7917.182327] sd 2:0:3:1: [sde] Write Protect is off [ 7917.187677] sd 2:0:3:1: [sde] Mode Sense: 43 00 00 08 [ 7917.188116] sd 2:0:3:1: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [ 7917.198540] sd 2:0:3:1: [sde] Preferred minimum I/O size 512 bytes [ 7917.205455] sd 2:0:3:1: [sde] Optimal transfer size 262144 bytes [ 7917.223530] sd 2:0:3:1: [sde] Attached SCSI disk [ 7962.361415] device-mapper: multipath: 253:4: Failing path 8:16. [ 7962.370106] [0000:02:00.5]:[qedi_iscsi_event_cb:129]:2: Recv Unknown Event 12 [ 7962.383875] scsi 1:0:2:1: alua: Detached [ 7962.388476] [0000:02:00.4]:[qedi_iscsi_event_cb:129]:1: Recv Unknown Event 12 [ 7962.400898] scsi 1:0:1:1: rejecting I/O to dead device [ 7962.404237] [0000:02:00.5]:[qedi_iscsi_event_cb:129]:2: Recv Unknown Event 12 [ 7962.406673] scsi 1:0:1:1: alua: rtpg failed, result 65536 [ 7962.420240] [0000:02:00.4]:[qedi_iscsi_event_cb:129]:1: Recv Unknown Event 12 [ 7962.420648] naa.60014056e1c42009a0c47b4a3bfc7365: no device found for rtpg [ 7962.436289] device-mapper: multipath: 253:4: Failing path 8:32. [ 7962.442933] device-mapper: multipath: 253:4: Failing path 8:64. [ 7962.449599] I/O error, dev dm-4, sector 41942912 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2 [ 7962.459862] I/O error, dev dm-4, sector 41942912 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2 [ 7962.469673] Buffer I/O error on dev dm-4, logical block 5242864, async page read [ 7963.511808] scsi 2:0:0:1: alua: Detached [ 7963.516347] device-mapper: multipath: 253:4: Failing path 8:32. [ 7963.522987] device-mapper: multipath: 253:4: Failing path 8:64. [ 7963.550810] scsi 1:0:1:1: alua: Detached [ 7963.555324] device-mapper: multipath: 253:4: Failing path 8:64. [ 7963.562132] device-mapper: uevent: dm_send_uevents: skipping sending uevent for lost device [ 7963.576827] scsi 2:0:3:1: alua: Detached [ 8113.278942] INFO: task systemd-udevd:20742 blocked for more than 122 seconds. [ 8113.286904] Not tainted 5.14.0-284.el9.x86_64 #1 [ 8113.292640] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 8113.301382] task:systemd-udevd state:D stack: 0 pid:20742 ppid: 663 flags:0x00004006 [ 8113.310708] Call Trace: [ 8113.313430] <TASK> [ 8113.315762] __schedule+0x248/0x620 [ 8113.319658] ? radix_tree_node_ctor+0x50/0x50 [ 8113.324524] schedule+0x5a/0xc0 [ 8113.328030] io_schedule+0x42/0x70 [ 8113.331827] folio_wait_bit_common+0x147/0x380 [ 8113.336789] ? file_check_and_advance_wb_err+0xd0/0xd0 [ 8113.342527] truncate_inode_pages_range+0x34e/0x500 [ 8113.347991] blkdev_flush_mapping+0x58/0xc0 [ 8113.352664] blkdev_put_whole+0x3a/0x50 [ 8113.356961] blkdev_put+0x9f/0x1b0 [ 8113.360770] blkdev_close+0x18/0x20 [ 8113.364681] __fput+0x91/0x250 [ 8113.368106] task_work_run+0x59/0x90 [ 8113.372112] do_exit+0x262/0x4a0 [ 8113.375732] do_group_exit+0x2d/0x90 [ 8113.379736] get_signal+0x9d3/0xa00 [ 8113.383647] ? new_sync_read+0xf9/0x180 [ 8113.387953] arch_do_signal_or_restart+0x25/0x100 [ 8113.393206] ? vfs_read+0x13c/0x190 [ 8113.397115] exit_to_user_mode_loop+0x9c/0x130 [ 8113.402093] exit_to_user_mode_prepare+0xb6/0x100 [ 8113.407360] syscall_exit_to_user_mode+0x12/0x30 [ 8113.412531] do_syscall_64+0x69/0x90 [ 8113.416538] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 8113.422194] RIP: 0033:0x7fbcfe53eaf2 [ 8113.426204] RSP: 002b:00007ffc1c42fae8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 8113.434671] RAX: fffffffffffffffc RBX: 000055ace72ba9a8 RCX: 00007fbcfe53eaf2 [ 8113.442652] RDX: 0000000000040000 RSI: 000055ace72ba9b8 RDI: 000000000000000f [ 8113.450631] RBP: 000055ace72084e0 R08: 0000000000000003 R09: 0000000000000000 [ 8113.458612] R10: 0000000000040000 R11: 0000000000000246 R12: 000000027ff80000 [ 8113.466591] R13: 0000000000040000 R14: 000055ace72ba990 R15: 000055ace7208530 [ 8113.474590] </TASK> [ 8236.159046] INFO: task systemd-udevd:20742 blocked for more than 245 seconds. [ 8236.167014] Not tainted 5.14.0-284.el9.x86_64 #1 [ 8236.172744] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 8236.181483] task:systemd-udevd state:D stack: 0 pid:20742 ppid: 663 flags:0x00004006 [ 8236.190809] Call Trace: [ 8236.193540] <TASK> [ 8236.195881] __schedule+0x248/0x620 [ 8236.199777] ? radix_tree_node_ctor+0x50/0x50 [ 8236.204641] schedule+0x5a/0xc0 [ 8236.208146] io_schedule+0x42/0x70 [ 8236.211942] folio_wait_bit_common+0x147/0x380 [ 8236.216894] ? file_check_and_advance_wb_err+0xd0/0xd0 [ 8236.222631] truncate_inode_pages_range+0x34e/0x500 [ 8236.228083] blkdev_flush_mapping+0x58/0xc0 [ 8236.232753] blkdev_put_whole+0x3a/0x50 [ 8236.237034] blkdev_put+0x9f/0x1b0 [ 8236.240831] blkdev_close+0x18/0x20 [ 8236.244725] __fput+0x91/0x250 [ 8236.248127] task_work_run+0x59/0x90 [ 8236.252118] do_exit+0x262/0x4a0 [ 8236.255721] do_group_exit+0x2d/0x90 [ 8236.259704] get_signal+0x9d3/0xa00 [ 8236.263599] ? new_sync_read+0xf9/0x180 [ 8236.267881] arch_do_signal_or_restart+0x25/0x100 [ 8236.273132] ? vfs_read+0x13c/0x190 [ 8236.277026] exit_to_user_mode_loop+0x9c/0x130 [ 8236.281990] exit_to_user_mode_prepare+0xb6/0x100 [ 8236.287242] syscall_exit_to_user_mode+0x12/0x30 [ 8236.292387] do_syscall_64+0x69/0x90 [ 8236.296378] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 8236.302016] RIP: 0033:0x7fbcfe53eaf2 [ 8236.306006] RSP: 002b:00007ffc1c42fae8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 8236.314455] RAX: fffffffffffffffc RBX: 000055ace72ba9a8 RCX: 00007fbcfe53eaf2 [ 8236.322420] RDX: 0000000000040000 RSI: 000055ace72ba9b8 RDI: 000000000000000f [ 8236.330386] RBP: 000055ace72084e0 R08: 0000000000000003 R09: 0000000000000000 [ 8236.338352] R10: 0000000000040000 R11: 0000000000000246 R12: 000000027ff80000 [ 8236.346317] R13: 0000000000040000 R14: 000055ace72ba990 R15: 000055ace7208530 [ 8236.354284] </TASK> Version-Release number of selected component (if applicable): kernel-5.14.0-284.el9.x86_64 device-mapper-multipath-libs-0.8.7-20.el9.x86_64 device-mapper-multipath-0.8.7-20.el9.x86_64 systemd-udev-252-8.el9.x86_64 How reproducible: once Additional info: I wasn't sure what component to open this against, please re-assign as needed
So I suspect it was caused by the iSCSI initiator trying to connect to a target which it had no network path to. After running `iscsiadm -m discovery -t st -p 172.16.0.20 -I qedi0` 172.16.1.20:3260,1 iqn.1994-05.com.redhat:stqe-lio 172.16.0.20:3260,1 iqn.1994-05.com.redhat:stqe-lio `iscsiadm -m discovery -t st -p 172.16.1.20 -I qedi1` 172.16.1.20:3260,1 iqn.1994-05.com.redhat:stqe-lio 172.16.0.20:3260,1 iqn.1994-05.com.redhat:stqe-lio target will provide two portals for each interface, however the 172.16.1 is not reachable by qedi0 and 172.16.0 is not reachable by qedi1. So when user instead of specifying a portal and interface simply runs `iscsiadm -m node -l`, timeouts will arise and this call trace can be reproduced.
Reassigning to iscsi-initiator-utils based on comment #1. Assuming that the multipath device was set to queue_if_no_path, this is exactly what you'd expect if you had no paths to your device. I'm not sure that there's actually a bug here. But if there is, and it further investigation points to multipath, feel free to reassign it back.
Hi, I reported one similar bug on RHEL8.9 with high reproducibility: Bug 2217866 - [RHEL8.9] INFO: task systemd-udevd:2691 blocked for more than 120 seconds with over 256 vcpus, please check.