The dasd driver triggered a kernel warning in blk_mq. The full syslog is at: https://bugzilla.redhat.com/attachment.cgi?id=1392197 referenced from: https://bugzilla.redhat.com/show_bug.cgi?id=1538550#c3 +++ This bug was initially created as a clone of Bug #1538550 +++ --- Additional comment from Vendula Poncova on 2018-02-06 17:09 CET --- I am able to reproduce the bug, but I am not sure what is the cause. However, there are some errors in the attached syslog: [...] 15:24:37,332 WARNING kernel:Call Trace: 15:24:37,332 WARNING kernel:([<040000007cb6fc30>] 0x40000007cb6fc30) 15:24:37,332 WARNING kernel: [<00000000004f7664>] blk_mq_run_hw_queue+0x4c/0xb0 15:24:37,332 WARNING kernel: [<00000000004f7730>] blk_mq_run_hw_queues+0x68/0x88 15:24:37,332 WARNING kernel: [<00000000001e0610>] flush_smp_call_function_queue+0xc0/0x160 15:24:37,332 WARNING kernel: [<0000000000116830>] smp_handle_ext_call+0xb8/0xc0 15:24:37,332 WARNING kernel: [<000000000010bc74>] do_ext_interrupt+0xbc/0xe0 15:24:37,332 WARNING kernel: [<00000000001b222c>] __handle_irq_event_percpu+0x64/0x240 15:24:37,332 WARNING kernel: [<00000000001b2446>] handle_irq_event_percpu+0x3e/0x88 15:24:37,332 WARNING kernel: [<00000000001b7154>] handle_percpu_irq+0x6c/0x98 15:24:37,332 WARNING kernel: [<00000000001b11e2>] generic_handle_irq+0x42/0x60 15:24:37,332 WARNING kernel: [<000000000010c08c>] do_IRQ+0x84/0xb8 15:24:37,332 WARNING kernel: [<00000000008488d4>] ext_int_handler+0x100/0x104 15:24:37,332 WARNING kernel: [<0000000000102be6>] enabled_wait+0x4e/0xe0 15:24:37,332 WARNING kernel:([<0000000000000000>] (null)) 15:24:37,332 WARNING kernel: [<0000000000102f22>] arch_cpu_idle+0x32/0x48 15:24:37,332 WARNING kernel: [<00000000001942f8>] do_idle+0xe8/0x170 15:24:37,332 WARNING kernel: [<0000000000194556>] cpu_startup_entry+0x3e/0x48 15:24:37,332 WARNING kernel: [<0000000000116760>] smp_start_secondary+0x120/0x138 15:24:37,332 WARNING kernel: [<0000000000848cd6>] restart_int_handler+0x62/0x78 15:24:37,332 WARNING kernel: [<0000000000000000>] (null) 15:24:37,332 WARNING kernel:Last Breaking-Event-Address: 15:24:37,332 WARNING kernel: [<00000000004f75e4>] __blk_mq_delay_run_hw_queue+0xbc/0xd8 [...] --- Additional comment from Michal Schmidt on 2018-02-23 17:06:31 CET --- [...] The kernel warning looks more important. It's likely a bug in the dasd driver. The warning is this one: static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async, unsigned long msecs) { >>>> if (WARN_ON_ONCE(!blk_mq_hw_queue_mapped(hctx))) return; dasd is not expressly implicated in the call trace, but it's possible for the flow to go like this: __dasd_cleanup_cqr blk_mq_complete_request __blk_mq_complete_request rq->csd.func = __blk_mq_complete_request_remote; smp_call_function_single_async(ctx->cpu, &rq->csd); Then a CPU gets the IPI and handles it: flush_smp_call_function_queue __blk_mq_complete_request_remote rq->q->softirq_done_fn(rq); // tail call. softirq_done_fn was dasd_request_done. dasd_request_done // tail call blk_mq_run_hw_queues The above only explains why the call trace looks the way it does. It does not explain the actual bug. commit 673514aff5d797b0cdb3f09097ae0e253b5667c3 ("s390/dasd: fix race during dasd initialization") supposedly fixed a similar bug before (already in v4.14).
the original report was on kernel-4.15.0-1.fc28.s390x, but I see the same traceback in kernel-4.16.0-0.rc2.git0.1.fc28.s390x too [ 3.353901] WARNING: CPU: 1 PID: 0 at block/blk-mq.c:1400 __blk_mq_delay_run_hw_queue+0x32/0xd8 [ 3.353926] Modules linked in: sha256_s390 sha_common dasd_eckd_mod dasd_mod [ 3.353957] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-0.rc2.git0.1.fc28.s390x #1 [ 3.353976] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) [ 3.353979] Krnl PSW : 00000000fede11d5 00000000a37d6269 (__blk_mq_delay_run_hw_queue+0x32/0xd8) [ 3.353987] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 3.353991] Krnl GPRS: 0000000077643710 0000000000000000 00000000018d0000 0000000000000001 [ 3.353994] 0000000000000000 00000000795fcab4 0000000001d13e00 0000000001d13eb0 [ 3.353998] 000000007cb2bda8 000000007cb2bda8 0000000000000001 0000000000000000 [ 3.354001] 00000000018d0000 00000000772b1e98 000000007cb2bc20 000000007cb2bbe8 [ 3.354021] Krnl Code: 00000000004f89cc: e31021b80012 lt %r1,440(%r2) 00000000004f89d2: a774000a brc 7,4f89e6 #00000000004f89d6: a7f40001 brc 15,4f89d8 >00000000004f89da: ebaff0a00004 lmg %r10,%r15,160(%r15) 00000000004f89e0: 07fe bcr 15,%r14 00000000004f89e2: 47000000 bc 0,0 00000000004f89e6: b90400c2 lgr %r12,%r2 00000000004f89ea: e31021f00002 ltg %r1,496(%r2) [ 3.354058] Call Trace: [ 3.354062] ([<000000000000006e>] 0x6e) [ 3.354066] [<00000000004f8b62>] blk_mq_run_hw_queue+0x82/0x158 [ 3.354071] [<00000000004f8ca0>] blk_mq_run_hw_queues+0x68/0x88 [ 3.354076] [<00000000004f7ece>] __blk_mq_complete_request+0x10e/0x1d8 [ 3.354081] [<00000000004f8022>] blk_mq_complete_request+0x8a/0xc0 [ 3.354097] [<000003ff80031c64>] dasd_block_tasklet+0x14c/0x4a0 [dasd_mod] [ 3.354104] [<0000000000149852>] tasklet_hi_action+0x92/0x118 [ 3.354111] [<0000000000860da4>] __do_softirq+0xf4/0x340 [ 3.354115] [<00000000001492f2>] irq_exit+0xba/0xc8 [ 3.354121] [<000000000010bf98>] do_IRQ+0x80/0xb8 [ 3.354125] [<00000000008600d4>] io_int_handler+0x130/0x280 [ 3.354128] [<0000000000102c2e>] enabled_wait+0x4e/0xd8 [ 3.354131] ([<0000000000000000>] (null)) [ 3.354135] [<0000000000102fa2>] arch_cpu_idle+0x42/0x58 [ 3.354141] [<000000000019334a>] do_idle+0x15a/0x178 [ 3.354146] [<0000000000193546>] cpu_startup_entry+0x3e/0x48 [ 3.354151] [<0000000000116940>] smp_start_secondary+0x120/0x138 [ 3.354155] [<00000000008607ae>] restart_int_handler+0x62/0x78 [ 3.354158] [<0000000000000000>] (null) [ 3.354160] Last Breaking-Event-Address: [ 3.354165] [<00000000004f89d6>] __blk_mq_delay_run_hw_queue+0x2e/0xd8 [ 3.354167] ---[ end trace a3344d0b9198d3e7 ]---
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs. Fedora 28 has now been rebased to 4.17.7-200.fc28. Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 5 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.
For the record - I think it went away in 4.17 kernels. I saw it in <=4.16.