Bug 1471203
| Summary: | BUG: unable to handle kernel NULL pointer dereference at 0000000000000018; IP: get_request+0x14a | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Scott Mayhew <smayhew> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 25 | CC: | bfields, gansalmon, ichavero, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-12-12 10:22:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Scott Mayhew
2017-07-14 16:56:42 UTC
crash> bt
PID: 11739  TASK: ffff94fe3b512680  CPU: 2  COMMAND: "nfsd"
 #0 [ffffa091c1a0f8a8] machine_kexec at ffffffffa505873a
 #1 [ffffa091c1a0f908] __crash_kexec at ffffffffa5139ced
 #2 [ffffa091c1a0f9d0] __crash_kexec at ffffffffa5139dc5
 #3 [ffffa091c1a0f9e8] crash_kexec at ffffffffa5139e0b
 #4 [ffffa091c1a0fa08] oops_end at ffffffffa502a484
 #5 [ffffa091c1a0fa30] no_context at ffffffffa506575f
 #6 [ffffa091c1a0fa98] __bad_area_nosemaphore at ffffffffa5065a31
 #7 [ffffa091c1a0fad8] bad_area_nosemaphore at ffffffffa5065b74
 #8 [ffffa091c1a0fae8] __do_page_fault at ffffffffa5065f1e
 #9 [ffffa091c1a0fb50] trace_do_page_fault at ffffffffa50663f1
#10 [ffffa091c1a0fb88] do_async_page_fault at ffffffffa505fdca
#11 [ffffa091c1a0fba0] async_page_fault at ffffffffa5879e28
    [exception RIP: get_request+0x14a]
    RIP: ffffffffa53e374a  RSP: ffffa091c1a0fc58  RFLAGS: 00010046
    RAX: ffff94fdf6875198  RBX: ffff94fdf6875198  RCX: ffff94fe3b512680
    RDX: 0000000000000000  RSI: ffff94fdf68751c0  RDI: ffffffffa635f260
    RBP: ffffa091c1a0fd08   R8: ffff94fe3fd1ca80   R9: ffff94fe39fe9300
    R10: ffff94fe39fe9300  R11: ffff94fe3b854220  R12: 0000000000000000
    R13: 0000000000000020  R14: ffff94fe38266200  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#12 [ffffa091c1a0fd10] blk_get_request at ffffffffa53e48be
#13 [ffffa091c1a0fd40] nfsd4_scsi_proc_getdeviceinfo at ffffffffc055e158 [nfsd]
#14 [ffffa091c1a0fd78] nfsd4_getdeviceinfo at ffffffffc053e9ec [nfsd]
#15 [ffffa091c1a0fdb8] nfsd4_proc_compound at ffffffffc053f7c1 [nfsd]
#16 [ffffa091c1a0fe18] nfsd_dispatch at ffffffffc052c338 [nfsd]
#17 [ffffa091c1a0fe50] svc_process_common at ffffffffc04c3ec4 [sunrpc]
#18 [ffffa091c1a0feb8] svc_process at ffffffffc04c516e [sunrpc]
#19 [ffffa091c1a0fee0] nfsd at ffffffffc052bd89 [nfsd]
#20 [ffffa091c1a0ff08] kthread at ffffffffa50c5f29
#21 [ffffa091c1a0ff50] ret_from_fork at ffffffffa5878b55

crash> dis -lr get_request+0x14a
...
/usr/src/debug/kernel-4.11.fc25/linux-4.11.9-200.fc25.x86_64/block/blk-core.c: 1054
0xffffffffa53e374a <get_request+0x14a>: mov 0x18(%r15),%rax

static struct request *__get_request(struct request_list *rl, unsigned int op,
                struct bio *bio, gfp_t gfp_mask)
{
        struct request_queue *q = rl->q;
        struct request *rq;
        struct elevator_type *et = q->elevator->type;   <-----HERE
...

crash> request_queue.elevator -o
struct request_queue {
  [0x18] struct elevator_queue *elevator;
}

crash> mount | grep xfs
ffff94fe3ca15080 ffff94fe3ca21800 xfs /dev/mapper/fedora-root /

crash> super_block.s_bdev ffff94fe3ca21800
  s_bdev = 0xffff94fe3b168d00

crash> block_device.bd_disk 0xffff94fe3b168d00
  bd_disk = 0xffff94fdf6779800

crash> gendisk.queue 0xffff94fdf6779800
  queue = 0xffff94fdf6875160

Call chain for nfsd4_scsi_proc_getdeviceinfo:

nfsd4_scsi_proc_getdeviceinfo
  nfsd4_block_get_device_info_scsi
    nfsd4_scsi_identify_device
      blk_get_request

struct request *blk_get_request(struct request_queue *q, int rw, gfp_t gfp_mask)
{
        if (q->mq_ops)
                return blk_mq_alloc_request(q, rw,
                        (gfp_mask & __GFP_DIRECT_RECLAIM) ?
                                0 : BLK_MQ_REQ_NOWAIT);
        else
                return blk_old_get_request(q, rw, gfp_mask);
...

crash> request_queue.mq_ops 0xffff94fdf6875160
  mq_ops = 0x0

So we call blk_old_get_request which calls:

        rq = get_request(q, rw, NULL, gfp_mask);

static struct request *get_request(struct request_queue *q, unsigned int op,
                struct bio *bio, gfp_t gfp_mask)
{
        const bool is_sync = op_is_sync(op);
        DEFINE_WAIT(wait);
        struct request_list *rl;
        struct request *rq;

        rl = blk_get_rl(q, bio);        /* transferred to @rq on success */
retry:
        rq = __get_request(rl, op, bio, gfp_mask);
...

static inline struct request_list *blk_get_rl(struct request_queue *q,
                                              struct bio *bio)
{
        struct blkcg *blkcg;
        struct blkcg_gq *blkg;

        rcu_read_lock();

        blkcg = bio_blkcg(bio);
...
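A side note on the faulting instruction in the disassembly above: mov 0x18(%r15),%rax with R15 = 0 reads from address 0x18, which is the address reported in the oops summary. The sketch below redoes that arithmetic in user-space C. The struct is a hypothetical stand-in, not the kernel definition: only the elevator offset (0x18, from "request_queue.elevator -o") is taken from the crash output, the padding is invented to reproduce it, and the helper name elevator_load_address is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct request_queue. Only the offset of
 * `elevator` (0x18) comes from the crash output; the padding before it
 * is a placeholder for fields that are not modelled here.
 */
struct request_queue {
        char pad0[0x18];        /* placeholder for everything before elevator */
        void *elevator;         /* [0x18] per "request_queue.elevator -o" */
};

/*
 * Address the CPU reads when it loads q->elevator for a queue located at
 * address `q`. With q == 0 (R15 in the register dump) this is the
 * faulting address.
 */
static inline uint64_t elevator_load_address(uint64_t q)
{
        return q + offsetof(struct request_queue, elevator);
}
```

With q = 0, elevator_load_address(0) evaluates to 0x18, so the fault address is consistent with a NULL request_queue pointer rather than a NULL elevator.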
static inline struct blkcg *bio_blkcg(struct bio *bio)
{
        if (bio && bio->bi_css)
                return css_to_blkcg(bio->bi_css);
        return task_blkcg(current);
}

blk_old_get_request() passed NULL for the bio, so we use task_blkcg(current).

static inline struct blkcg *task_blkcg(struct task_struct *tsk)
{
        return css_to_blkcg(task_css(tsk, io_cgrp_id));
}

static inline struct cgroup_subsys_state *task_css(struct task_struct *task,
                                                   int subsys_id)
{
        return task_css_check(task, subsys_id, false);
}

#define task_css_check(task, subsys_id, __c) \
        task_css_set_check((task), (__c))->subsys[(subsys_id)]

#define task_css_set_check(task, __c) \
        rcu_dereference((task)->cgroups)

static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css)
{
        return css ? container_of(css, struct blkcg, css) : NULL;
}

crash> task -R cgroups ffff94fe3b512680
PID: 11739  TASK: ffff94fe3b512680  CPU: 2  COMMAND: "nfsd"
  cgroups = 0xffffffffa5e86360,

crash> io_cgrp_id
enum cgroup_subsys_id = 3

crash> p ((struct css_set *)0xffffffffa5e86360)->subsys[3]
$7 = (struct cgroup_subsys_state *) 0xffffffffa635f260

crash> blkcg.css -o
struct blkcg {
  [0x0] struct cgroup_subsys_state css;
}

The css is at offset 0 of struct blkcg, so 0xffffffffa635f260 is also the address of the blkcg.

Continuing on in blk_get_rl() we have:

...
        if (blkcg == &blkcg_root)
                goto root_rl;
...
root_rl:
        rcu_read_unlock();
        return &q->root_rl;
...

crash> p &blkcg_root
$8 = (struct blkcg *) 0xffffffffa635f260

crash> px &((struct request_queue *)0xffff94fdf6875160)->root_rl
$9 = (struct request_list *) 0xffff94fdf6875198

crash> request_list.q 0xffff94fdf6875198
  q = 0x0

So blk_get_rl() returned &q->root_rl, whose q pointer is NULL, and when __get_request() tried to access request_queue->elevator (offset 0x18) through it we oopsed at address 0x18.

This message is a reminder that Fedora 25 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '25'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 25 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
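For anyone revisiting this against a later release, the request-list selection traced in the description can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: the struct definitions are hypothetical stand-ins that keep just the fields the trace touches, the bio carries a blkcg pointer directly instead of bi_css, the non-root cgroup path (elided in the description) is not modelled, and every model_* name is invented for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types named in the description. */
struct blkcg { int unused; };
struct request_queue;
struct request_list { struct request_queue *q; };
struct request_queue { struct request_list root_rl; };
struct bio { struct blkcg *blkcg; };    /* stands in for bio->bi_css */

static struct blkcg blkcg_root;         /* stands in for the kernel's blkcg_root */

/* Mirrors bio_blkcg(): use the bio's cgroup if present, else the task's. */
static inline struct blkcg *model_bio_blkcg(struct bio *bio,
                                            struct blkcg *task_blkcg)
{
        if (bio && bio->blkcg)
                return bio->blkcg;
        return task_blkcg;
}

/*
 * Mirrors the root-cgroup shortcut in blk_get_rl(). The non-root path is
 * not shown in the description, so this model simply returns NULL for it.
 */
static inline struct request_list *model_blk_get_rl(struct request_queue *q,
                                                    struct bio *bio,
                                                    struct blkcg *task_blkcg)
{
        struct blkcg *blkcg = model_bio_blkcg(bio, task_blkcg);

        if (blkcg == &blkcg_root)
                return &q->root_rl;
        return NULL;
}

/* Returns 1 if reading rl->q->elevator, as __get_request() does, would fault. */
static inline int model_would_oops(const struct request_list *rl)
{
        return rl->q == NULL;
}
```

Calling model_blk_get_rl() with a NULL bio and a task in the root blkcg returns &q->root_rl, and with root_rl.q left NULL (as in the dump) model_would_oops() reports 1, which is the state the vmcore shows.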