Description of problem:

https://lore.kernel.org/linux-block/20210811142624.618598-1-ming.lei@redhat.com/T/#u

blk-mq: fix kernel panic during iterating over flush request

For fixing use-after-free during iterating over requests, we grabbed request's refcount before calling ->fn in commit 2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter"). Turns out this way may cause kernel panic when iterating over one flush request:

1) old flush request's tag is just released, and this tag is reused by one new request, but ->rqs[] isn't updated yet

2) the flush request can be re-used for submitting one new flush command, so blk_rq_init() is called at the same time

3) meantime blk_mq_queue_tag_busy_iter() is called, and old flush request is retrieved from ->rqs[tag]; when blk_mq_put_rq_ref() is called, flush_rq->end_io may not be updated yet, so NULL pointer dereference is triggered in blk_mq_put_rq_ref().

Fix the issue by calling refcount_set(&flush_rq->ref, 1) after flush_rq->end_io is set. So far the only other caller of blk_rq_init() is scsi_ioctl_reset() in which the request doesn't enter block IO stack and the request reference count isn't used, so the change is safe.

Fixes: 2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter")
Reported-by: "Blank-Burian, Markus, Dr." <blankburian>
Tested-by: "Blank-Burian, Markus, Dr." <blankburian>
Signed-off-by: Ming Lei <ming.lei>

2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter") has been merged to rhel8.5

Version-Release number of selected component (if applicable):

How reproducible:
About one time after running some container workloads for 30 minutes

Steps to Reproduce:
N/A

Actual results:
kernel panic when running some container workloads via openstack

Expected results:
No kernel panic and the workloads can be run successfully

Additional info:
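To make the ordering argument in the description concrete, here is a minimal single-threaded userspace sketch, not the kernel code. It assumes a simplified request with only a refcount and an end_io pointer; the names model_rq, model_get_ref, iterate_once, init_ref_first and init_end_io_first are hypothetical and exist only for this illustration. The "iterator" is invoked by hand inside the re-initialisation window to stand in for a concurrent blk_mq_queue_tag_busy_iter() call, so no real concurrency or memory barriers are modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a flush request; NOT the kernel's struct request. */
struct model_rq {
    atomic_int ref;
    void (*end_io)(struct model_rq *rq);
};

/* Hypothetical end_io: drops the reference the iterator took. */
static void model_end_io(struct model_rq *rq)
{
    atomic_fetch_sub(&rq->ref, 1);
}

/* Same idea as refcount_inc_not_zero(): only take a reference on a live object. */
static bool model_get_ref(struct model_rq *rq)
{
    int old = atomic_load(&rq->ref);

    while (old != 0) {
        if (atomic_compare_exchange_weak(&rq->ref, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * One pass of the iterator over the request. Returns false when the real
 * kernel path would have called a NULL end_io from blk_mq_put_rq_ref().
 */
static bool iterate_once(struct model_rq *rq)
{
    if (!model_get_ref(rq))
        return true;                    /* ref is 0: request is skipped */
    if (rq->end_io == NULL) {
        atomic_fetch_sub(&rq->ref, 1);  /* undo; the kernel would oops here */
        return false;
    }
    rq->end_io(rq);
    return true;
}

/* Order before the fix: the refcount becomes 1 while end_io is still NULL. */
static bool init_ref_first(struct model_rq *rq)
{
    bool safe;

    rq->end_io = NULL;
    atomic_store(&rq->ref, 1);
    safe = iterate_once(rq);            /* iterator runs inside the window */
    rq->end_io = model_end_io;
    return safe;
}

/* Order after the fix: the refcount is set only once end_io is in place. */
static bool init_end_io_first(struct model_rq *rq)
{
    bool safe;

    rq->end_io = NULL;
    atomic_store(&rq->ref, 0);
    safe = iterate_once(rq);            /* skipped: ref is still 0 */
    rq->end_io = model_end_io;
    safe = iterate_once(rq) && safe;    /* still skipped */
    atomic_store(&rq->ref, 1);
    return safe;
}
```

In this model init_ref_first() reports the unsafe case because the iterator obtains a reference while end_io is NULL, whereas init_end_io_first() keeps the iterator out until the request is fully set up. That matches the reasoning in the patch description, under the stated simplifications.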
Sanity test passed with kernel-4.18.0-340.el8:
https://beaker.engineering.redhat.com/jobs/5775992
https://beaker.engineering.redhat.com/jobs/5776549

Cannot reproduce this issue with the blktests tests.

All patches have been included in the kernel tree:

$ git log kernel-4.18.0-340.el8 --oneline --grep=1992700
7e74656663d7 Merge: blk-mq: fix kernel panic when iterating over flush request
fd9ee21126cb blk-mq: fix is_flush_rq
964bb31688ac blk-mq: fix kernel panic during iterating over flush request

Moving to Verified + SanityOnly.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4356