Red Hat Bugzilla – Bug 156412
kernel dm-mpath: Panic due to bio left queued to multipath work queue when multipath mapped device is destroyed.
Last modified: 2010-01-11 21:20:28 EST
Description of problem:
This problem was occurring for me back in February with a 2.6.10-based
upstream kernel, yet I don't see anything in the RH AS 4 Update 1
or upstream 2.6.12-rc2 kernels which fixes it.
trigger_event() has panicked maybe 3 or 4 times now, with one simple dd
process running while upgrading ucode on an EMC CLARiiON.
I think the problem is this: when dm_swap_table() swaps the old
table/map of a multipath mapped device for a new one, the
trigger_event work_struct field of the old map's multipath structure
may still be actively queued to a cpu work queue when table_destroy()
sets about de-allocating the memory for the old table and the
dm_target, and calling the multipath destructor multipath_dtr() to
de-allocate the multipath structure memory. If any of these things
happen before the work queue entry is serviced, trigger_event() will
refer to stale dm_table, dm_target, or multipath structure memory.
This problem does not happen with the process_queued_ios() event
callback, because those entries are flushed and waited for by
dm_suspend() before dm_resume() is called.
In this case, the panic occurs when trigger_event() dereferences
the ti (struct dm_target pointer) field of the old multipath
structure: the dm_target structure memory for the multipath target
was de-allocated via vfree(), so the reference through the ti field
lands in unmapped kernel memory.
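To make the race window concrete, here is a minimal sketch using the
2.6-era workqueue API; the names (mpath_like, trigger_event_fn,
queue_event, and so on) are hypothetical stand-ins, not the actual
dm-mpath code:

#include <linux/workqueue.h>
#include <linux/slab.h>

struct mpath_like {
	struct work_struct trigger_event;	/* embedded entry, queued to keventd */
	int state;
};

static void trigger_event_fn(void *data)
{
	struct mpath_like *m = data;
	m->state++;			/* use-after-free if m was already kfree()d */
}

static void queue_event(struct mpath_like *m)
{
	INIT_WORK(&m->trigger_event, trigger_event_fn, m);
	schedule_work(&m->trigger_event);	/* entry now sits on the default queue */
}

static void dtr_broken(struct mpath_like *m)
{
	kfree(m);			/* BUG: the work item may still be pending */
}

static void dtr_fixed(struct mpath_like *m)
{
	flush_scheduled_work();		/* wait for pending entries to run first */
	kfree(m);
}

If dtr_broken() runs between queue_event() and the work queue thread
servicing the entry, trigger_event_fn() touches freed memory, which is
exactly the stale-reference panic described above.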
Several solutions come to mind, namely (1) using reference counts
for the multipath structure and (2) flushing the default work queue
in the multipath destructor. I've included a patch for the latter in this
email. I likely made the patch against a 2.6.10.rcxxx kernel, so the
line numbers won't mean too much. Also, calling flush_workqueue() on
the kmultipathd workqueue would apply to the upstream code, which uses
a multipath-specific work queue instead of the system default one.
*** dm-mpath.c.orig	2005-02-22 10:11:47.000000000 -0500
--- dm-mpath.c	2005-02-22 12:03:15.900426560 -0500
***************
*** 747,752 ****
--- 747,754 ----
  static void multipath_dtr(struct dm_target *ti)
  {
  	struct multipath *m = (struct multipath *) ti->private;
  
+ 	flush_scheduled_work();
+ 
  	free_multipath(m);
  }
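For the upstream code that already uses a multipath-specific work
queue, the equivalent fix would look roughly like this (a sketch,
assuming kmultipathd is the driver's workqueue_struct):

static void multipath_dtr(struct dm_target *ti)
{
	struct multipath *m = (struct multipath *) ti->private;

	/* Wait for any queued trigger_event entries that still point
	 * at this multipath structure before freeing it. */
	flush_workqueue(kmultipathd);
	free_multipath(m);
}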
Should that be flush_workqueue(kmultipathd)?
And this is global across all multipath tables: that's OK for events, but slow
for process_queued_ios. Should we have two workqueues (one for events, one for
queued I/O), or one workqueue per table?
Yes, it should be flush_workqueue(kmultipathd), that's what the SLES kernel uses.
I don't think it's time-critical enough to have two workqueues.
OK; will stick to one workqueue for now. Any need for two may go away when we
address the problem of losing context when swapping tables.
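For context, a single driver-wide workqueue like kmultipathd is
typically created once at module init and destroyed at exit; a rough
sketch with the 2.6-era API (error handling trimmed, target
registration elided):

static struct workqueue_struct *kmultipathd;

static int __init dm_multipath_init(void)
{
	kmultipathd = create_workqueue("kmultipathd");
	if (!kmultipathd)
		return -ENOMEM;
	/* ... register the multipath target, etc. ... */
	return 0;
}

static void __exit dm_multipath_exit(void)
{
	/* ... unregister the target ... */
	destroy_workqueue(kmultipathd);
}

Work items are then queued with queue_work(kmultipathd,
&m->trigger_event) instead of schedule_work(), keeping multipath
events off the system default queue.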
patch submitted to -mm
Has the patch been accepted to the upstream kernel? If so, will the changes be
incorporated into RHEL 4.0 U3? If not, is there an ETA as to when the patch
might be accepted?
This fix has made it upstream and should be in U2 (indicated by ON_QA status).
Ed - have you verified that the fix is included in the RHEL 4.0 U2 beta?