156412 – kernel dm-mpath: Panic due to bio left queued to multipath work queue when multipath mapped device is destroyed.

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	device-mapper-multipath
Sub Component:
Version:	4.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Alasdair Kergon
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-04-29 21:08 UTC by Ed Goggin
Modified:	2010-01-12 02:20 UTC
CC List:	11 users
Fixed In Version:	U2
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2006-03-08 15:25:50 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Novell	81682	0	None	None	None	Never

Description Ed Goggin 2005-04-29 21:08:23 UTC

Description of problem:

This problem was occurring for me back in February with a 2.6.10-based
upstream kernel.  Yet, I don't see anything in the RH AS 4 Update 1
or upstream 2.6.12-rc2 kernel which fixes the problem.

trigger_event() has panicked maybe 3 or 4 times now, with one simple dd
process running while upgrading ucode on an EMC CLARiiON.

I think the problem is that when the old table/map for a multipath
mapped device is swapped for a new one by dm_swap_table(), the
trigger_event work_struct field of the old map's multipath structure
may still be queued to a cpu work queue when table_destroy() sets
about de-allocating the memory for the old table and the dm_target,
and calls the multipath destructor multipath_dtr() to de-allocate the
multipath structure memory.  If any of these things happen before the
work queue entry is serviced, trigger_event() will refer to stale
dm_table, dm_target, or multipath structure memory.  This problem does
not happen with the process_queued_ios() event callback because these
entries are flushed and waited for by dm_suspend() before dm_resume()
calls dm_swap_table().

In this case, the panic occurs when trigger_event() de-references
the ti struct dm_target field of the old multipath structure.  The
dm_target structure memory for the multipath target was already
de-allocated via vfree(), so the pointer held in the ti field of the
multipath structure refers to unmapped kernel memory.
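
The ordering described above can be modeled in userspace.  The sketch
below is only an illustration, not kernel code: struct target, struct
mpath, the one-slot queue and every function name are hypothetical
stand-ins, and a "live" flag takes the place of really freeing memory so
that the stale access can be detected safely instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real definitions. */
struct target { bool live; };                    /* models struct dm_target */
struct mpath  { struct target *ti; bool live; }; /* models struct multipath */

/* One-slot model of a cpu work queue holding a trigger_event entry. */
static struct mpath *pending;

static void queue_trigger_event(struct mpath *m)
{
	assert(pending == NULL);	/* the model holds a single entry */
	pending = m;
}

/*
 * Models table_destroy() plus multipath_dtr() without any flush: both
 * objects are released while the queue entry still points at them.
 */
static void destroy_without_flush(struct mpath *m)
{
	m->ti->live = false;
	m->live = false;
}

/*
 * Models the worker servicing the queue.  Returns true when the entry
 * refers to an object that was already released, which is the point
 * where the real trigger_event() would touch stale memory.
 */
static bool worker_sees_stale_object(void)
{
	struct mpath *m = pending;

	if (m == NULL)
		return false;
	pending = NULL;
	return !m->live || !m->ti->live;
}
```

Queuing an entry, calling destroy_without_flush() and then servicing the
queue makes worker_sees_stale_object() return true; servicing the queue
before the destroy makes it return false.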

Several solutions come to mind, namely (1) using reference counts
for the multipath structure and (2) flushing the default work queue
in the multipath destructor.  I've included a patch for the latter in this
email.  I likely made the patch for a 2.6.10.rcxxx kernel, so the
line numbers won't mean too much.  Also, the upstream code uses a
multipath-specific work queue (kmultipathd) instead of the system default
one, so there the fix would be a flush_workqueue() call on kmultipathd.


*** dm-mpath.c.orig	2005-02-22 10:11:47.000000000 -0500
--- dm-mpath.c	2005-02-22 12:03:15.900426560 -0500
***************
*** 747,752 ****
--- 747,754 ----
  static void multipath_dtr(struct dm_target *ti)
  {
  	struct multipath *m = (struct multipath *) ti->private;
+ 
+ 	flush_scheduled_work();
  	free_multipath(m);
  }
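
The effect of flushing before freeing can be illustrated with a small
userspace model.  This is a sketch under stated assumptions rather than
the kernel implementation: the array-backed queue, flush_model() and the
*_model names are hypothetical, and the flush drains the queue
synchronously, whereas the kernel call waits for a worker thread to
finish the pending entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of struct multipath; "live" stands in for allocation. */
struct mpath { bool live; int events_delivered; };

struct work { void (*fn)(struct mpath *); struct mpath *arg; };

#define WORKQ_MAX 8
static struct work workq[WORKQ_MAX];	/* models the work queue */
static size_t queued;

/* Append an entry; returns false if the model queue is full. */
static bool queue_work_model(void (*fn)(struct mpath *), struct mpath *m)
{
	if (queued == WORKQ_MAX)
		return false;
	workq[queued].fn = fn;
	workq[queued].arg = m;
	queued++;
	return true;
}

/* Models the flush: every pending entry runs before this returns. */
static void flush_model(void)
{
	size_t i;

	for (i = 0; i < queued; i++)
		workq[i].fn(workq[i].arg);
	queued = 0;
}

/* Models trigger_event(): it must only ever see a live structure. */
static void trigger_event_model(struct mpath *m)
{
	assert(m->live);
	m->events_delivered++;
}

/* Models the patched destructor: flush first, then release. */
static void multipath_dtr_model(struct mpath *m)
{
	flush_model();
	m->live = false;	/* stands in for free_multipath(m) */
}
```

With an event queued before multipath_dtr_model() is called, the event
is delivered while the structure is still live and the queue is empty by
the time the structure is released.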

Comment 1 Alasdair Kergon 2005-07-01 13:29:14 UTC

Should that be 
  flush_workqueue(kmultipathd)  ?

And this is global across all multipath tables: that's OK for events, but slow
for process_queued_ios.  Should we have 2 workqueues (one for events, one for
queued io), or one workqueue per table?

Comment 2 Lars Marowsky-Bree 2005-07-01 13:51:41 UTC

Yes, it should be flush_workqueue(kmultipathd), that's what the SLES kernel uses.

I don't think it's time-critical enough to have two workqueues.

Comment 3 Alasdair Kergon 2005-07-01 14:28:05 UTC

OK; will stick to one workqueue for now.  Any need for 2 may go away when we
address the problem of losing context when swapping table.

Comment 4 Alasdair Kergon 2005-07-08 19:57:03 UTC

patch submitted to -mm

Comment 6 Heather Conway 2005-08-30 19:26:34 UTC

Has the patch been accepted to the upstream kernel?  If so, will the changes be 
incorporated into RHEL 4.0 U3?  If not, is there an ETA as to when the patch 
might be accepted?
Thanks.
Heather

Comment 7 Alasdair Kergon 2005-08-30 19:50:27 UTC

This fix has made it upstream and should be in U2 (indicated by ON_QA status).

Comment 8 Heather Conway 2005-09-15 15:01:22 UTC

Ed - have you verified that the fix is included in the RHEL 4.0 U2 beta?
Thanks.
Heather
