Created attachment 361076 [details]
lockdep warning log

Description of problem:
Shortly after booting a system that uses mdraid with external metadata support to access 2 BIOS RAID 10 sets sharing 4 disks, I get the attached locking-inconsistency warning. This might be related to one of the sets being unclean and needing a resync (so mdmon is actively syncing the set).

If I do something that causes a significant amount of disk IO while the sync is running, the kernel locks up, and I get hung / stuck task detected messages every 120 seconds, so this warning seems to be very real. Let me know if you want me to hook up a serial cable and capture the stuck task reports.

This is with:
kernel-2.6.31-2.fc12.i686.PAE
and older kernels too.
Created attachment 361145 [details]
Proposed patch: upgrade sysfs_open_dirent_lock to spin_lock_bh
The attached patch simply upgrades the lock. It makes sysfs_notify_dirent() more useful and is cleaner than adding logic to md to delay the notification to process context.
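For readers without the attachment handy, a minimal sketch of the kind of change described (this is not the actual patch; the function body shown is only illustrative of the 2.6.31-era sysfs_notify_dirent()):

```c
/* Sketch only, not the proposed patch itself: the idea is to replace the
 * plain spin_lock()/spin_unlock() on sysfs_open_dirent_lock with the
 * softirq-safe _bh variants, so sysfs_notify_dirent() can be called from
 * timer/softirq context (as md does) as well as from process context
 * without lockdep flagging an inconsistent lock state.
 */
void sysfs_notify_dirent(struct sysfs_dirent *sd)
{
	struct sysfs_open_dirent *od;

	spin_lock_bh(&sysfs_open_dirent_lock);	/* was: spin_lock() */

	od = sd->s_attr.open;
	if (od) {
		atomic_inc(&od->event);
		wake_up_interruptible(&od->poll);
	}

	spin_unlock_bh(&sysfs_open_dirent_lock);	/* was: spin_unlock() */
}
```

Disabling bottom halves while the lock is held prevents a softirq on the same CPU from re-acquiring it, which is the lock-state inconsistency lockdep reported.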
Neil already sent a fix for this [1] in response to bz#515471.

[1]: http://marc.info/?l=linux-kernel&m=124953744023803&w=2

*** This bug has been marked as a duplicate of bug 515471 ***
Ok, I've built a kernel with this fix in and the lockdep report is gone, but I still get deadlocks with my two-RAID10-set setup. I'll attach dmesg output of a machine with all processes trying to use the raid sets hanging.

Re-opening this one to track the deadlock case.
Created attachment 361300 [details]
dmesg output of a machine with all processes trying to use the raid sets hanging
It looks like the processes are waiting for mdmon to write 'active' to /sys/block/md*/md/array_state. Is mdmon still running at this point? Can you dump array_state to confirm that we are stuck at 'write-pending'?

Finally, can you say a bit more about what userspace is doing at this point? In the log the arrays are bouncing up and down (starting/stopping).
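A quick way to gather both data points on the affected machine might look like the following (a sketch, not from the bug report; the sysfs paths are the standard md locations, but the md* glob should be narrowed to the arrays in question):

```shell
# Hypothetical diagnostic helper: report whether mdmon is still alive,
# then dump array_state for every md array present so a 'write-pending'
# hang is visible at a glance.
check_md_state() {
    if pgrep mdmon >/dev/null 2>&1; then
        echo "mdmon is running"
    else
        echo "mdmon is NOT running"
    fi
    for f in /sys/block/md*/md/array_state; do
        # The glob stays literal when no md arrays exist; skip it then.
        [ -e "$f" ] || continue
        printf '%s: %s\n' "$f" "$(cat "$f")"
    done
}
check_md_state
```

If mdmon has died, no one is left to transition the array out of 'write-pending', which would explain every writer blocking.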
(In reply to comment #6)
> It looks like the processes are waiting for mdmon to write 'active' to
> /sys/block/md*/md/array_state. Is mdmon still running at this point? Can you
> dump array_state to confirm that we are stuck at 'write-pending'?

When I hit this again (if I hit this again) I'll be sure to try and gather all this info.

> Finally can
> you say a bit more about what userspace is doing at this point, in the log the
> arrays are bouncing up and down (starting/stopping)?

That is correct: the mdraid container code is shared with the normal mdraid handling code in anaconda, and during install everything first gets scanned (so started) and then torn down again, so that, for example, partitions used as part of a native mdraid set can be repurposed to hold a PV or whatever. So the arrays are stopped / started several times.

Thanks for the hint that this might be mdmon, though. It helped me fix a big problem with my test machine no longer booting at all, which was caused by an mdmon segfault inside the initrd. I've written a patch fixing this, see bug 523860.
Created attachment 361436 [details]
Picture of another mdraid related stuck task

A slightly different call trace from another stuck mdraid task, this time during the initrd. Note that this initrd still has the crashy mdmon; I was trying to boot the machine to regenerate the initrd and it hung. So this could very well be another mdmon-no-longer-running case.

I'll attach another call trace picture from the same boot, which is yet again slightly different.
Created attachment 361437 [details]
Picture of another mdraid related stuck task (2)
I no longer seem to be seeing this now that I've managed to keep mdmon from crashing. Adjusting summary.
Ok, I can no longer reproduce this with the necessary patches in place to properly handle mdmon handover from the initrd to the running system and to not kill mdmon on reboot / halt. Closing.