Bug 576156

Summary:	INFO: possible circular locking dependency detected - 2.6.33-1.fc13.i686 - mdadm/3174 is trying to acquire lock
Product:	[Fedora] Fedora	Reporter:	James Laska <jlaska>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED WONTFIX	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	medium	Docs Contact:
Priority:	low
Version:	13	CC:	anton, dougsland, gansalmon, itamar, jonathan, jturner, kernel-maint, matti.aarnio
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2011-06-27 15:15:02 UTC	Type:	---
Bug Blocks:	507684

Description James Laska 2010-03-23 12:43:44 UTC

Description of problem:

While installing F-13-Beta-i386-TC1 in a RAID configuration, a kernel lock dependency (lockdep) warning appears on the console.

Version-Release number of selected component (if applicable):

 * anaconda-13.36
 * kernel-2.6.33-1.fc13.i686
 * mdadm-3.0.3-3.fc13.i686

How reproducible:

 * Seen only on the i686 install so far


Steps to Reproduce:
1. Follow test instructions at https://fedoraproject.org/wiki/QA/TestCases/PartitioningRootfsOnRaid1

Using a 2 disk configuration:
 vda1 - 500M - /boot
 vda2 - 4G   - RAID member
 vda3 - ~4G  - RAID member
 vdb1 - 2G   - swap
  
Actual results:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.33-1.fc13.i686 #1
-------------------------------------------------------
mdadm/3174 is trying to acquire lock:
 (&bdev->bd_mutex){+.+.+.}, at: [<c050139f>] __blkdev_get+0x7e/0x32d

but task is already holding lock:
 (&new->reconfig_mutex){+.+.+.}, at: [<c06eaae7>] md_ioctl+0xb9/0xf05

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&new->reconfig_mutex){+.+.+.}:
       [<c046390b>] __lock_acquire+0xa23/0xb89
       [<c0463b04>] lock_acquire+0x93/0xb1
       [<c07b1b55>] __mutex_lock_common+0x32/0x30a
       [<c07b1e62>] mutex_lock_interruptible_nested+0x35/0x3c
       [<c06e4111>] md_attr_show+0x2e/0x55
       [<c0524e11>] sysfs_read_file+0x9c/0x13f
       [<c04e02ef>] vfs_read+0x82/0xe1
       [<c04e03ec>] sys_read+0x40/0x62
       [<c07b343c>] syscall_call+0x7/0xb

-> #1 (s_active){++++.+}:
       [<c046390b>] __lock_acquire+0xa23/0xb89
       [<c0463b04>] lock_acquire+0x93/0xb1
       [<c0525c1c>] sysfs_addrm_finish+0x9f/0x117
       [<c0525cc0>] remove_dir+0x2c/0x32
       [<c0525d5a>] sysfs_remove_dir+0x85/0xa0
       [<c05bce80>] kobject_del+0xf/0x2c
       [<c05bcf5c>] kobject_release+0xbf/0x1b0
       [<c05bdd7d>] kref_put+0x39/0x42
       [<c05bce19>] kobject_put+0x37/0x3c
       [<c0521b0b>] delete_partition+0x41/0x5e
       [<c052207b>] rescan_partitions+0x59/0x3ba
       [<c05b3373>] blkdev_ioctl+0x5a9/0x692
       [<c05004f0>] block_ioctl+0x35/0x3d
       [<c04eaf99>] vfs_ioctl+0x2c/0x96
       [<c04eb54c>] do_vfs_ioctl+0x49b/0x4d9
       [<c04eb5d0>] sys_ioctl+0x46/0x66
       [<c07b343c>] syscall_call+0x7/0xb

-> #0 (&bdev->bd_mutex){+.+.+.}:
       [<c046380d>] __lock_acquire+0x925/0xb89
       [<c0463b04>] lock_acquire+0x93/0xb1
       [<c07b1b55>] __mutex_lock_common+0x32/0x30a
       [<c07b1eda>] mutex_lock_nested+0x35/0x3d
       [<c050139f>] __blkdev_get+0x7e/0x32d
       [<c050165d>] blkdev_get+0xf/0x11
       [<c0501786>] open_by_devnum+0x29/0x35
       [<c06e4f2a>] lock_rdev+0x2c/0xb8
       [<c06e5085>] md_import_device+0xcf/0x26a
       [<c06e94d0>] add_new_disk+0x5f/0x3cf
       [<c06eb402>] md_ioctl+0x9d4/0xf05
       [<c05b2a95>] __blkdev_driver_ioctl+0x35/0x8f
       [<c05b342d>] blkdev_ioctl+0x663/0x692
       [<c05004f0>] block_ioctl+0x35/0x3d
       [<c04eaf99>] vfs_ioctl+0x2c/0x96
       [<c04eb54c>] do_vfs_ioctl+0x49b/0x4d9
       [<c04eb5d0>] sys_ioctl+0x46/0x66
       [<c07b343c>] syscall_call+0x7/0xb

other info that might help us debug this:

1 lock held by mdadm/3174:
 #0:  (&new->reconfig_mutex){+.+.+.}, at: [<c06eaae7>] md_ioctl+0xb9/0xf05

stack backtrace:
Pid: 3174, comm: mdadm Not tainted 2.6.33-1.fc13.i686 #1
Call Trace:
 [<c07b08a3>] ? printk+0x14/0x19
 [<c0462bb3>] print_circular_bug+0x91/0x9d
 [<c046380d>] __lock_acquire+0x925/0xb89
 [<c0463b04>] lock_acquire+0x93/0xb1
 [<c050139f>] ? __blkdev_get+0x7e/0x32d
 [<c07b1b55>] __mutex_lock_common+0x32/0x30a
 [<c050139f>] ? __blkdev_get+0x7e/0x32d
 [<c05b34f7>] ? exact_match+0x0/0xd
 [<c07b1eda>] mutex_lock_nested+0x35/0x3d
 [<c050139f>] ? __blkdev_get+0x7e/0x32d
 [<c050139f>] __blkdev_get+0x7e/0x32d
 [<c050027b>] ? bdev_test+0x0/0x18
 [<c050165d>] blkdev_get+0xf/0x11
 [<c0501786>] open_by_devnum+0x29/0x35
 [<c06e4f2a>] lock_rdev+0x2c/0xb8
 [<c06e4fb6>] ? md_import_device+0x0/0x26a
 [<c06e5085>] md_import_device+0xcf/0x26a
 [<c04612b9>] ? lock_release_holdtime+0x31/0xd6
 [<c04641f6>] ? lock_release_non_nested+0xb5/0x1e8
 [<c06e94d0>] add_new_disk+0x5f/0x3cf
 [<c04c0b9c>] ? might_fault+0x4c/0x86
 [<c04c0b9c>] ? might_fault+0x4c/0x86
 [<c04c0bd1>] ? might_fault+0x81/0x86
 [<c05c372f>] ? _copy_from_user+0x36/0x119
 [<c06eb402>] md_ioctl+0x9d4/0xf05
 [<c058a96c>] ? avc_has_perm_noaudit+0x359/0x364
 [<c058a9b8>] ? avc_has_perm+0x41/0x4b
 [<c058ddb0>] ? inode_has_perm+0x8b/0xa6
 [<c06eaa2e>] ? md_ioctl+0x0/0xf05
 [<c05b2a95>] __blkdev_driver_ioctl+0x35/0x8f
 [<c05b342d>] blkdev_ioctl+0x663/0x692
 [<c04d5111>] ? check_valid_pointer+0x21/0x4d
 [<c04d5918>] ? check_object+0x132/0x166
 [<c05004f0>] block_ioctl+0x35/0x3d
 [<c058df46>] ? file_has_perm+0x8f/0xa9
 [<c05004f0>] ? block_ioctl+0x35/0x3d
 [<c04eaf99>] vfs_ioctl+0x2c/0x96
 [<c05004bb>] ? block_ioctl+0x0/0x3d
 [<c04eb54c>] do_vfs_ioctl+0x49b/0x4d9
 [<c058e1ea>] ? selinux_file_ioctl+0x43/0x46
 [<c04eb5d0>] sys_ioctl+0x46/0x66
 [<c07b343c>] syscall_call+0x7/0xb
 [<c07b0000>] ? acpi_processor_add+0x492/0x8db
md: bind<vda2>
md: bind<vda3>
md0: WARNING: vda2 appears to be on the same physical disk as vda3.
True protection against single-disk failure might be compromised.
raid1: md0 is not clean -- starting background reconstruction
raid1: raid set md0 active with 2 out of 2 mirrors
md0: bitmap initialized from disk: read 1/1 pages, set 28000 bits
created bitmap (14 pages) for device md0
md0: detected capacity change from 0 to 7340023808
md: resync of RAID array md0
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 7167992 blocks.
 md0: unknown partition table
Adding 2097112k swap on /dev/vdb1.  Priority:-1 extents:1 across:2097112k 
EXT4-fs (md0): mounted filesystem with ordered data mode
SELinux: initialized (dev md0, type ext4), uses xattr
EXT4-fs (vda1): mounted filesystem with ordered data mode
SELinux: initialized (dev vda1, type ext4), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
JBD: barrier-based sync failed on vda1-8 - disabling barriers
JBD: barrier-based sync failed on md0-8 - disabling barriers
SELinux: initialized (dev 0:16, type nfs4), uses genfs_contexts
mount.nfs used greatest stack depth: 4628 bytes left
ISO 9660 Extensions: Microsoft Joliet Level 3
ISO 9660 Extensions: RRIP_1991A
SELinux: initialized (dev loop2, type iso9660), uses genfs_contexts
SELinux: initialized (dev 0:16, type nfs4), uses genfs_contexts
ISO 9660 Extensions: Microsoft Joliet Level 3
ISO 9660 Extensions: RRIP_1991A
SELinux: initialized (dev loop2, type iso9660), uses genfs_contexts
ISO 9660 Extensions: Microsoft Joliet Level 3
ISO 9660 Extensions: RRIP_1991A
SELinux: initialized (dev loop1, type iso9660), uses genfs_contexts
SELinux: 8192 avtab hash slots, 161894 rules.
SELinux: 8192 avtab hash slots, 161894 rules.
SELinux:  8 users, 12 roles, 3108 types, 154 bools, 1 sens, 1024 cats
SELinux:  77 classes, 161894 rules


Expected results:


Additional info:
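
For readers unfamiliar with lockdep output, the chain above can be read as a lock-order graph: the existing chain says bd_mutex was held while taking s_active (#1), and s_active was held while taking reconfig_mutex (#2). mdadm now holds reconfig_mutex and asks for bd_mutex (#0), which would close the loop. The Python sketch below only illustrates that reading and is not kernel code. The shortened lock names and the helpers find_path and check_new_edge are made up for this example.

```python
from typing import Dict, List, Optional, Set

# Edges mean "held -> acquired next", as read from entries #1 and #2
# of the report. Lock names are shortened for illustration.
EXISTING_ORDER: Dict[str, List[str]] = {
    "bd_mutex": ["s_active"],
    "s_active": ["reconfig_mutex"],
}


def find_path(graph: Dict[str, List[str]], start: str, goal: str,
              seen: Optional[Set[str]] = None) -> Optional[List[str]]:
    """Depth-first search for a path from start to goal in the order graph."""
    if seen is None:
        seen = set()
    if start == goal:
        return [start]
    seen.add(start)
    for nxt in graph.get(start, []):
        if nxt in seen:
            continue
        rest = find_path(graph, nxt, goal, seen)
        if rest is not None:
            return [start] + rest
    return None


def check_new_edge(graph: Dict[str, List[str]], held: str,
                   wanted: str) -> Optional[List[str]]:
    """Return the cycle that taking `wanted` while holding `held` would
    close, or None if the new ordering is consistent with the graph."""
    path = find_path(graph, wanted, held)
    if path is None:
        return None
    return [held] + path


# mdadm holds reconfig_mutex and tries to take bd_mutex (entry #0).
cycle = check_new_edge(EXISTING_ORDER, "reconfig_mutex", "bd_mutex")
print(" -> ".join(cycle))
```

With the edges above, this prints reconfig_mutex -> bd_mutex -> s_active -> reconfig_mutex. A cycle in the recorded ordering is what lockdep reports. It does not mean a deadlock actually occurred during the install.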

Comment 1 Chuck Ebbert 2010-03-30 23:06:30 UTC

*** Bug 558230 has been marked as a duplicate of this bug. ***

Comment 2 Chuck Ebbert 2010-03-30 23:12:03 UTC

Looks like this upstream commit is causing some false positives:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=846f99749ab68bbc7f75c74fec305de675b1a1bf

The fix is here:

http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=6992f5334995af474c2b58d010d08bc597f0f2fe

I don't think this needs to be a blocker bug.

Comment 3 Chuck Ebbert 2010-03-30 23:16:16 UTC

Also needs:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a2db6842873c8e5a70652f278d469128cb52db70

Comment 4 James Laska 2010-04-16 19:39:26 UTC

Thanks for the feedback, Chuck.  Agreed at the F13Blocker meeting to move this issue to F13Target (nice to have), since it doesn't prevent RAID installations and you've already identified the upstream commit that may be causing these false positives.  This may be added to Common_F13_Bugs later if more reports surface.

Comment 5 James Laska 2010-04-30 12:33:02 UTC

FYI, I'm seeing this with F-13-Final-TC1.  I know this isn't supposed to be fixed.  There was some discussion during blocker meetings that the 'possible circular locking dependency' messages would go away once debugging was disabled in the kernel.  Just posting here so folks know these messages still appear.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.33.2-57.fc13.x86_64 #1
-------------------------------------------------------
mdadm/1042 is trying to acquire lock:
 (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff81147072>] __blkdev_get+0x91/0x3a8
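
To compare reports like this one against the original i686 report, it can help to pull out the task, pid, lock class and function from the "is trying to acquire lock" header. The following is a minimal Python sketch, assuming the header keeps the two-line layout shown in this bug. The names LockdepHeader and parse_header are hypothetical and not part of any existing tool.

```python
import re
from typing import NamedTuple, Optional


class LockdepHeader(NamedTuple):
    task: str       # command name, e.g. "mdadm"
    pid: int
    lock: str       # lock class as printed by lockdep
    function: str   # function where the acquire was attempted


# "mdadm/3174 is trying to acquire lock:"
TASK_RE = re.compile(r"^(\S+)/(\d+) is trying to acquire lock:\s*$", re.M)
# " (&bdev->bd_mutex){+.+.+.}, at: [<c050139f>] __blkdev_get+0x7e/0x32d"
LOCK_RE = re.compile(
    r"^\s*\(([^)]+)\)\{[^}]*\}, at: \[<[0-9a-f]+>\] (\w+)\+", re.M)


def parse_header(text: str) -> Optional[LockdepHeader]:
    """Extract the acquiring task and the wanted lock, or None if absent."""
    task = TASK_RE.search(text)
    if task is None:
        return None
    # Look for the lock line only after the task line.
    lock = LOCK_RE.search(text, task.end())
    if lock is None:
        return None
    return LockdepHeader(task.group(1), int(task.group(2)),
                         lock.group(1), lock.group(2))


I686_REPORT = (
    "mdadm/3174 is trying to acquire lock:\n"
    " (&bdev->bd_mutex){+.+.+.}, at: [<c050139f>] __blkdev_get+0x7e/0x32d\n"
)
X86_64_REPORT = (
    "mdadm/1042 is trying to acquire lock:\n"
    " (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff81147072>] "
    "__blkdev_get+0x91/0x3a8\n"
)

a = parse_header(I686_REPORT)
b = parse_header(X86_64_REPORT)
# Same lock class and function on both architectures suggests the same report.
print(a.lock == b.lock and a.function == b.function)
```

Matching on lock class and function rather than on addresses or offsets is a deliberate choice here, since those differ between the i686 and x86_64 kernels.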

Comment 6 James Laska 2010-05-24 20:15:57 UTC

Removing from CommonBugs since these messages no longer appear with F-13-RC3.

Comment 7 Bug Zapper 2011-06-02 16:00:16 UTC

This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Bug Zapper 2011-06-27 15:15:02 UTC

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.