Bug 1152813

Summary: lockdep splat: xfs + splice()
Product: [Fedora] Fedora Reporter: Colin Walters <walters>
Component: kernelAssignee: fedora-kernel-xfs
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 22CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab
Target Milestone: ---Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-19 12:13:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Colin Walters 2014-10-15 02:31:57 UTC
I was using "atomic upgrade" which calls into libostree.  Notably it uses splice() with a pipe.

[14689.265161] ======================================================
[14689.265175] [ INFO: possible circular locking dependency detected ]
[14689.265186] 3.18.0-0.rc0.git2.1.fc22.x86_64 #1 Not tainted
[14689.265190] -------------------------------------------------------
[14689.265199] atomic/1144 is trying to acquire lock:
[14689.265203]  (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [<ffffffffa01465ba>] xfs_file_buffered_aio_write.isra.10+0x7a/0x310 [xfs]
[14689.265245] 
but task is already holding lock:
[14689.265249]  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8126937e>] pipe_lock+0x1e/0x20
[14689.265262] 
which lock already depends on the new lock.

[14689.265268] 
the existing dependency chain (in reverse order) is:
[14689.265287] 
-> #2 (&pipe->mutex/1){+.+.+.}:
[14689.265296]        [<ffffffff810ffde4>] lock_acquire+0xa4/0x1d0
[14689.265303]        [<ffffffff8183e5b5>] mutex_lock_nested+0x85/0x440
[14689.265310]        [<ffffffff8126937e>] pipe_lock+0x1e/0x20
[14689.265315]        [<ffffffff8129836a>] splice_to_pipe+0x2a/0x260
[14689.265321]        [<ffffffff81298b9f>] __generic_file_splice_read+0x57f/0x620
[14689.265328]        [<ffffffff81298c7b>] generic_file_splice_read+0x3b/0x90
[14689.265334]        [<ffffffffa0145b20>] xfs_file_splice_read+0xb0/0x1e0 [xfs]
[14689.265350]        [<ffffffff812976ac>] do_splice_to+0x6c/0x90
[14689.265356]        [<ffffffff81299e7d>] SyS_splice+0x6dd/0x800
[14689.265362]        [<ffffffff81842f69>] system_call_fastpath+0x16/0x1b
[14689.265368] 
-> #1 (&(&ip->i_iolock)->mr_lock){++++++}:
[14689.265424]        [<ffffffff810ffde4>] lock_acquire+0xa4/0x1d0
[14689.265494]        [<ffffffff810f87be>] down_write_nested+0x5e/0xc0
[14689.265553]        [<ffffffffa0153529>] xfs_ilock+0xb9/0x1c0 [xfs]
[14689.265629]        [<ffffffffa01465c7>] xfs_file_buffered_aio_write.isra.10+0x87/0x310 [xfs]
[14689.265693]        [<ffffffffa01468da>] xfs_file_write_iter+0x8a/0x130 [xfs]
[14689.265749]        [<ffffffff8126019e>] new_sync_write+0x8e/0xd0
[14689.265811]        [<ffffffff81260a3a>] vfs_write+0xba/0x200
[14689.265862]        [<ffffffff812616ac>] SyS_write+0x5c/0xd0
[14689.265912]        [<ffffffff81842f69>] system_call_fastpath+0x16/0x1b
[14689.265963] 
-> #0 (&sb->s_type->i_mutex_key#13){+.+.+.}:
[14689.266024]        [<ffffffff810ff45e>] __lock_acquire+0x1b0e/0x1c10
[14689.266024]        [<ffffffff810ffde4>] lock_acquire+0xa4/0x1d0
[14689.266024]        [<ffffffff8183e5b5>] mutex_lock_nested+0x85/0x440
[14689.266024]        [<ffffffffa01465ba>] xfs_file_buffered_aio_write.isra.10+0x7a/0x310 [xfs]
[14689.266024]        [<ffffffffa01468da>] xfs_file_write_iter+0x8a/0x130 [xfs]
[14689.266024]        [<ffffffff81297ffc>] iter_file_splice_write+0x2ec/0x4b0
[14689.266024]        [<ffffffff81299b21>] SyS_splice+0x381/0x800
[14689.266024]        [<ffffffff81842f69>] system_call_fastpath+0x16/0x1b
[14689.266024] 
other info that might help us debug this:

[14689.266024] Chain exists of:
  &sb->s_type->i_mutex_key#13 --> &(&ip->i_iolock)->mr_lock --> &pipe->mutex/1

[14689.266024]  Possible unsafe locking scenario:

[14689.266024]        CPU0                    CPU1
[14689.266024]        ----                    ----
[14689.266024]   lock(&pipe->mutex/1);
[14689.266024]                                lock(&(&ip->i_iolock)->mr_lock);
[14689.266024]                                lock(&pipe->mutex/1);
[14689.266024]   lock(&sb->s_type->i_mutex_key#13);
[14689.266024] 
 *** DEADLOCK ***

[14689.266024] 2 locks held by atomic/1144:
[14689.266024]  #0:  (sb_writers#8){.+.+.+}, at: [<ffffffff81299f1f>] SyS_splice+0x77f/0x800
[14689.266024]  #1:  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff8126937e>] pipe_lock+0x1e/0x20
[14689.266024] 
stack backtrace:
[14689.266024] CPU: 0 PID: 1144 Comm: atomic Not tainted 3.18.0-0.rc0.git2.1.fc22.x86_64 #1
[14689.266024] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[14689.266024]  0000000000000000 00000000fd91796b ffff88003793bad0 ffffffff81838f3e
[14689.266024]  ffffffff82c03eb0 ffff88003793bb10 ffffffff81836b36 ffff88003793bb70
[14689.266024]  ffff88003828a670 ffff880038289a40 0000000000000002 ffff880038289ab0
[14689.266024] Call Trace:
[14689.266024]  [<ffffffff81838f3e>] dump_stack+0x4d/0x66
[14689.266024]  [<ffffffff81836b36>] print_circular_bug+0x201/0x20f
[14689.266024]  [<ffffffff810ff45e>] __lock_acquire+0x1b0e/0x1c10
[14689.266024]  [<ffffffff8101c8c0>] ? dump_trace+0x170/0x350
[14689.266024]  [<ffffffff810ffde4>] lock_acquire+0xa4/0x1d0
[14689.266024]  [<ffffffffa01465ba>] ? xfs_file_buffered_aio_write.isra.10+0x7a/0x310 [xfs]
[14689.266024]  [<ffffffff8183e5b5>] mutex_lock_nested+0x85/0x440
[14689.266024]  [<ffffffffa01465ba>] ? xfs_file_buffered_aio_write.isra.10+0x7a/0x310 [xfs]
[14689.266024]  [<ffffffffa01465ba>] ? xfs_file_buffered_aio_write.isra.10+0x7a/0x310 [xfs]
[14689.266024]  [<ffffffff810fd36c>] ? mark_held_locks+0x7c/0xb0
[14689.266024]  [<ffffffffa01465ba>] xfs_file_buffered_aio_write.isra.10+0x7a/0x310 [xfs]
[14689.266024]  [<ffffffff8126937e>] ? pipe_lock+0x1e/0x20
[14689.266024]  [<ffffffffa01468da>] xfs_file_write_iter+0x8a/0x130 [xfs]
[14689.266024]  [<ffffffff81297ffc>] iter_file_splice_write+0x2ec/0x4b0
[14689.266024]  [<ffffffff81299b21>] SyS_splice+0x381/0x800
[14689.266024]  [<ffffffff81842f69>] system_call_fastpath+0x16/0x1b

Comment 1 Josh Boyer 2014-10-16 11:54:03 UTC
Reported upstream.

Comment 2 Jaroslav Reznik 2015-03-03 16:22:12 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 3 Fedora End Of Life 2016-07-19 12:13:30 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.