Bug 716991

Summary: [xfs] concurrent aio/dio got stuck when writing
Product: Red Hat Enterprise Linux 6 Reporter: Mike Gahagan <mgahagan>
Component: kernelAssignee: Jeff Moyer <jmoyer>
Status: CLOSED DUPLICATE QA Contact: Filesystem QE <fs-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1CC: branto, chellwig, dchinner, eguan, esandeen, jmoyer, kzhang, rwheeler
Target Milestone: rc   
Target Release: 6.2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 718232 (view as bug list) Environment:
Last Closed: 2011-10-17 13:13:49 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 718232    
Attachments:
Description Flags
Add in a missing mutex_unlock in xfs_write none

Description Mike Gahagan 2011-06-27 16:59:03 UTC
Description of problem:
This appears to be very similar to bz 697750, except that the hang seems to occur when writing rather than reading. The diotest4 test from ltp seems to hang and cannot run to completion.

INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 


Version-Release number of selected component (if applicable):
RHEL 6.1

How reproducible:
frequent if not always.

Steps to Reproduce:
1.schedule a run of /kernel/distribution/ltp/generic in beaker with /mnt/testarea mounted on an xfs filesystem
2. The dio set of testcases is what is causing the failure
3.
  
Actual results:

Tests hang with:

INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:13:51 /usr/bin/rhts-test-runner.sh 14495 1980 hearbeat...  
logger: 2011-06-22 10:14:51 /usr/bin/rhts-test-runner.sh 14495 2040 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
[-- MARK -- Wed Jun 22 10:15:00 2011] 
logger: 2011-06-22 10:16:51 /usr/bin/rhts-test-runner.sh 14495 2160 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:17:51 /usr/bin/rhts-test-runner.sh 14495 2220 hearbeat...  
logger: 2011-06-22 10:18:51 /usr/bin/rhts-test-runner.sh 14495 2280 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:19:51 /usr/bin/rhts-test-runner.sh 14495 2340 hearbeat...  
[-- MARK -- Wed Jun 22 10:20:00 2011] 
logger: 2011-06-22 10:20:51 /usr/bin/rhts-test-runner.sh 14495 2400 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:22:51 /usr/bin/rhts-test-runner.sh 14495 2520 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:23:51 /usr/bin/rhts-test-runner.sh 14495 2580 hearbeat...  
logger: 2011-06-22 10:24:51 /usr/bin/rhts-test-runner.sh 14495 2640 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
[-- MARK -- Wed Jun 22 10:25:00 2011] 
logger: 2011-06-22 10:25:51 /usr/bin/rhts-test-runner.sh 14495 2700 hearbeat...  
logger: 2011-06-22 10:26:51 /usr/bin/rhts-test-runner.sh 14495 2760 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:27:51 /usr/bin/rhts-test-runner.sh 14495 2820 hearbeat...  
logger: 2011-06-22 10:28:51 /usr/bin/rhts-test-runner.sh 14495 2880 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:29:51 /usr/bin/rhts-test-runner.sh 14495 2940 hearbeat...  
[-- MARK -- Wed Jun 22 10:30:00 2011] 
logger: 2011-06-22 10:30:52 /usr/bin/rhts-test-runner.sh 14495 3000 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 


Expected results:

Test runs to completion

Additional info:

Comment 2 Jeff Moyer 2011-06-28 17:06:41 UTC
Kernel: 2.6.32-131.0.15.el6.x86_64

Comment 3 Jeff Moyer 2011-06-28 20:04:16 UTC
The following commit:

commit 43a4125b0dde1d346cad1a177781dc112c52c820
Author: Dave Chinner <dchinner>
Date:   Fri Jan 14 09:30:27 2011 -0500

    [fs] xfs: serialise unaligned direct IOs

introduced this problem.  There is one error case that forgets to drop the i_mutex before returning EINVAL.  Untested patch on the way.

Comment 4 Jeff Moyer 2011-06-28 20:05:44 UTC
Created attachment 510356 [details]
Add in a missing mutex_unlock in xfs_write

Comment 9 Jeff Moyer 2011-10-17 13:13:49 UTC
This issue was addressed with the patchset in bug 695827, so I'm closing this bugzilla as a duplicate of that one.

*** This bug has been marked as a duplicate of bug 695827 ***