Bug 716991 - [xfs] concurrent aio/dio got stuck when writing
[xfs] concurrent aio/dio got stuck when writing
Status: CLOSED DUPLICATE of bug 695827
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel (Show other bugs)
6.1
x86_64 Linux
medium Severity medium
: rc
: 6.2
Assigned To: Jeffrey Moyer
Filesystem QE
:
Depends On:
Blocks: 718232
  Show dependency treegraph
 
Reported: 2011-06-27 12:59 EDT by Mike Gahagan
Modified: 2013-03-03 19:22 EST (History)
8 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 718232 (view as bug list)
Environment:
Last Closed: 2011-10-17 09:13:49 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Add in a missing mutex_unlock in xfs_write (400 bytes, patch)
2011-06-28 16:05 EDT, Jeffrey Moyer
no flags Details | Diff

  None (edit)
Description Mike Gahagan 2011-06-27 12:59:03 EDT
Description of problem:
This appears to be very similar to bz 697750, except that the hang seems to occur when writing rather than reading. The diotest4 test from ltp seems to hang and cannot run to completion.

INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 


Version-Release number of selected component (if applicable):
RHEL 6.1

How reproducible:
frequent if not always.

Steps to Reproduce:
1.schedule a run of /kernel/distribution/ltp/generic in beaker with /mnt/testarea mounted on an xfs filesystem
2. The dio set of testcases is what is causing the failure
3.
  
Actual results:

Tests hang with:

INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:13:51 /usr/bin/rhts-test-runner.sh 14495 1980 hearbeat...  
logger: 2011-06-22 10:14:51 /usr/bin/rhts-test-runner.sh 14495 2040 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
[-- MARK -- Wed Jun 22 10:15:00 2011] 
logger: 2011-06-22 10:16:51 /usr/bin/rhts-test-runner.sh 14495 2160 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:17:51 /usr/bin/rhts-test-runner.sh 14495 2220 hearbeat...  
logger: 2011-06-22 10:18:51 /usr/bin/rhts-test-runner.sh 14495 2280 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:19:51 /usr/bin/rhts-test-runner.sh 14495 2340 hearbeat...  
[-- MARK -- Wed Jun 22 10:20:00 2011] 
logger: 2011-06-22 10:20:51 /usr/bin/rhts-test-runner.sh 14495 2400 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:22:51 /usr/bin/rhts-test-runner.sh 14495 2520 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:23:51 /usr/bin/rhts-test-runner.sh 14495 2580 hearbeat...  
logger: 2011-06-22 10:24:51 /usr/bin/rhts-test-runner.sh 14495 2640 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
[-- MARK -- Wed Jun 22 10:25:00 2011] 
logger: 2011-06-22 10:25:51 /usr/bin/rhts-test-runner.sh 14495 2700 hearbeat...  
logger: 2011-06-22 10:26:51 /usr/bin/rhts-test-runner.sh 14495 2760 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:27:51 /usr/bin/rhts-test-runner.sh 14495 2820 hearbeat...  
logger: 2011-06-22 10:28:51 /usr/bin/rhts-test-runner.sh 14495 2880 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 
logger: 2011-06-22 10:29:51 /usr/bin/rhts-test-runner.sh 14495 2940 hearbeat...  
[-- MARK -- Wed Jun 22 10:30:00 2011] 
logger: 2011-06-22 10:30:52 /usr/bin/rhts-test-runner.sh 14495 3000 hearbeat...  
INFO: task diotest4:75916 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
diotest4      D 0000000000000008     0 75916  75893 0x00000080 
 ffff88082aeefe58 0000000000000082 ffff88082aeefdb8 ffffffffa01f09a1 
 ffff88082aeefee8 ffffffff8117241a ffffffff8100bc8e ffff88082aeefe58 
 ffff8807fb4bf0f8 ffff88082aeeffd8 000000000000f598 ffff8807fb4bf0f8 
Call Trace: 
 [<ffffffffa01f09a1>] ? xfs_file_aio_write+0x61/0x70 [xfs] 
 [<ffffffff8117241a>] ? do_sync_write+0xfa/0x140 
 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 
 [<ffffffff8104b7db>] ? mutex_spin_on_owner+0x9b/0xc0 
 [<ffffffff814dc84e>] __mutex_lock_slowpath+0x13e/0x180 
 [<ffffffff814dc6eb>] mutex_lock+0x2b/0x50 
 [<ffffffff811728d6>] generic_file_llseek+0x36/0x70 
 [<ffffffff811712da>] vfs_llseek+0x3a/0x40 
 [<ffffffff81172a86>] sys_lseek+0x66/0x80 
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b 


Expected results:

Test runs to completion

Additional info:
Comment 2 Jeffrey Moyer 2011-06-28 13:06:41 EDT
Kernel: 2.6.32-131.0.15.el6.x86_64
Comment 3 Jeffrey Moyer 2011-06-28 16:04:16 EDT
The following commit:

commit 43a4125b0dde1d346cad1a177781dc112c52c820
Author: Dave Chinner <dchinner@redhat.com>
Date:   Fri Jan 14 09:30:27 2011 -0500

    [fs] xfs: serialise unaligned direct IOs

introduced this problem.  There is one error case that forgets to drop the i_mutex before returning EINVAL.  Untested patch on the way.
Comment 4 Jeffrey Moyer 2011-06-28 16:05:44 EDT
Created attachment 510356 [details]
Add in a missing mutex_unlock in xfs_write
Comment 9 Jeffrey Moyer 2011-10-17 09:13:49 EDT
This issue was addressed with the patchset in bug 695827, so I'm closing this bugzilla as a duplicate of that one.

*** This bug has been marked as a duplicate of bug 695827 ***

Note You need to log in before you can comment on or make changes to this bug.