From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050427 Red Hat/1.7.7-1.1.3.4

Description of problem:
GFS deadlock due to the following lock sequence:

(vfs) do_truncate {
    down(.... i_sem);
    down_write(.... i_alloc_sem);
    err = notify_change(dentry, &newattrs);
        -> call gfs_setattr
            -> glock
    up_write(.... i_alloc_sem);
    up(.... i_sem);
}

(gfs) do_write_direct {
    glock
    down(.... i_sem)
    __blockdev_direct_IO
        -> down(.... i_alloc_sem)
        -> up(.... i_alloc_sem)
    up(.... i_sem)
}

(gfs) gfs_read {
    glock
    __blockdev_direct_IO
        -> down(.... i_sem)
        -> up(.... i_sem)
}

Version-Release number of selected component (if applicable):
2.6.9-22.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. (Running Oracle performance stress test)
2. Haven't tried using a simpler test case.

Actual Results:
Oracle process hangs.

Additional info:
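The lock sequence in this report can be checked mechanically. What follows is a minimal userspace sketch in C, not kernel code: for each path it records which lock is acquired while another is held, then looks for a cycle in the resulting ordering graph. The names lock_graph, add_path, has_lock_cycle and reported_order_can_deadlock are hypothetical and exist in neither the kernel nor GFS. The sketch also assumes that every lock in a path is still held when the next one is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lock names taken from the report; the numbering is arbitrary. */
enum lock_id { I_SEM, I_ALLOC_SEM, GLOCK, NLOCKS };

/* edge[a][b] is true when some path acquires b while holding a. */
struct lock_graph { bool edge[NLOCKS][NLOCKS]; };

/* Record the ordering constraints implied by one code path. */
void add_path(struct lock_graph *g, const enum lock_id *seq, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(seq[i] < NLOCKS);
        for (size_t j = i + 1; j < n; j++)
            g->edge[seq[i]][seq[j]] = true;
    }
}

/* Depth-first search; state: 0 = unvisited, 1 = on stack, 2 = done. */
static bool dfs(const struct lock_graph *g, int v, int *state)
{
    state[v] = 1;
    for (int w = 0; w < NLOCKS; w++) {
        if (!g->edge[v][w])
            continue;
        if (state[w] == 1)
            return true;            /* back edge: ordering cycle */
        if (state[w] == 0 && dfs(g, w, state))
            return true;
    }
    state[v] = 2;
    return false;
}

/* A cycle means two paths can each hold what the other wants. */
bool has_lock_cycle(const struct lock_graph *g)
{
    int state[NLOCKS] = {0};

    for (int v = 0; v < NLOCKS; v++)
        if (state[v] == 0 && dfs(g, v, state))
            return true;
    return false;
}

/* The three paths exactly as listed in the description above. */
bool reported_order_can_deadlock(void)
{
    static const enum lock_id truncate_path[] = { I_SEM, I_ALLOC_SEM, GLOCK };
    static const enum lock_id write_path[]    = { GLOCK, I_SEM, I_ALLOC_SEM };
    static const enum lock_id read_path[]     = { GLOCK, I_SEM };
    struct lock_graph g = {0};

    add_path(&g, truncate_path, 3);
    add_path(&g, write_path, 3);
    add_path(&g, read_path, 2);
    return has_lock_cycle(&g);
}
```

In this model reported_order_can_deadlock() returns true: do_truncate takes the glock while holding i_sem, whereas the two GFS paths take i_sem while holding the glock. A graph built from a single ordering (i_sem, i_alloc_sem, glock) has no cycle.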
Bugzilla 173912 - two GFS deadlocks during Oracle performance runs, due to the lock order enforced by the VFS layer, where:

1) In sys_ftruncate()->do_truncate(), the VFS layer grabs
   * i_sem
   * then i_alloc_sem
   * then calls the filesystem's setattr().
2) In the Direct IO read path, the VFS layer calls
   * the filesystem's direct_IO()
   * then grabs i_sem
   * followed by i_alloc_sem.

The above lock sequences would undoubtedly cause deadlocks for any external kernel module that requires extra locking of its own. In the GFS case, both gfs_setattr() and gfs_direct_IO() need GFS's own (global) locks to synchronize inter-node (and inter-process) access to control structures. Two deadlocks (bugzilla 171488, 173913 - bugzilla 173912 is for base kernel build) have been found so far.

To work around this issue, a test kernel that forces GFS to take the two inode semaphores earlier in the read code path was sent out before the Thanksgiving break. Successful results were reported back. The new RPMs require a base kernel change that allows GFS to bypass the inode semaphore acquisition within __blockdev_direct_IO. It basically does the following:

(vfs) sys_ftruncate()->do_truncate() {
    down(i_sem);
    down_write(i_alloc_sem);
    notify_change();
        -> call gfs_setattr
            -> g-lock
            -> g-unlock
    up_write(i_alloc_sem);
    up(i_sem);
}

(gfs) gfs_write (O_DIRECT) {
    down(i_sem)
    down(i_alloc_sem)
    g-lock
    __blockdev_direct_IO
        -> new DIO_GFS_LOCKING { don't get the inode semaphores }
    g-unlock
    up(i_alloc_sem)
    up(i_sem)
}

(gfs) gfs_read (O_DIRECT) {
    down(i_sem)
    down(i_alloc_sem)
    g-lock
    __blockdev_direct_IO
        -> new DIO_GFS_LOCKING { don't get the inode semaphore }
    g-unlock
    up(i_alloc_sem)
    up(i_sem)
}

I would appreciate it if this patch could make it into U3. Comments?
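The need for a base kernel change can be illustrated with a toy model, presumably matching the reasoning above: once GFS takes the inode semaphores itself, a direct I/O helper that takes them again would block on a lock its own caller holds. The sketch below is userspace C under stated assumptions, not the actual patch. Every toy_* name is hypothetical, TOY_DIO_GFS_LOCKING only stands in for the DIO_GFS_LOCKING mode named above, and the helper taking both semaphores in the default mode is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive semaphore (single-threaded model). */
struct toy_sem { bool held; };

/* Returns false where a real down() would sleep forever. */
static bool toy_down(struct toy_sem *s)
{
    if (s->held)
        return false;
    s->held = true;
    return true;
}

static void toy_up(struct toy_sem *s)
{
    assert(s->held);
    s->held = false;
}

struct toy_inode { struct toy_sem i_sem, i_alloc_sem, glock; };

/* Hypothetical stand-ins for the direct I/O locking modes. */
enum toy_dio_mode { TOY_DIO_LOCKING, TOY_DIO_GFS_LOCKING };

/* Model of the direct I/O helper called with the glock held. */
static bool toy_direct_io(struct toy_inode *ino, enum toy_dio_mode mode)
{
    if (mode == TOY_DIO_LOCKING) {
        /* Helper takes the inode semaphores itself. */
        if (!toy_down(&ino->i_sem))
            return false;
        if (!toy_down(&ino->i_alloc_sem)) {
            toy_up(&ino->i_sem);
            return false;
        }
        /* ... the transfer would happen here ... */
        toy_up(&ino->i_alloc_sem);
        toy_up(&ino->i_sem);
        return true;
    }
    /* New mode: skip the semaphores, the caller must hold them. */
    return ino->i_sem.held && ino->i_alloc_sem.held;
}

/* Ordering proposed above: i_sem, i_alloc_sem, then the glock. */
bool toy_gfs_direct_rw(struct toy_inode *ino, enum toy_dio_mode mode)
{
    bool ok = false;

    if (!toy_down(&ino->i_sem))
        return false;
    if (!toy_down(&ino->i_alloc_sem))
        goto out_sem;
    if (!toy_down(&ino->glock))
        goto out_alloc;

    ok = toy_direct_io(ino, mode);

    toy_up(&ino->glock);
out_alloc:
    toy_up(&ino->i_alloc_sem);
out_sem:
    toy_up(&ino->i_sem);
    return ok;
}
```

In this model, toy_gfs_direct_rw() fails with TOY_DIO_LOCKING, because the helper tries to take an i_sem that its caller already holds, and succeeds with TOY_DIO_GFS_LOCKING. All three paths then use one lock order, matching the order do_truncate already imposes.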
This bugzilla was opened with the expectation that we needed to change the base kernel to work around the above deadlock. However, after further investigation, we've found that if GFS switches from DIO_LOCKING to DIO_NO_LOCKING (currently used by raw devices), we can work around the deadlock. So I'm closing this as a non-bug. The GFS changes will use bugzilla 173913.
Unfortunately, we have to come back to this issue. We can't work around it with DIO_NO_LOCKING. The request has been sent to the kernel group for review. Waiting for the response.
Created attachment 121777
gfs_kernel_i_alloc.patch

New patch submitted to kernel group on 12/02/05.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0132.html