Red Hat Bugzilla – Bug 138384
posix_locks_deadlock() getting stuck in endless loop
Last modified: 2007-11-30 17:07:05 EST
*** This bug has been split off bug 132540 ***
Description of problem:
posix_locks_deadlock() is getting stuck in an endless loop when
running samba stress. This is because samba is using both flocks and
posix locks, and, when flocks are blocked, they are added to the
blocked_list without first checking for possible deadlocks with the
function posix_locks_deadlock()--whereas all posix lock requests are
checked for possible deadlocks *before* they are added to
blocked_list. When there is a circular dependency in blocked_list,
posix_locks_deadlock() gets stuck in that circle.
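To illustrate why a cycle in blocked_list is fatal for the checker, here is a minimal user-space sketch of a waits-for chain walk. It is not the kernel code: struct waiter, chain_walk() and the max_steps budget are hypothetical names invented for this illustration, and the step budget exists only so the demonstration terminates. A walk with no such budget would spin forever on the same input.

```c
#include <assert.h>

/* Hypothetical stand-in for one entry on blocked_list:
 * "owner" is blocked waiting for a lock held by "blocker_owner". */
struct waiter {
	int owner;
	int blocker_owner;
};

/*
 * Follow the waits-for chain starting at "blocker" and look for "caller".
 * Returns 1 if the chain leads back to the caller (a real deadlock),
 * 0 if the chain ends, and -1 if the step budget runs out, which is
 * what happens when the chain enters a cycle that excludes the caller.
 */
static int chain_walk(const struct waiter *list, int n,
		      int caller, int blocker, int max_steps)
{
	int steps = 0;
	int i;

	assert(max_steps > 0);

next_owner:
	if (blocker == caller)
		return 1;
	if (++steps > max_steps)
		return -1;	/* an unbounded walk would never leave here */
	for (i = 0; i < n; i++) {
		if (list[i].owner == blocker) {
			/* The blocker is itself blocked: follow the chain. */
			blocker = list[i].blocker_owner;
			goto next_owner;
		}
	}
	return 0;
}
```

With entries {2 waits on 3} and {3 waits on 2} already on the list, a request from owner 1 that is blocked by owner 2 never reaches owner 1 and never runs off the end of the chain, so the walk only stops because of the budget.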
The fix is to not add flock requests to blocked_list when they are
blocked. blocked_list is only used to check for possible deadlocks,
which should only be done for posix locks. Here's a patch:
--- locks.c.orig 2004-09-14 11:12:26.000000000 -0500
+++ locks.c 2004-09-14 11:13:32.000000000 -0500
@@ -459,7 +459,8 @@ static void locks_insert_block(struct fi
 	waiter->fl_next = blocker;
-	list_add(&waiter->fl_link, &blocked_list);
+	if (IS_POSIX(blocker))
+		list_add(&waiter->fl_link, &blocked_list);
 	/* Wake up processes blocked waiting for blocker.
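The effect of the patch can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel implementation: enum lock_kind, struct lock_req, insert_block() and blocked_count() are hypothetical names, and a singly linked list stands in for blocked_list. As in the patch, the check is made on the blocker's type, and the waiter still records its blocker in either case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock families, standing in for the kernel's lock type flags. */
enum lock_kind { KIND_POSIX, KIND_FLOCK };

struct lock_req {
	enum lock_kind kind;
	struct lock_req *blocker;	/* models fl_next */
	struct lock_req *next_blocked;	/* models the link on blocked_list */
};

/* Models the global blocked_list used only for deadlock detection. */
static struct lock_req *blocked_head;

/* Record that "waiter" is blocked by "blocker". Only waiters blocked by a
 * POSIX lock are put on the global list, mirroring the patched behaviour. */
static void insert_block(struct lock_req *blocker, struct lock_req *waiter)
{
	assert(blocker != NULL && waiter != NULL);

	waiter->blocker = blocker;
	if (blocker->kind == KIND_POSIX) {
		waiter->next_blocked = blocked_head;
		blocked_head = waiter;
	}
}

/* Count the entries currently on the modelled blocked list. */
static int blocked_count(void)
{
	int n = 0;
	const struct lock_req *r;

	for (r = blocked_head; r != NULL; r = r->next_blocked)
		n++;
	return n;
}
```

In this model, a waiter blocked by a KIND_FLOCK request never appears on the list that the deadlock checker walks, so it cannot contribute an unchecked cycle.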
Version-Release number of selected component (if applicable):
How reproducible:
Reproducible every time with the right setup, but the right setup is
hard to get.
Steps to Reproduce:
1. Connect a SCSI drive and a USB hard drive to a server.
2. Share the SCSI drive, the USB drive, and a RAM drive with samba.
3. Connect 30 clients to the server, and make each run 3 threads of
   network stress, one to each of the three samba shares.
4. The system fails in an hour or so.
Actual results:
The system hangs, but SysRq is generally still functional. SysRq shows
that one task is stuck in posix_locks_deadlock(), while others are
waiting for the big kernel lock that is held by the task in
posix_locks_deadlock().
Expected results:
The system should continue to run without problems.
------- Additional comment by Jason Baron on 2004.10.21 14:48 -------
This patch is included in the latest beta of RHEL4.
Why are you reporting a RHEL3 bug for kernel 2.6.7-1.451.2.3?
This bug was reported against kernel-2.4.21-20.EL.
This problem was already fixed in U4 (on 22-Sep-2004 in kernel
version 2.4.21-20.10.EL). Please verify fix in current U4 beta.
An errata has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.