Bug 918944 - Deadlock in lk calls of stripe subvolume
Summary: Deadlock in lk calls of stripe subvolume
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: stripe
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-03-07 09:38 UTC by Pranith Kumar K
Modified: 2015-10-22 15:46 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-22 15:46:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Pranith Kumar K 2013-03-07 09:38:37 UTC
Description of problem:
Running the command below in a loop eventually deadlocks the setlkw calls at the bricks, because the calls arrive at different bricks in different orders.

Execute the following command:
while tests/bugs/bug-905864.t; do :; done
After an hour or so the test hangs. The statedump shows that on two stripe subvolumes the setlkw calls arrived in opposite order, which causes the deadlock.

On brick-5 here are the locks:
[xlator.features.locks.patchy-locks.inode]
path=/file1
mandatory=0
posixlk-count=2
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=2, pid = 32312, owner=c753d2d946fd5b45, transport=0x150cf30, , granted at Thu Mar  7 14:03:57 2013

posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=2, pid = 32309, owner=9a4f25833ec8c200, transport=0x150d5b0, , blocked at Thu Mar  7 14:03:57 2013

On brick7 these are the locks:
[xlator.features.locks.patchy-locks.inode]
path=/file1
mandatory=0
posixlk-count=2
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=2, pid = 32309, owner=9a4f25833ec8c200, transport=0xe82b90, , granted at Thu Mar  7 14:03:57 2013

posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=2, pid = 32312, owner=c753d2d946fd5b45, transport=0xe84480, , blocked at Thu Mar  7 14:03:57 2013

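As we can see, this is a deadlock: owner c753d2d946fd5b45 holds the lock on brick-5 and is blocked on brick7, while owner 9a4f25833ec8c200 holds the lock on brick7 and is blocked on brick-5.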
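
The circular wait in the two statedump excerpts can be checked mechanically. The following is a minimal sketch, not part of GlusterFS: it builds a wait-for graph from the lock entries quoted above and searches it for a cycle. The names DUMPS, wait_for_edges and find_cycle are hypothetical, and the sketch assumes that every BLOCKED entry on a brick conflicts with every ACTIVE entry on that brick, which holds here because all four locks are WRITE locks on the same range of /file1.

```python
import re

# Lock entries copied from the statedump excerpts above,
# trimmed to the fields this sketch uses.
DUMPS = {
    "brick-5": [
        "posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=2, pid = 32312, owner=c753d2d946fd5b45",
        "posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=2, pid = 32309, owner=9a4f25833ec8c200",
    ],
    "brick7": [
        "posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=2, pid = 32309, owner=9a4f25833ec8c200",
        "posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=2, pid = 32312, owner=c753d2d946fd5b45",
    ],
}

# Captures the lock state and the lock owner from one statedump line.
LOCK_RE = re.compile(r"\((ACTIVE|BLOCKED)\)=.*?owner=([0-9a-f]+)")


def wait_for_edges(dumps):
    """Return {waiter: set of holders} collected across all bricks.

    Assumption: on a given brick, every blocked lock conflicts with
    every active lock (same file, same range, WRITE type).
    """
    edges = {}
    for lines in dumps.values():
        holders, waiters = [], []
        for line in lines:
            match = LOCK_RE.search(line)
            if not match:
                continue
            state, owner = match.groups()
            (holders if state == "ACTIVE" else waiters).append(owner)
        for waiter in waiters:
            edges.setdefault(waiter, set()).update(holders)
    return edges


def find_cycle(edges):
    """Depth-first search; return one cycle as a list of owners, or None."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in sorted(edges.get(node, ())):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in sorted(edges):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


cycle = find_cycle(wait_for_edges(DUMPS))
print(" -> ".join(cycle) if cycle else "no cycle")
```

For the entries above, each owner waits on the other, so the search reports a two-node cycle. A real statedump would need range and type overlap checks before an edge is added; this sketch skips them on purpose.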

Version-Release number of selected component (if applicable):


How reproducible:
1/1

Steps to Reproduce:
1. As per the description
  
Actual results:
Tests hung

Expected results:
Tests should not hang.

Additional info:

Comment 1 Kaleb KEITHLEY 2015-10-22 15:46:38 UTC
Because of the large number of bugs filed against it, the mainline version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

