Red Hat Bugzilla – Bug 861308
lookup blocked while waiting for self-heal that fails due to pre-existing locks
Last modified: 2014-12-14 14:40:29 EST
Description of problem:
We replaced a server and there were, apparently, stale inode locks. Directory listings, or stat calls to affected filenames, caused the client to hang. The only way to release those calls was to force-unmount the client.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. File with a stale inode lock and pending attributes
2. lookup() the file
Actual results:
Client is locked up

Expected results:
At least an error should have been returned
Could you give us the steps to re-create the issue, i.e. how to end up with stale locks? Once there are stale locks, I understand that we will observe hangs. Getting into the stale-locks state is the important thing to re-create.
Any updates on this issue? We would like to close the bug if the steps are not available :-|
No clue how I got stale locks, but the point is that even with those locks, applications shouldn't wait in a zombie state for that lookup. If a response cannot be returned to the lookup, at least throw an error.
We came across this yesterday, too, where a user had libvirtd in zombie status until a VM image self-heal completed (more than 4 hours). Not knowing why you're apparently hung for 4 hours is unacceptable.
Self-heal performs 16 self-heals in the background per replica pair by default.
This is to prevent performance problems. You can either increase this number or set cluster.data-self-heal to "off" to prevent this from happening. (Note: this does not disable data self-heal in the self-heal daemon.)
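The tuning described above can be sketched with the gluster CLI; "myvol" is a placeholder volume name, and the option names are assumptions based on the settings mentioned in the comment:

```shell
# 'myvol' is a placeholder volume name.
# Raise the number of concurrent background self-heals per replica pair:
gluster volume set myvol cluster.background-self-heal-count 32

# Or stop the client from performing data self-heal itself
# (the self-heal daemon still heals in the background):
gluster volume set myvol cluster.data-self-heal off
```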
Could you let us know if this information is sufficient?
Why can't we just give the application its data? We've queued the background self-heal and we know which copy is good; can't we give that data to the application at that point?
If this were a write operation, I could understand the difficulty, but strace shows it stuck in a read operation.
Getting stuck in readv is new! readv does not take any locks; it goes directly to the brick and then responds with whatever data the brick returns. I need more information to figure this out. Do you know how to re-create this issue where it gets stuck in readv? Do you happen to have a statedump of the mount and bricks from when this happened?
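A statedump like the one requested above can be collected roughly as follows; "myvol" and the mount path are placeholders, and the dump directory may vary by version:

```shell
# 'myvol' is a placeholder volume name.
# Dump the state of all bricks (typically written under /var/run/gluster):
gluster volume statedump myvol

# For the client side, send SIGUSR1 to the fuse mount process; it writes
# its statedump to the same directory. The mount path is a placeholder.
kill -USR1 "$(pgrep -f 'glusterfs.*/mnt/gluster')"
```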
Readv can trigger a self-heal but that does not block the readv fop.
I don't know how the stale locks happened. The lock was there long before I was able to recognize there was an issue. I don't know how the locks are produced or cleared under normal circumstances so I can't even speculate.
Comment 6 was, I think, a bit of a red herring. I responded to comment 4 without carefully going back through the bug and putting it in the correct context. Comment 4 doesn't actually make any sense with respect to the original bug report which was about lookup() and inode locks.
Should lookup() be blocked if there is an inode lock on the target file? Since that block even affects a common ls of the directory, I wouldn't have expected it to. (By common I mean the default ls of most distros which includes color or decoration which requires a stat call on the file triggering that lookup.)
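The stat-per-entry behavior of a decorated ls described above can be observed with strace; /mnt/gluster is a placeholder mount point and strace must be installed:

```shell
# /mnt/gluster is a placeholder mount point.
# A colorized ls issues a stat/lstat per directory entry, and each stat
# triggers a lookup() on the client:
strace -e trace=stat,lstat,statx ls --color=always /mnt/gluster
```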
I wouldn't have expected that behavior. What's the harm in responding to a lookup with one good replica? lookup isn't a write operation if I'm understanding the purpose of that function call correctly.
Again, I don't know how to create stale inode locks (or even active ones, for that matter), so I have no idea how to reproduce this.
By "stale", I mean that there were no applications using that volume and it was only mounted on one client. After I cleared the inode locks, the system worked normally.
We are observing a new bug which fits this description (at least the readdir hang) in the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=959212#c8 has precise steps to see the issue (re-creatable 100% of the time). Could you check and let us know if what you observed and bug 959212 are possible duplicates?
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.
If there has been no update before 9 December 2014, this bug will get automatically closed.