Bug 802885
Summary: | when nfs server is restarted, reclaim locks held by write operations on a file from nfs mount. | ||||||
---|---|---|---|---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Shwetha Panduranga <shwetha.h.panduranga> | ||||
Component: | nfs | Assignee: | Vinayaga Raman <vraman> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Saurabh <saujain> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | mainline | CC: | gluster-bugs, mzywusko, rfortier, rwheeler, saujain, vbellur | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | glusterfs-3.4.0 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2013-07-24 17:26:48 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | 3.3.0qa35 | Category: | --- | ||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 817967 | ||||||
*** Bug 802767 has been marked as a duplicate of this bug. ***

Tests executed:
1. gnfs restart while locks are held; the unlock completed.
2. Client restart while locks are held; a fresh lock request succeeded.
3. Server reboot while locks are held; the unlock completed after the system came back up.

CHANGE: http://review.gluster.com/3096 (nlm: send sm-notify to clients whenever the nfs server is restarted so that clients reclaim the locks.) merged in master by Vijay Bellur (vijay)
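
For readers unfamiliar with lock reclaim, the behaviour the change is aiming for can be sketched with a toy model. This is not GlusterFS code: `ToyLockServer`, `ToyClient`, and the `notify` flag are hypothetical names used only for illustration, and the real NLM/NSM exchange involves more state than a dictionary.

```python
class ToyLockServer:
    """Toy in-memory lock table (hypothetical; not the GlusterFS nlm code)."""

    def __init__(self):
        self.locks = set()  # (client_name, path) pairs; lost on restart

    def lock(self, client_name, path):
        self.locks.add((client_name, path))

    def unlock(self, client_name, path):
        # After a restart without reclaim the server has no record of the
        # lock, so the unlock cannot be matched to anything.
        if (client_name, path) not in self.locks:
            return False
        self.locks.remove((client_name, path))
        return True

    def restart(self, clients, notify):
        self.locks = set()  # in-memory state does not survive the restart
        if notify:
            # Stand-in for the restart notification: each client is told
            # the server restarted and re-requests the locks it holds.
            for client in clients:
                client.on_server_restart(self)


class ToyClient:
    def __init__(self, name):
        self.name = name
        self.held = set()  # paths this client believes it has locked

    def lock(self, server, path):
        server.lock(self.name, path)
        self.held.add(path)

    def unlock(self, server, path):
        self.held.discard(path)
        return server.unlock(self.name, path)

    def on_server_restart(self, server):
        for path in self.held:
            server.lock(self.name, path)  # reclaim


def unlock_after_restart(notify):
    server, client = ToyLockServer(), ToyClient("client-1")
    client.lock(server, "file1")
    server.restart([client], notify=notify)
    return client.unlock(server, "file1")
```

In this model `unlock_after_restart(notify=False)` fails because the lock table is empty after the restart, while `unlock_after_restart(notify=True)` succeeds because the client re-established its lock first.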
Created attachment 569729 [details]
nfs server log

Description of problem:
When the volume is restarted, the nfs server restarts, and the nlm server currently does not reclaim the locks that applications held on files before the restart. As a result, an unlock does not find a corresponding lock and fails with the errors:

[2012-03-13 23:09:43.879988] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880096] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume

Version-Release number of selected component (if applicable):
3.3.0qa27

How reproducible:
often

Steps to Reproduce:
1. Create a distribute-replicate volume (2 x 3). Start the volume.
2. Create nfs mounts from the client.
3. Start "locktest -n 500 -f file1".
4. Bring down one brick.
5. Bring the brick back up while locktest is still in progress.

Actual results:
[2012-03-13 23:00:33.369591] I [client-handshake.c:1533:select_server_supported_programs] 0-dstore1-client-3: Using Program GlusterFS 3.3.0qa27, Num (1298437), Version (330)
[2012-03-13 23:00:33.370580] I [client-handshake.c:1308:client_setvolume_cbk] 0-dstore1-client-3: clnt-lk-version = 1, server-lk-version = 0
[2012-03-13 23:00:33.370620] I [client-handshake.c:1334:client_setvolume_cbk] 0-dstore1-client-3: Connected to 192.168.2.35:24010, attached to remote volume '/export2/dstore1'.
[2012-03-13 23:01:27.189309] I [afr-common.c:1313:afr_launch_self_heal] 0-dstore1-replicate-1: background data self-heal triggered. path: <gfid:00000000-0000-0000-0000-000000000000>, reason: lookup detected pending operations
[2012-03-13 23:09:19.731434] I [afr-self-heal-algorithm.c:131:sh_loop_driver_done] 0-dstore1-replicate-1: diff self-heal on <gfid:00000000-0000-0000-0000-000000000000>: completed. (1 blocks of 81920 were different (0.00%))
[2012-03-13 23:09:19.890119] I [afr-self-heal-common.c:2037:afr_self_heal_completion_cbk] 0-dstore1-replicate-1: background data self-heal completed on <gfid:00000000-0000-0000-0000-000000000000>
[2012-03-13 23:09:43.879988] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880096] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880252] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880305] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880597] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880653] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880789] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880854] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880992] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.881054] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.881184] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.881237] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
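
When checking whether a given nfs server log shows this failure, a small script can count the unlock errors rather than reading the log by eye. The following is a minimal sketch that assumes the log line layout seen in the excerpt above (timestamp, one-letter level, `[file:line:function]`, domain, message); the function name `count_nlm_unlock_errors` is hypothetical.

```python
import re
from collections import Counter

# Pattern for the log line layout seen in this report, e.g.
# [2012-03-13 23:09:43.879988] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: msg
LOG_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)"
)


def count_nlm_unlock_errors(lines):
    """Count error-level messages logged by nlm4_unlock_resume, by message."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # not a log line in the assumed layout
        if match.group("level") == "E" and match.group("func") == "nlm4_unlock_resume":
            counts[match.group("msg")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_nlm_errors.py < nfs.log
    for message, count in count_nlm_unlock_errors(sys.stdin).items():
        print(f"{count:6d}  {message}")
```

A non-empty result on a log captured after the restart indicates that unlock requests arrived for locks the server no longer knew about.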