Bug 802885 - when nfs server is restarted, reclaim locks held by write operations on a file from nfs mount.
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Vinayaga Raman
QA Contact: Saurabh
Duplicates: 802767
Depends On:
Blocks: 817967
Reported: 2012-03-13 12:54 EDT by Shwetha Panduranga
Modified: 2016-01-19 01:10 EST

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 13:26:48 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: 3.3.0qa35
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
nfs server log (349.19 KB, text/x-log)
2012-03-13 12:54 EDT, Shwetha Panduranga

Description Shwetha Panduranga 2012-03-13 12:54:37 EDT
Created attachment 569729
nfs server log

Description of problem:
When the volume is restarted, the NFS server restarts, and the NLM server currently does not reclaim the locks that applications held on files before the restart. As a result, a later unlock request finds no corresponding lock and fails with the following errors:

[2012-03-13 23:09:43.879988] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880096] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume

Version-Release number of selected component (if applicable):
3.3.0qa27

How reproducible:
often

Steps to Reproduce:
1. Create a distribute-replicate volume (2 x 3) and start it.
2. Create NFS mounts from the client.
3. Start "locktest -n 500 -f file1".
4. Bring down one brick.
5. Bring the brick back up while locktest is still in progress.
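
For readers without the locktest tool, the kind of load step 3 generates can be approximated with a short script. This is only a sketch: it assumes that "-n 500" means 500 lock/unlock iterations and that "-f file1" names the target file, neither of which is confirmed in this report, and the function name lock_unlock_loop is made up for illustration. On an NFSv3 mount under Linux, POSIX locks taken this way are forwarded to the server over NLM.

```python
import fcntl
import os
import tempfile


def lock_unlock_loop(path, iterations):
    """Take and release an exclusive whole-file lock `iterations` times.

    Hypothetical stand-in for locktest; returns the number of completed cycles.
    """
    completed = 0
    with open(path, "r+b") as f:
        for _ in range(iterations):
            fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until the lock is granted
            f.seek(0)
            f.write(b"x")                  # write while holding the lock
            f.flush()
            fcntl.lockf(f, fcntl.LOCK_UN)  # the unlock request that the server must match
            completed += 1
    return completed


if __name__ == "__main__":
    # A local temporary file stands in for file1 on the NFS mount.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0")
    try:
        print(lock_unlock_loop(tmp.name, 500))
    finally:
        os.unlink(tmp.name)
```

To exercise the bug, the path would point at a file on the NFS mount and the brick would be taken down and brought back while the loop is running.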

Actual results:
[2012-03-13 23:00:33.369591] I [client-handshake.c:1533:select_server_supported_programs] 0-dstore1-client-3: Using Program GlusterFS 3.3.0qa27, Num (1298437), Version (330)
[2012-03-13 23:00:33.370580] I [client-handshake.c:1308:client_setvolume_cbk] 0-dstore1-client-3: clnt-lk-version = 1, server-lk-version = 0
[2012-03-13 23:00:33.370620] I [client-handshake.c:1334:client_setvolume_cbk] 0-dstore1-client-3: Connected to 192.168.2.35:24010, attached to remote volume '/export2/dstore1'.
[2012-03-13 23:01:27.189309] I [afr-common.c:1313:afr_launch_self_heal] 0-dstore1-replicate-1: background  data self-heal triggered. path: <gfid:00000000-0000-0000-0000-000000000000>, reason: lookup detected pending operations
[2012-03-13 23:09:19.731434] I [afr-self-heal-algorithm.c:131:sh_loop_driver_done] 0-dstore1-replicate-1: diff self-heal on <gfid:00000000-0000-0000-0000-000000000000>: completed. (1 blocks of 81920 were different (0.00%))
[2012-03-13 23:09:19.890119] I [afr-self-heal-common.c:2037:afr_self_heal_completion_cbk] 0-dstore1-replicate-1: background  data self-heal completed on <gfid:00000000-0000-0000-0000-000000000000>
[2012-03-13 23:09:43.879988] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880096] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880252] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880305] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880597] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880653] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880789] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.880854] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.880992] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.881054] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
[2012-03-13 23:09:43.881184] E [nlm4.c:1595:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2012-03-13 23:09:43.881237] E [nlm4.c:1607:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume
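
When going through the attached log, a small parser can help separate the NLM errors from the informational messages. The sketch below assumes every line follows the layout seen above ([timestamp] level [file:line:function] domain: message); the regular expression and the count_errors helper are illustrative and not part of GlusterFS.

```python
import re
from collections import Counter

# Layout inferred from the log lines quoted in this report.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def count_errors(lines):
    """Count error-level ("E") entries, keyed by (function, message)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "E":
            counts[(m.group("func"), m.group("msg"))] += 1
    return counts


# Three lines copied from the output above.
SAMPLE = [
    "[2012-03-13 23:00:33.370580] I [client-handshake.c:1308:client_setvolume_cbk] "
    "0-dstore1-client-3: clnt-lk-version = 1, server-lk-version = 0",
    "[2012-03-13 23:09:43.879988] E [nlm4.c:1595:nlm4_unlock_resume] "
    "0-nfs-NLM: nlm_get_uniq() returned NULL",
    "[2012-03-13 23:09:43.880096] E [nlm4.c:1607:nlm4_unlock_resume] "
    "0-nfs-NLM: unable to unlock_fd_resume",
]

if __name__ == "__main__":
    for (func, msg), n in count_errors(SAMPLE).items():
        print(n, func, msg)
```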
Comment 1 Krishna Srinivas 2012-04-03 04:00:05 EDT
*** Bug 802767 has been marked as a duplicate of this bug. ***
Comment 2 Saurabh 2012-04-16 08:12:15 EDT
Tests executed:
1. Restarted gnfs while locks were held; the unlock went through.
2. Restarted the client while locks were held; a fresh lock request was granted.
3. Rebooted the server while locks were held; the unlock went through after the system came back up.
Comment 3 Anand Avati 2012-04-17 11:26:16 EDT
CHANGE: http://review.gluster.com/3096 (nlm: send sm-notify to clients whenever the nfs server is restarted so that clients reclaim the locks.) merged in master by Vijay Bellur (vijay@gluster.com)
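
The idea behind the change can be illustrated with a toy model. The following is a sketch, not the GlusterFS implementation: ToyLockServer, ToyClient, and demo are invented names, and the model assumes only what the commit message states, namely that the server notifies its clients after a restart and the clients then reclaim their locks.

```python
class ToyLockServer:
    """In-memory stand-in for an NLM server (hypothetical, not GlusterFS code)."""

    def __init__(self):
        self.locks = set()     # (client, path) pairs; volatile, lost on restart
        self.clients = set()   # clients to notify; assumed to survive a restart

    def lock(self, client, path):
        self.clients.add(client)
        self.locks.add((client, path))

    def unlock(self, client, path):
        # False models the failure in this bug: no matching lock was found.
        if (client, path) in self.locks:
            self.locks.remove((client, path))
            return True
        return False

    def restart(self, notify):
        self.locks = set()     # lock state does not survive the restart
        if notify:
            for client in list(self.clients):
                client.on_server_restart(self)


class ToyClient:
    def __init__(self, name):
        self.name = name
        self.held = set()      # paths this client believes it has locked

    def lock(self, server, path):
        server.lock(self, path)
        self.held.add(path)

    def unlock(self, server, path):
        self.held.discard(path)
        return server.unlock(self, path)

    def on_server_restart(self, server):
        # Reclaim: re-request every lock held before the restart.
        for path in self.held:
            server.lock(self, path)


def demo(notify):
    """Lock, restart the server, then unlock; returns whether the unlock matched."""
    server, client = ToyLockServer(), ToyClient("client-1")
    client.lock(server, "file1")
    server.restart(notify=notify)
    return client.unlock(server, "file1")


if __name__ == "__main__":
    print("without notify:", demo(False))
    print("with notify:", demo(True))
```

Without the notification the unlock after the restart finds nothing to release, which mirrors the errors in the description; with it, the reclaimed lock is found.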
