Bug 803306 - when nfs server fails to lookup root, it disables the volume
Summary: when nfs server fails to lookup root, it disables the volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Assignee: rjoseph
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 852569
Reported: 2012-03-14 12:08 UTC by Shwetha Panduranga
Modified: 2013-09-01 03:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 852569 (view as bug list)
Environment:
Last Closed: 2013-09-01 03:31:01 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
nfs server log (744.57 KB, text/x-log)
2012-03-14 12:08 UTC, Shwetha Panduranga

Description Shwetha Panduranga 2012-03-14 12:08:55 UTC
Created attachment 569968
nfs server log

Description of problem:
When a volume is restarted, the NFS server is restarted as well and performs a lookup on the root of the volume. If that lookup fails, the NFS server disables the volume and never retries the lookup. As a result, write operations from the NFS mount hang.

Version-Release number of selected component (if applicable):
3.3.0qa27

How reproducible:
often

Steps to Reproduce:
1. Create a replicate volume (1 x 3).
2. Create FUSE and NFS mounts from a client.
3. Perform write operations from both mount points.
4. Bring down a brick.
5. Bring the brick back up.

The lookup on the root inode might fail because of AFR bug 800755.
If the lookup fails, the NFS server disables the volume and any operation on the NFS mount hangs.
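
The failure mode described above can be sketched as a small model. This is an illustrative Python sketch, not GlusterFS code: `ExportedVolume`, `start_subvol`, `handle_request`, and `flaky_lookup` are hypothetical names, and the `retry_on_disabled` path is only one possible mitigation, not necessarily how the bug was fixed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExportedVolume:
    """Toy stand-in for a volume exported by the NFS server (hypothetical)."""
    name: str
    enabled: bool = False


def start_subvol(vol: ExportedVolume, lookup_root: Callable[[], bool]) -> None:
    """Behaviour reported here: a single root lookup at startup, no retry."""
    vol.enabled = lookup_root()


def handle_request(vol: ExportedVolume,
                   lookup_root: Callable[[], bool],
                   retry_on_disabled: bool = False) -> str:
    """Serve one request; optionally re-attempt the root lookup first."""
    if not vol.enabled and retry_on_disabled:
        # Assumed mitigation: retry the root lookup instead of
        # leaving the volume disabled forever.
        vol.enabled = lookup_root()
    if not vol.enabled:
        return f"Volume is disabled: {vol.name}"
    return "OK"


def flaky_lookup(failures: int) -> Callable[[], bool]:
    """Return a lookup callable that fails `failures` times, then succeeds."""
    remaining = [failures]

    def lookup() -> bool:
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return lookup


if __name__ == "__main__":
    lookup = flaky_lookup(1)          # the first lookup fails, as in this bug
    vol = ExportedVolume("dstore1")
    start_subvol(vol, lookup)
    print(handle_request(vol, lookup))                          # no retry
    print(handle_request(vol, lookup, retry_on_disabled=True))  # with retry
```

Without the retry, every request after the failed startup lookup is rejected, which matches the repeated "Volume is disabled" errors in the log. With the retry, the volume recovers as soon as a later lookup succeeds.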

Actual results:
[2012-03-14 22:01:52.375659] I [client-handshake.c:1334:client_setvolume_cbk] 0-dstore1-client-0: Connected to 192.168.2.35:24009, attached to remote volume '/export1/dstore1'.
[2012-03-14 22:01:52.377465] I [afr-common.c:3484:afr_notify] 0-dstore1-replicate-0: Subvolume 'dstore1-client-0' came back up; going online.
[2012-03-14 22:01:52.379781] W [client.c:1992:client_rpc_notify] 0-dstore1-client-1: Cancelling the grace timer
[2012-03-14 22:01:52.379958] I [client-handshake.c:1533:select_server_supported_programs] 0-dstore1-client-1: Using Program GlusterFS 3.3.0qa27, Num (1298437), Version (330)
[2012-03-14 22:01:52.380454] I [client-handshake.c:1308:client_setvolume_cbk] 0-dstore1-client-1: clnt-lk-version = 1, server-lk-version = 0
[2012-03-14 22:01:52.380494] I [client-handshake.c:1334:client_setvolume_cbk] 0-dstore1-client-1: Connected to 192.168.2.36:24009, attached to remote volume '/export1/dstore1'.
[2012-03-14 22:01:53.384601] W [client.c:1992:client_rpc_notify] 0-dstore1-client-2: Cancelling the grace timer
[2012-03-14 22:01:53.384978] I [client-handshake.c:1533:select_server_supported_programs] 0-dstore1-client-2: Using Program GlusterFS 3.3.0qa27, Num (1298437), Version (330)
[2012-03-14 22:01:53.385493] I [client-handshake.c:1308:client_setvolume_cbk] 0-dstore1-client-2: clnt-lk-version = 1, server-lk-version = 0
[2012-03-14 22:01:53.385542] I [client-handshake.c:1334:client_setvolume_cbk] 0-dstore1-client-2: Connected to 192.168.2.37:24009, attached to remote volume '/export1/dstore1'.
[2012-03-14 22:01:53.386526] I [afr-common.c:1850:afr_set_root_inode_on_first_lookup] 0-dstore1-replicate-0: added root inode
[2012-03-14 22:01:53.386892] C [nfs.c:257:nfs_start_subvol_lookup_cbk] 0-nfs: Failed to lookup root: Input/output error
[2012-03-14 22:02:49.810301] E [nfs3.c:5029:nfs3_commit] 0-nfs-nfsv3: Volume is disabled: dstore1
[2012-03-14 22:02:49.810394] W [rpcsvc.c:524:rpcsvc_handle_rpc_call] 0-rpcsvc: failed to queue error reply
[2012-03-14 22:04:49.810687] E [nfs3.c:5029:nfs3_commit] 0-nfs-nfsv3: Volume is disabled: dstore1
[2012-03-14 22:04:49.810798] W [rpcsvc.c:524:rpcsvc_handle_rpc_call] 0-rpcsvc: failed to queue error reply
[2012-03-14 22:05:49.810161] E [nfs3.c:5029:nfs3_commit] 0-nfs-nfsv3: Volume is disabled: dstore1
[2012-03-14 22:05:49.810255] W [rpcsvc.c:524:rpcsvc_handle_rpc_call] 0-rpcsvc: failed to queue error reply
[2012-03-14 22:07:49.810159] E [nfs3.c:5029:nfs3_commit] 0-nfs-nfsv3: Volume is disabled: dstore1
[2012-03-14 22:07:49.810266] W [rpcsvc.c:524:rpcsvc_handle_rpc_call] 0-rpcsvc: failed to queue error reply
[2012-03-14 22:08:49.810302] E [nfs3.c:5029:nfs3_commit] 0-nfs-nfsv3: Volume is disabled: dstore1
[2012-03-14 22:08:49.810406] W [rpcsvc.c:524:rpcsvc_handle_rpc_call] 0-rpcsvc: failed to queue error reply
[2012-03-14 22:10:49.811087] E [nfs3.c:5029:nfs3_commit] 0-nfs-nfsv3: Volume is disabled: dstore1
[2012-03-14 22:10:49.811186] W [rpcsvc.c:524:rpcsvc_handle_rpc_call] 0-rpcsvc: failed to queue error reply
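
To check an NFS server log for this pattern, a short script can find the first root lookup failure and count the "Volume is disabled" errors that follow. The sketch below is a hypothetical helper, not part of GlusterFS. It assumes the line layout seen in the excerpt above (`[timestamp] LEVEL [file:line:function] name: message`); `LOG_LINE` and `summarize` are names chosen for illustration.

```python
import re

# Assumed layout, taken from the excerpt above:
# [timestamp] LEVEL [file:line:function] name: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^\]]+)\] (?P<name>\S+): (?P<msg>.*)$"
)


def summarize(lines):
    """Return (timestamp of first root lookup failure, {volume: error count})."""
    root_failure = None
    disabled = {}
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match["msg"]
        if root_failure is None and msg.startswith("Failed to lookup root"):
            root_failure = match["ts"]
        elif msg.startswith("Volume is disabled: "):
            volume = msg.split(": ", 1)[1]
            disabled[volume] = disabled.get(volume, 0) + 1
    return root_failure, disabled


if __name__ == "__main__":
    import sys

    # Usage: python summarize_nfs_log.py /path/to/nfs.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        failure_ts, counts = summarize(handle)
    print("first root lookup failure:", failure_ts)
    for volume, count in sorted(counts.items()):
        print(f"{volume}: {count} 'Volume is disabled' errors")
```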

