Bug 1220270 - nfs-ganesha: Rename fails while executing Cthon general category test
Summary: nfs-ganesha: Rename fails while executing Cthon general category test
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: ganesha-nfs
Version: 3.7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Soumya Koduri
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-05-11 07:27 UTC by Saurabh
Modified: 2017-03-08 10:49 UTC (History)
1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-08 10:49:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Saurabh 2015-05-11 07:27:08 UTC
Description of problem:
I was executing the general category tests of cthon, and the run failed while trying to rename a file. The error thrown is "Remote I/O error".

Error as shown,
mv: cannot move `tbl.new' to `tbl.time': Remote I/O error
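For reference, "Remote I/O error" is the strerror text for EREMOTEIO, and the "Stale file handle" warnings in the gfapi.log excerpt below are ESTALE. A quick Python check (assuming a Linux client, where EREMOTEIO is 121 and ESTALE is 116) maps the messages back to their errno values:

```python
import errno
import os

# The error mv reports ("Remote I/O error") corresponds to EREMOTEIO.
print(errno.EREMOTEIO, os.strerror(errno.EREMOTEIO))

# The repeated warnings in gfapi.log ("Stale file handle") correspond to ESTALE,
# which nfs-ganesha/gfapi can surface when an inode refresh by handle fails.
print(errno.ESTALE, os.strerror(errno.ESTALE))
```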

Version-Release number of selected component (if applicable):
glusterfs-3.7.0beta1-0.69.git1a32479.el6.x86_64
nfs-ganesha-2.2.0-0.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. create a 6x2 distributed-replicate volume and start it
2. create the volume used by nfs-ganesha, called gluster_shared_storage
3. set up nfs-ganesha as required
4. on a client, execute the cthon general category test
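The steps above can be sketched with the following commands (a command sketch only; brick paths and the elided brick lists are illustrative, not the exact ones used in this report):

```
# 1. Create a 6x2 distributed-replicate volume and start it.
gluster volume create vol2 replica 2 <host>:/rhs/brick1/d1r1 ... # 12 bricks
gluster volume start vol2

# 2. Create and start the shared-storage volume used by nfs-ganesha.
gluster volume create gluster_shared_storage replica 2 ...
gluster volume start gluster_shared_storage

# 3. Enable nfs-ganesha cluster-wide and export the test volume.
gluster nfs-ganesha enable
gluster volume set vol2 ganesha.enable on

# 4. On a client, mount over NFSv3 and run the cthon general tests.
mount -t nfs -o vers=3 10.70.37.148:/vol2 /mnt
cd cthon04 && ./server -g -o vers=3 -p /vol2 -m /mnt -N 1 10.70.37.148
```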

Actual results:
[root@rhsauto005 cthon04]# time ./server -g -o vers=3 -p /vol2 -m /mnt -N 1 10.70.37.148
Start tests on path /mnt/rhsauto005.test [y/n]? y

sh ./runtests  -g  /mnt/rhsauto005.test

GENERAL TESTS: directory /mnt/rhsauto005.test
if test ! -x runtests; then chmod a+x runtests; fi
cd /mnt/rhsauto005.test; rm -f Makefile runtests runtests.wrk *.sh *.c mkdummy rmdummy nroff.in makefile.tst
cp Makefile runtests runtests.wrk *.sh *.c mkdummy rmdummy nroff.in makefile.tst /mnt/rhsauto005.test

Small Compile
	0.0 (0.0) real	0.0 (0.0) user	0.0 (0.0) sys

Tbl
mv: cannot move `tbl.new' to `tbl.time': Remote I/O error
general tests failed
Tests failed, leaving /mnt mounted

logs from gfapi.log,
/tmp/gfapi.log	
[2015-05-11 07:17:55.173792] W [glfs-handleops.c:1166:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 0a2c9004-03c8-4239-afb8-1a27858e60c4 failed: Stale file handle
[2015-05-11 07:17:55.176435] W [glfs-handleops.c:1166:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of 0a2c9004-03c8-4239-afb8-1a27858e60c4 failed: Stale file handle
[2015-05-11 07:17:55.188553] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-4: remote operation failed: Stale file handle
[2015-05-11 07:17:55.191752] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-2: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-05-11 07:17:55.209885] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-7: remote operation failed: Stale file handle
[2015-05-11 07:17:55.227805] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-4: remote operation failed: Stale file handle
[2015-05-11 07:17:55.244600] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-8: remote operation failed: Stale file handle
[2015-05-11 07:17:55.247906] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-4: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-05-11 07:17:55.262054] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-8: remote operation failed: Stale file handle
[2015-05-11 07:17:55.292461] W [client-rpc-fops.c:1092:client3_3_getxattr_cbk] 0-vol2-client-2: remote operation failed: Operation not permitted. Path: /rhsauto005.test/testdir/SBAR (ec7e8bfa-665e-4654-9719-6c8198831943). Key: user.nfsv4_acls
[2015-05-11 07:17:55.294499] W [client-rpc-fops.c:1092:client3_3_getxattr_cbk] 0-vol2-client-3: remote operation failed: Operation not permitted. Path: /rhsauto005.test/testdir/SBAR (ec7e8bfa-665e-4654-9719-6c8198831943). Key: user.nfsv4_acls
[2015-05-11 07:17:55.313504] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-8: remote operation failed: Stale file handle
[2015-05-11 07:17:55.331982] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-3: remote operation failed: Stale file handle
[2015-05-11 07:17:55.335560] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-1: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-05-11 07:17:55.350887] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-8: remote operation failed: Stale file handle
[2015-05-11 07:17:55.369383] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-4: remote operation failed: Stale file handle
[2015-05-11 07:17:55.386598] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-2: remote operation failed: Stale file handle
[2015-05-11 07:17:55.403053] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-8: remote operation failed: Stale file handle
[2015-05-11 07:17:55.429110] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-2: remote operation failed: Stale file handle
[2015-05-11 07:17:55.447778] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-6: remote operation failed: Stale file handle
[2015-05-11 07:17:55.465159] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-0: remote operation failed: Stale file handle
[2015-05-11 07:17:55.467930] W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol2-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-05-11 07:17:55.481460] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-0: remote operation failed: Stale file handle
[2015-05-11 07:17:55.498392] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-0: remote operation failed: Stale file handle
[2015-05-11 07:17:56.266901] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-6: remote operation failed: Stale file handle
[2015-05-11 07:17:56.390500] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-6: remote operation failed: Stale file handle
[2015-05-11 07:17:56.522664] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-7: remote operation failed: Stale file handle
[2015-05-11 07:17:56.658674] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-6: remote operation failed: Stale file handle
[2015-05-11 07:17:56.746753] W [client-rpc-fops.c:506:client3_3_stat_cbk] 0-vol2-client-8: remote operation failed: Stale file handle
[2015-05-11 07:17:57.040175] I [dht-rename.c:1340:dht_rename] 0-vol2-dht: renaming /rhsauto005.test/tbl.new (hash=vol2-replicate-4/cache=vol2-replicate-4) => /rhsauto005.test/tbl.time (hash=vol2-replicate-1/cache=vol2-replicate-1)
[2015-05-11 07:17:57.058942] W [client-rpc-fops.c:2826:client3_3_lookup_cbk] 0-vol2-client-2: remote operation failed: No such file or directory. Path: <gfid:421a5ac3-c425-4c5d-85ad-1c0274065891> (421a5ac3-c425-4c5d-85ad-1c0274065891)
[2015-05-11 07:17:57.058965] W [client-rpc-fops.c:2826:client3_3_lookup_cbk] 0-vol2-client-3: remote operation failed: No such file or directory. Path: <gfid:421a5ac3-c425-4c5d-85ad-1c0274065891> (421a5ac3-c425-4c5d-85ad-1c0274065891)


Expected results:
The test is supposed to pass; rename should not fail.

Additional info:

[root@nfs1 ~]# gluster volume status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1-share   49155     0          Y       17882
Brick 10.70.37.77:/rhs/brick1/d1r2-share    49155     0          Y       5416 
Brick 10.70.37.76:/rhs/brick1/d2r1-share    49155     0          Y       20946
Brick 10.70.37.69:/rhs/brick1/d2r2-share    49155     0          Y       19806
Brick 10.70.37.148:/rhs/brick1/d3r1-share   49156     0          Y       17899
Brick 10.70.37.77:/rhs/brick1/d3r2-share    49156     0          Y       5433 
Brick 10.70.37.76:/rhs/brick1/d4r1-share    49156     0          Y       20963
Brick 10.70.37.69:/rhs/brick1/d4r2-share    49156     0          Y       19823
Brick 10.70.37.148:/rhs/brick1/d5r1-share   49157     0          Y       17916
Brick 10.70.37.77:/rhs/brick1/d5r2-share    49157     0          Y       5450 
Brick 10.70.37.76:/rhs/brick1/d6r1-share    49157     0          Y       20980
Brick 10.70.37.69:/rhs/brick1/d6r2-share    49157     0          Y       19840
Self-heal Daemon on localhost               N/A       N/A        Y       7758 
Self-heal Daemon on 10.70.37.76             N/A       N/A        Y       26654
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       27866
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       10132
 
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.148:/rhs/brick1/d1r1         49152     0          Y       7705 
Brick 10.70.37.77:/rhs/brick1/d1r2          49152     0          Y       27812
Brick 10.70.37.76:/rhs/brick1/d2r1          49152     0          Y       26599
Brick 10.70.37.69:/rhs/brick1/d2r2          49152     0          Y       10080
Brick 10.70.37.148:/rhs/brick1/d3r1         49153     0          Y       7722 
Brick 10.70.37.77:/rhs/brick1/d3r2          49153     0          Y       27829
Brick 10.70.37.76:/rhs/brick1/d4r1          49153     0          Y       26616
Brick 10.70.37.69:/rhs/brick1/d4r2          49153     0          Y       10097
Brick 10.70.37.148:/rhs/brick1/d5r1         49154     0          Y       7739 
Brick 10.70.37.77:/rhs/brick1/d5r2          49154     0          Y       27846
Brick 10.70.37.76:/rhs/brick1/d6r1          49154     0          Y       26633
Brick 10.70.37.69:/rhs/brick1/d6r2          49154     0          Y       10114
Self-heal Daemon on localhost               N/A       N/A        Y       7758 
Self-heal Daemon on 10.70.37.76             N/A       N/A        Y       26654
Self-heal Daemon on 10.70.37.77             N/A       N/A        Y       27866
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       10132
 
Task Status of Volume vol2
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@nfs1 ~]# gluster volume info
 
Volume Name: gluster_shared_storage
Type: Distributed-Replicate
Volume ID: 15f496d2-65ad-48b7-9cc4-1b17a47525ed
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.37.148:/rhs/brick1/d1r1-share
Brick2: 10.70.37.77:/rhs/brick1/d1r2-share
Brick3: 10.70.37.76:/rhs/brick1/d2r1-share
Brick4: 10.70.37.69:/rhs/brick1/d2r2-share
Brick5: 10.70.37.148:/rhs/brick1/d3r1-share
Brick6: 10.70.37.77:/rhs/brick1/d3r2-share
Brick7: 10.70.37.76:/rhs/brick1/d4r1-share
Brick8: 10.70.37.69:/rhs/brick1/d4r2-share
Brick9: 10.70.37.148:/rhs/brick1/d5r1-share
Brick10: 10.70.37.77:/rhs/brick1/d5r2-share
Brick11: 10.70.37.76:/rhs/brick1/d6r1-share
Brick12: 10.70.37.69:/rhs/brick1/d6r2-share
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
nfs-ganesha: enable
 
Volume Name: vol2
Type: Distributed-Replicate
Volume ID: 043bdf3e-7af3-423c-98c9-a505ff2b5557
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.37.148:/rhs/brick1/d1r1
Brick2: 10.70.37.77:/rhs/brick1/d1r2
Brick3: 10.70.37.76:/rhs/brick1/d2r1
Brick4: 10.70.37.69:/rhs/brick1/d2r2
Brick5: 10.70.37.148:/rhs/brick1/d3r1
Brick6: 10.70.37.77:/rhs/brick1/d3r2
Brick7: 10.70.37.76:/rhs/brick1/d4r1
Brick8: 10.70.37.69:/rhs/brick1/d4r2
Brick9: 10.70.37.148:/rhs/brick1/d5r1
Brick10: 10.70.37.77:/rhs/brick1/d5r2
Brick11: 10.70.37.76:/rhs/brick1/d6r1
Brick12: 10.70.37.69:/rhs/brick1/d6r2
Options Reconfigured:
ganesha.enable: on
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable


[root@nfs1 ~]# ps -eaf | grep nfs
root      8057     1  0 11:38 ?        00:00:00 /usr/sbin/glusterfs --volfile-server=nfs1 --volfile-id=/gluster_shared_storage /var/run/gluster/shared_storage
root      8170     1 13 11:40 ?        00:07:15 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
root     31827 25794  0 12:34 pts/0    00:00:00 grep nfs
[root@nfs1 ~]# showmount -e localhost
Export list for localhost:
/vol2 (everyone)

Comment 2 Kaushal 2017-03-08 10:49:39 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
