Bug 1209725 - nfs-ganesha: stale file handle(ESTALE)
Summary: nfs-ganesha: stale file handle(ESTALE)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: nfs-ganesha
Classification: Retired
Component: FSAL_GLUSTER
Version: 2.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Jiffin
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1214162
 
Reported: 2015-04-08 05:18 UTC by Saurabh
Modified: 2023-09-14 02:57 UTC
CC: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-10 05:00:39 UTC
Embargoed:



Description Saurabh 2015-04-08 05:18:17 UTC
Description of problem:
I was executing the LTP test suite over NFS with version=3.
When LTP was about to finish, a Stale file handle (ESTALE) error was reported.


Version-Release number of selected component (if applicable):
glusterfs-3.7dev-0.770.git2035599.el6.x86_64
nfs-ganesha-2.2-0.rc7.el6.x86_64

How reproducible:
Seen on this build.

Steps to Reproduce:
1. Create a 6x2 Gluster volume.
2. Set up nfs-ganesha to export the volume.
3. Mount the volume with version=3.
4. Execute the LTP test suite (a command sketch of these steps follows).
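
For reference, a minimal command sketch of this setup, assuming hypothetical hostnames (gfs1, gfs2), brick paths and mount point; the real environment used the 10.70.36.x hosts shown in the volume status under Additional info:

# Sketch only: create a 6x2 (distribute x replica-2) volume and start it.
gluster volume create vol0 replica 2 \
    gfs1:/rhs/brick1/d1r1 gfs2:/rhs/brick1/d1r2 \
    gfs1:/rhs/brick1/d2r1 gfs2:/rhs/brick1/d2r2 \
    gfs1:/rhs/brick1/d3r1 gfs2:/rhs/brick1/d3r2 \
    gfs1:/rhs/brick1/d4r1 gfs2:/rhs/brick1/d4r2 \
    gfs1:/rhs/brick1/d5r1 gfs2:/rhs/brick1/d5r2 \
    gfs1:/rhs/brick1/d6r1 gfs2:/rhs/brick1/d6r2
gluster volume start vol0

# With nfs-ganesha already exporting vol0 via FSAL_GLUSTER, mount the export
# over NFSv3 and run the LTP suite from inside the mount:
mount -t nfs -o vers=3 gfs1:/vol0 /mnt/vol0
cd /mnt/vol0
# ... run the LTP test suite (the QA wrapper under /opt/qa/tools in this report) ...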

Actual results:
Logs from gfapi.log:

[2015-04-07 06:29:23.463499] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.464522] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.468391] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.469390] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.471851] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.472736] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.473703] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
[2015-04-07 06:29:23.474572] W [glfs-handleops.c:1163:pub_glfs_h_create_from_handle] 0-meta-autoload: inode refresh of b287800c-cf61-4d3c-9fc6-f3d39e64eeee failed: Stale file handle
The message "W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol0-replicate-1: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)" repeated 1423 times between [2015-04-07 06:28:38.112601] and [2015-04-07 06:29:19.633391]
The message "W [MSGID: 108008] [afr-read-txn.c:237:afr_read_txn] 0-vol0-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)" repeated 1508 times between [2015-04-07 06:28:38.141915] and [2015-04-07 06:29:19.641146]


Expected results:
No ESTALE errors are expected.

Additional info:
# gluster volume status
Status of volume: vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.36.45:/rhs/brick1/d1r1          49152     0          Y       17354
Brick 10.70.36.47:/rhs/brick1/d1r1          49152     0          Y       28220
Brick 10.70.36.45:/rhs/brick1/d2r1          49153     0          Y       17374
Brick 10.70.36.47:/rhs/brick1/d2r2          49153     0          Y       28240
Self-heal Daemon on localhost               N/A       N/A        Y       17397
Quota Daemon on localhost                   N/A       N/A        Y       17402
Self-heal Daemon on 10.70.36.47             N/A       N/A        Y       28263
Quota Daemon on 10.70.36.47                 N/A       N/A        Y       28268
 
Task Status of Volume vol0
------------------------------------------------------------------------------
There are no active volume tasks

Comment 1 Niels de Vos 2015-04-22 10:23:55 UTC
Jiffin, is this an nfs-ganesha bug, or a glusterfs one?

If it is an nfs-ganesha bug, replace the blocker with bug 1214162.
If this is a glusterfs bug, then please move it to the GlusterFS product and the right component.

Thanks!

Comment 2 Jiffin 2015-04-24 06:20:29 UTC
I ran the LTP test suite successfully and got the same log messages in gfapi.log for both NFSv3 and NFSv4 mounts.

These messages are generated because of how the test works.

In the test,

first a folder `ltp` is created and the script cds into it.

One of the tests, fsstress, deletes that folder, and after all the tests have completed, the harness tries to delete it again.

The same issue can be reproduced on a native FUSE mount.
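
For illustration, here is a minimal shell sketch of the sequence described above, not the actual LTP harness; the mount point /mnt/vol0 is a placeholder. Whether the final step fails with ESTALE (as in the output below) or simply finds nothing to remove depends on how the NFS client caches the directory handle.

# Sketch only: the work directory is deleted while a session still uses it
# as its cwd, and the harness later tries to remove it again.
cd /mnt/vol0
mkdir ltp && cd ltp          # harness creates the work dir and cds into it
# ... LTP tests run here; fsstress ends up deleting the 'ltp' directory ...
cd ..
rm -rf ltp                   # cleanup against a handle that may no longer
                             # resolve on the server
# rm: cannot remove 'ltp': Stale file handle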

The following is the result of the LTP test suite:

executing ltp
start ltp tests:11:39:44
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//fs_perms/fs_perms_simpletest.sh

real	0m0.087s
user	0m0.014s
sys	0m0.050s
1
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//lftest/lftest

real	1m48.675s
user	0m0.029s
sys	0m5.383s
2
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//stream/stream01

real	0m0.006s
user	0m0.000s
sys	0m0.002s
3
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//stream/stream02

real	0m0.004s
user	0m0.000s
sys	0m0.003s
4
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//stream/stream03

real	0m0.004s
user	0m0.001s
sys	0m0.003s
5
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//stream/stream04

real	0m0.002s
user	0m0.001s
sys	0m0.001s
6
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//stream/stream05

real	0m0.002s
user	0m0.000s
sys	0m0.002s
7
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//openfile/openfile

real	0m0.007s
user	0m0.003s
sys	0m0.003s
8
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//inode/inode01

real	0m0.027s
user	0m0.001s
sys	0m0.006s
9
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//inode/inode02

real	0m0.087s
user	0m0.032s
sys	0m0.209s
10
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest01

real	0m5.374s
user	0m0.066s
sys	0m13.838s
11
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest02

real	0m0.420s
user	0m0.002s
sys	0m1.130s
12
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest03

real	0m0.097s
user	0m0.091s
sys	0m0.138s
13
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest04

real	0m0.059s
user	0m0.043s
sys	0m0.041s
14
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest05

real	0m5.338s
user	0m0.051s
sys	0m13.874s
15
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest06

real	0m0.420s
user	0m0.000s
sys	0m1.067s
16
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest07

real	0m4.680s
user	0m0.079s
sys	0m14.342s
17
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//ftest/ftest08

real	0m3.468s
user	0m0.030s
sys	0m10.131s
18
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//fsstress/fsstress

real	1m1.667s
user	0m0.219s
sys	0m4.124s
19
Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//fs_inod/fs_inod

real	4m35.266s
user	0m6.406s
sys	0m23.017s
20
end ltp tests: 11:47:29
total 20 tests were successful out of 20 tests
rm: cannot remove ‘ltp’: Stale file handle
1
Total 1 tests were successful
Switching over to the previous working directory

Comment 3 Red Hat Bugzilla 2023-09-14 02:57:41 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

