Bug 821138 - Look up of files/Dir with no gfid, is unable to trigger self-heal
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: fuse
Version: pre-release
Hardware: x86_64 Linux
Priority: low
Severity: high
Assigned To: Raghavendra G
Keywords: Triaged
Depends On:
Blocks: 848347
 
Reported: 2012-05-12 08:17 EDT by Ujjwala
Modified: 2013-08-19 20:09 EDT

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 848347
Environment:
Last Closed: 2013-07-24 13:46:43 EDT
Type: Bug


Attachments
Fuse mount log (752.60 KB, text/x-log), 2012-05-12 08:17 EDT, Ujjwala
Brick log (271.61 KB, text/x-log), 2012-05-12 08:46 EDT, Ujjwala

Description Ujjwala 2012-05-12 08:17:08 EDT
Created attachment 583988
Fuse mount log

Description of problem:

Looking up a file or directory that has no gfid on the mount point does not trigger self-heal.

Version-Release number of selected component (if applicable):
3.3.0 qa41

How reproducible:
Always. Reproduced in 3 out of 3 attempts.

Steps to Reproduce:
1. Create a 1x2 replicate volume and mount it over CIFS.
2. On the backend of the first brick, create a file named file2.
3. On the mount point, do 'ls -lh file1'
Note: The behavior is the same on a FUSE mount.
The FUSE mount log file is attached.
  
Actual results:
[root@gqac003 dis-rep_cifs]# ls -lh file2
ls: cannot access file2: No such file or directory


Expected results:
The lookup should trigger self-heal, and the self-heal should complete.

Additional info:

[2012-05-12 16:57:49.372300] E [afr-common.c:1859:afr_lookup_done] 3-dis-rep-replicate-2: /file2: No gfid present
[2012-05-12 16:57:49.372391] W [fuse-resolve.c:89:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000000/file2: failed to resolve (No data available)
[2012-05-12 16:57:49.408001] W [socket.c:195:__socket_rwv] 3-dis-rep-client-4: readv failed (Connection reset by peer)
[2012-05-12 16:57:49.408034] W [socket.c:1512:__socket_proto_state_machine] 3-dis-rep-client-4: reading from socket failed. Error (Connection reset by peer), peer (10.16.157.0:24026)
[2012-05-12 16:57:49.408114] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x13c) [0x7feee8d59c4e] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7feee8d5916d] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7feee8d58bb3]))) 3-dis-rep-client-4: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-05-12 16:57:49.372676 (xid=0x144000x)
[2012-05-12 16:57:49.408136] W [client3_1-fops.c:2629:client3_1_lookup_cbk] 3-dis-rep-client-4: remote operation failed: Transport endpoint is not connected. Path: /file2 (00000000-0000-0000-0000-000000000000)
[2012-05-12 16:57:49.410933] I [socket.c:2315:socket_submit_request] 3-dis-rep-client-4: not connected (priv->connected = 0)
[2012-05-12 16:57:49.410958] W [rpc-clnt.c:1498:rpc_clnt_submit] 3-dis-rep-client-4: failed to submit rpc-request (XID: 0x144001x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (dis-rep-client-4)
[2012-05-12 16:57:49.410974] W [client3_1-fops.c:2629:client3_1_lookup_cbk] 3-dis-rep-client-4: remote operation failed: Transport endpoint is not connected. Path: /file2 (00000000-0000-0000-0000-000000000000)
[2012-05-12 16:57:49.411033] I [client.c:2090:client_rpc_notify] 3-dis-rep-client-4: disconnected
[2012-05-12 16:57:49.411135] E [socket.c:1715:socket_connect_finish] 3-dis-rep-client-4: connection to 10.16.157.0:24026 failed (Connection refused)
[2012-05-12 17:23:23.625315] E [afr-common.c:1859:afr_lookup_done] 3-dis-rep-replicate-1: /file3: No gfid present
[2012-05-12 17:23:23.625363] W [fuse-resolve.c:89:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000000/file3: failed to resolve (No data available)
[2012-05-12 17:23:23.688003] W [socket.c:1512:__socket_proto_state_machine] 3-dis-rep-client-2: reading from socket failed. Error (Transport endpoint is not connected), peer (10.16.157.0:24017)
[2012-05-12 17:23:23.688134] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x13c) [0x7feee8d59c4e] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7feee8d5916d] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7feee8d58bb3]))) 3-dis-rep-client-2: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-05-12 17:23:23.625770 (xid=0x138714x)
[2012-05-12 17:23:23.688158] W [client3_1-fops.c:2629:client3_1_lookup_cbk] 3-dis-rep-client-2: remote operation failed: Transport endpoint is not connected. Path: /file3 (00000000-0000-0000-0000-000000000000)
[2012-05-12 17:23:23.690774] I [socket.c:2315:socket_submit_request] 3-dis-rep-client-2: not connected (priv->connected = 0)
[2012-05-12 17:23:23.690797] W [rpc-clnt.c:1498:rpc_clnt_submit] 3-dis-rep-client-2: failed to submit rpc-request (XID: 0x138715x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (dis-rep-client-2)
[2012-05-12 17:23:23.690821] W [client3_1-fops.c:2629:client3_1_lookup_cbk] 3-dis-rep-client-2: remote operation failed: Transport endpoint is not connected. Path: /file3 (00000000-0000-0000-0000-000000000000)
[2012-05-12 17:23:23.690920] I [client.c:2090:client_rpc_notify] 3-dis-rep-client-2: disconnected
[2012-05-12 17:23:23.690983] E [socket.c:1715:socket_connect_finish] 3-dis-rep-client-2: connection to 10.16.157.0:24017 failed (Connection refused)
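
The "No gfid present" errors are easy to lose among the connection messages in the log above. The following is a minimal sketch, not part of the original report, that pulls the affected paths out of a client log. The function name missing_gfid_paths is hypothetical, and the pattern assumes the message format seen in the lines quoted above.

```shell
#!/bin/sh
# List the paths for which AFR logged "No gfid present" in a client log.
# Reads log text on stdin and prints each affected path once.
# (Function name is hypothetical; the pattern assumes the log format
# shown in this report.)
missing_gfid_paths() {
    # Matched line shape:
    #   ... E [afr-common.c:1859:afr_lookup_done] <xlator>: <path>: No gfid present
    sed -n 's/^.*afr_lookup_done] [^ ]*: \(.*\): No gfid present$/\1/p' | sort -u
}

# Usage example with three of the log lines quoted in this report.
missing_gfid_paths <<'EOF'
[2012-05-12 16:57:49.372300] E [afr-common.c:1859:afr_lookup_done] 3-dis-rep-replicate-2: /file2: No gfid present
[2012-05-12 16:57:49.372391] W [fuse-resolve.c:89:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000000/file2: failed to resolve (No data available)
[2012-05-12 17:23:23.625315] E [afr-common.c:1859:afr_lookup_done] 3-dis-rep-replicate-1: /file3: No gfid present
EOF
```

Against a real log, the function would be fed the mount log file on stdin instead of the inline sample.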
Comment 1 Ujjwala 2012-05-12 08:20:30 EDT
Sorry, I mentioned different file names in steps 2 and 3. The corrected steps are:

Steps to Reproduce:
1. Create a 1x2 replicate volume and mount it over CIFS.
2. On the backend of the first brick, create a file named file2.
3. On the mount point, do 'ls -lh file2'
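
The corrected steps can be sketched as a script. This is an illustrative outline only, not taken from the report: the volume name, host name, brick paths, and mount point are hypothetical placeholders, and it uses a FUSE mount rather than CIFS, since the description notes that the behavior is the same on both. By default it only prints the commands (DRY_RUN=1), so it can be reviewed before anything is run against a test setup.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Every name below is a hypothetical
# placeholder; adjust for the test setup.
VOL=${VOL:-testvol}             # volume name (assumed)
HOST=${HOST:-server1}           # server hosting both bricks (assumed)
BRICK1=${BRICK1:-/bricks/b1}    # first brick directory (assumed)
BRICK2=${BRICK2:-/bricks/b2}    # second brick directory (assumed)
MNT=${MNT:-/mnt/testvol}        # client mount point (assumed)
DRY_RUN=${DRY_RUN:-1}           # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Step 1: create and start a 1x2 replicate volume, then mount it.
    run gluster volume create "$VOL" replica 2 "$HOST:$BRICK1" "$HOST:$BRICK2"
    run gluster volume start "$VOL"
    run mount -t glusterfs "$HOST:/$VOL" "$MNT"
    # Step 2: create the file directly on the first brick, bypassing the
    # mount, so that it has no gfid.
    run touch "$BRICK1/file2"
    # Step 3: look the file up through the mount point.
    run ls -lh "$MNT/file2"
}

reproduce
```

Setting DRY_RUN=0 executes the commands instead of printing them; that requires root and a disposable GlusterFS installation.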
Comment 2 Ujjwala 2012-05-12 08:46:16 EDT
When the file is looked up on the mount point, the brick on which the file was created crashes, but no core file is generated.
The brick log is attached.
Comment 3 Ujjwala 2012-05-12 08:46:55 EDT
Created attachment 583994
Brick log
Comment 4 Pranith Kumar K 2012-05-14 02:48:09 EDT
The issue occurs even without AFR in the picture.
If patch 27fb213be6101bca859502ac87dddc4cd0a6f272 is reverted, it works fine.
Assigning the bug to du.
Comment 5 Pranith Kumar K 2012-11-09 00:54:41 EST
[root@pranithk-laptop ~]# cd /mnt/r2
[root@pranithk-laptop r2]# touch /gfs/r2_0/file1
[root@pranithk-laptop r2]# ls -l file1
-rw-r--r-- 1 root root 0 Nov  9 11:24 file1
[root@pranithk-laptop r2]# ls -l /gfs/r2_?
/gfs/r2_0:
total 4
-rw-r--r-- 2 root root 0 Nov  9 11:24 file1

/gfs/r2_1:
total 4
-rw-r--r-- 2 root root 0 Nov  9 11:24 file1

[root@pranithk-laptop r2]# getfattr -d -m . -e hex /gfs/r2_?/file1
getfattr: Removing leading '/' from absolute path names
# file: gfs/r2_0/file1
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.gfid=0x20c07a217aeb4f32a813310b7518b63d

# file: gfs/r2_1/file1
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.gfid=0x20c07a217aeb4f32a813310b7518b63d

Test case works fine.
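
The check made by eye in the getfattr output above (every brick copy has a trusted.gfid, and the values match) can be sketched as a small filter. This is an illustrative helper, not part of the original comment: the function name gfid_consistent is hypothetical, and it assumes the getfattr dump format shown above.

```shell
#!/bin/sh
# Read `getfattr -d -m . -e hex` output on stdin and succeed only if every
# "# file:" stanza has a trusted.gfid and all values are identical.
# (Function name is hypothetical; the input format is the one shown above.)
gfid_consistent() {
    awk '
        /^# file: /       { files++ }
        /^trusted\.gfid=/ { gfids++; seen[$0] = 1 }
        END {
            distinct = 0
            for (g in seen) distinct++
            # Succeed only if at least one file was listed, the number of
            # gfid lines equals the number of files, and one value was seen.
            exit !(files > 0 && gfids == files && distinct == 1)
        }
    '
}

# Usage example with the values from the output above.
gfid_consistent <<'EOF' && echo "gfid present and identical on all bricks"
# file: gfs/r2_0/file1
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.gfid=0x20c07a217aeb4f32a813310b7518b63d

# file: gfs/r2_1/file1
trusted.afr.r2-client-0=0x000000000000000000000000
trusted.afr.r2-client-1=0x000000000000000000000000
trusted.gfid=0x20c07a217aeb4f32a813310b7518b63d
EOF
```

On a live setup, the getfattr command from the session above would be piped into the function instead of the inline sample.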
