Bug 849128

Summary: [glusterfs-3.3.0qa18] - Unable to remove the files from mountpoint after rdma sanity run
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Vidya Sakar <vinaraya>
Component: rdma Assignee: Raghavendra G <rgowdapp>
Status: CLOSED CURRENTRELEASE QA Contact: shylesh <shmohan>
Severity: medium Docs Contact:
Priority: low    
Version: 2.0 CC: aavati, gluster-bugs, rwheeler, sdharane, surs, vagarwal, vbhat
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 770833
: 858451 Environment:
Last Closed: 2015-02-13 09:49:20 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 770833    
Bug Blocks: 858451    

Description Vidya Sakar 2012-08-17 11:53:58 UTC
+++ This bug was initially created as a clone of Bug #770833 +++

Description of problem:
I'm unable to remove the files created by the rdma sanity run from the mountpoint. When I try to remove them, I get the following errors.

[root@client4 mnt]# rm -rf run1994/
rm: cannot remove `run1994/p8/d0/d3': Directory not empty
rm: cannot remove `run1994/p15': Directory not empty
rm: cannot remove `run1994/p13/d2': Directory not empty
rm: cannot remove `run1994/pc/d0/d2': Directory not empty
rm: cannot remove `run1994/pa/d1': Directory not empty
rm: cannot remove `run1994/p2': Directory not empty
[root@client4 mnt]#

When I try to remove any file inside those directories, I get the following error.

[root@client4 d2]# ls
f6
[root@client4 d2]# rm f6
rm: remove regular file `f6'? y
rm: cannot remove `f6': No such file or directory
[root@client4 d2]# ls
f6
[root@client4 d2]#

Even though the file f6 is listed by ls, rm reports "No such file or directory".
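
The symptom above (a name that the directory listing returns but that cannot be resolved afterwards) can be checked for mechanically. The following is a minimal sketch and not part of the original report: the function name find_ghost_entries is hypothetical, and it only assumes that the affected names fail lstat with ENOENT in the same way that rm failed.

```python
import os


def find_ghost_entries(directory):
    """Return names that listdir reports but lstat cannot resolve (ENOENT)."""
    ghosts = []
    for name in sorted(os.listdir(directory)):
        try:
            # lstat does not follow symlinks, so a dangling symlink is not
            # mistaken for a ghost entry.
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            ghosts.append(name)
    return ghosts
```

On a healthy directory the function returns an empty list. Running it against the d2 directory shown above (the absolute path of the mountpoint is not given in this report) would presumably report f6, provided the inconsistent state still holds.
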
Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa18

Steps to Reproduce:
1. Create a 2-way distribute volume with rdma transport.
2. Run the rdma sanity tests on the mountpoint.
3. Try to remove the files the run created.
  
Actual results:
Unable to remove the files.

Expected results:
Files should be deleted when rm -rf is issued on directory.

Additional info:

I see the following errors in the client log.


[2011-12-29 02:50:13.819765] C [client-handshake.c:121:rpc_client_ping_timer_expired] 0-hosdu-client-0: server 10.1.10.21:24009 has not responded in the last 42 seconds, disconnecting.
[2011-12-29 02:50:46.219610] E [rpc-clnt.c:380:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x186) [0x7f1de08665d5] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x1c5) [0x7f1de08654d6] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x45) [0x7f1de0864c0e]))) 0-hosdu-client-0: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2011-12-29 02:49:30.816174
[2011-12-29 02:50:46.219662] W [client-handshake.c:265:client_ping_cbk] 0-hosdu-client-0: timer must have expired
[2011-12-29 02:50:46.271731] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x2758286x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2011-12-29 02:50:46.271796] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run1994/sbench.707
[2011-12-29 02:50:46.271800] I [client.c:1885:client_rpc_notify] 0-hosdu-client-0: disconnected
[2011-12-29 02:50:46.271833] W [fuse-bridge.c:745:fuse_fd_cbk] 0-glusterfs-fuse: 4696521: OPEN() /run1994/sbench.707 => -1 (Transport endpoint is not connected)
[2011-12-29 02:50:46.276205] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x2758287x Program: GlusterFS 3.1, ProgVers: 310, Proc: 1) to rpc-transport (hosdu-client-0)
[2011-12-29 02:50:46.276234] W [client3_1-fops.c:418:client3_1_stat_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected
[2011-12-29 02:50:46.337293] E [client-handshake.c:1166:client_query_portmap_cbk] 0-hosdu-client-0: failed to get the port number for remote subvolume
[2011-12-29 02:50:46.362511] I [rpc-clnt.c:1597:rpc_clnt_reconfig] 0-hosdu-client-0: changing port to 24009 (from 0)
[2011-12-29 02:50:46.395674] I [client-handshake.c:1085:select_server_supported_programs] 0-hosdu-client-0: Using Program GlusterFS 3.3.0qa18, Num (1298437), Version (310)
[2011-12-29 02:50:46.399241] I [client-handshake.c:917:client_setvolume_cbk] 0-hosdu-client-0: Connected to 10.1.10.21:24009, attached to remote volume '/tmp/brick'.
[2011-12-29 04:04:56.975141] I [dict.c:336:dict_get] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/client.so(client3_1_readdirp_cbk+0x4aa) [0x7f1ddca0a597] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_readdirp_cbk+0xb2) [0x7f1ddc7a83ed] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_is_subvol_empty+0x111) [0x7f1ddc7a7c77]))) 0-dict: !this || key=trusted.glusterfs.dht.linkto
[2011-12-29 04:05:25.252896] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:05:25.252949] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4795433: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
[2011-12-29 04:06:22.880149] I [dict.c:336:dict_get] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/client.so(client3_1_readdirp_cbk+0x4aa) [0x7f1ddca0a597] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_readdirp_cbk+0xb2) [0x7f1ddc7a83ed] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_is_subvol_empty+0x111) [0x7f1ddc7a7c77]))) 0-dict: !this || key=trusted.glusterfs.dht.linkto
[2011-12-29 04:06:54.943810] I [dict.c:336:dict_get] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/client.so(client3_1_readdirp_cbk+0x4aa) [0x7f1ddca0a597] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_readdirp_cbk+0xb2) [0x7f1ddc7a83ed] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_is_subvol_empty+0x111) [0x7f1ddc7a7c77]))) 0-dict: !this || key=trusted.glusterfs.dht.linkto
[2011-12-29 04:06:54.958242] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:06:54.958295] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4892512: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
[2011-12-29 04:06:54.965630] I [dict.c:336:dict_get] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/client.so(client3_1_readdirp_cbk+0x4aa) [0x7f1ddca0a597] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_readdirp_cbk+0xb2) [0x7f1ddc7a83ed] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_is_subvol_empty+0x111) [0x7f1ddc7a7c77]))) 0-dict: !this || key=trusted.glusterfs.dht.linkto
[2011-12-29 04:07:57.326074] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:07:57.326135] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4892677: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
[2011-12-29 04:13:39.803810] I [dict.c:336:dict_get] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/client.so(client3_1_readdirp_cbk+0x4aa) [0x7f1ddca0a597] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_readdirp_cbk+0xb2) [0x7f1ddc7a83ed] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_is_subvol_empty+0x111) [0x7f1ddc7a7c77]))) 0-dict: !this || key=trusted.glusterfs.dht.linkto
[2011-12-29 04:13:39.813554] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:13:39.813613] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4892767: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
[2011-12-29 04:13:39.821256] I [dict.c:336:dict_get] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/client.so(client3_1_readdirp_cbk+0x4aa) [0x7f1ddca0a597] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_readdirp_cbk+0xb2) [0x7f1ddc7a83ed] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/cluster/distribute.so(dht_rmdir_is_subvol_empty+0x111) [0x7f1ddc7a7c77]))) 0-dict: !this || key=trusted.glusterfs.dht.linkto
[2011-12-29 04:14:53.625024] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:14:53.625091] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4892830: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
[2011-12-29 04:15:05.313091] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:15:05.313146] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4892843: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
[2011-12-29 04:15:16.787065] W [client3_1-fops.c:509:client3_1_unlink_cbk] 0-hosdu-client-0: remote operation failed: No such file or directory
[2011-12-29 04:15:16.787114] W [fuse-bridge.c:1068:fuse_unlink_cbk] 0-glusterfs-fuse: 4892856: UNLINK() /run1994/pc/d0/d2/f6 => -1 (No such file or directory)
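
The repeated failures in the client log above can be tallied with a short script. This is a rough sketch under stated assumptions: the regular expressions are inferred only from the log lines quoted in this report (they are not taken from any GlusterFS specification), and the name summarize_fuse_failures is hypothetical.

```python
import re
from collections import Counter

# Header layout as seen in the excerpt: [timestamp] LEVEL [file:line:function] message
HEADER = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] (?P<msg>.*)$"
)
# FUSE failure layout as seen in the excerpt: OP() /path => -1 (error text)
FUSE_FAIL = re.compile(r"(?P<op>[A-Z]+)\(\) (?P<path>\S+) => -1 \((?P<err>[^)]*)\)")


def summarize_fuse_failures(lines):
    """Count failed FUSE operations, keyed by (operation, path, error text)."""
    counts = Counter()
    for raw in lines:
        header = HEADER.match(raw.strip())
        if header is None or header.group("file") != "fuse-bridge.c":
            continue
        failure = FUSE_FAIL.search(header.group("msg"))
        if failure:
            counts[(failure.group("op"), failure.group("path"), failure.group("err"))] += 1
    return counts
```

Passing the lines of a saved client log to summarize_fuse_failures gives one counter entry per distinct failure, which makes it easy to see that the UNLINK failures all hit the same path.
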


The following are the entries in the brick log:


[2011-12-29 01:00:33.126312] E [server.c:142:server_submit_reply] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/features/marker.so(marker_rename_cbk+0x9dd) [0x7f6dee7f34cc] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/debug/io-stats.so(io_stats_rename_cbk+0x37b) [0x7f6dee5c8f13] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(server_rename_cbk+0x480) [0x7f6dee38bcab]))) 0-: Reply submission failed
[2011-12-29 01:00:33.126321] I [server-helpers.c:485:do_fd_cleanup] 0-hosdu-server: fd cleanup on /run1994/pe
[2011-12-29 01:00:33.126383] W [inode.c:1031:__inode_path] 0-/tmp/brick/inode: no dentry for non-root inode : ea925285-c79d-43d4-8b2f-7d4b25d3c883
[2011-12-29 01:00:33.126407] I [server-helpers.c:491:do_fd_cleanup] 0-hosdu-server: fd cleanup on inode with gfid ea925285-c79d-43d4-8b2f-7d4b25d3c883
[2011-12-29 01:00:33.126433] I [server.c:426:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.24:1020
[2011-12-29 01:00:33.126475] I [server-helpers.c:775:server_connection_destroy] 0-hosdu-server: destroyed connection of client4-28285-2011/12/28-23:56:21:897002-hosdu-client-1
[2011-12-29 01:00:33.275829] E [server-helpers.c:864:server_alloc_frame] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x5a6) [0x7f6df2d45705] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(server_mkdir+0x15a) [0x7f6dee39e7cb] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(get_frame_from_request+0xf4) [0x7f6dee38338e]))) 0-server: invalid argument: conn
[2011-12-29 01:00:33.275888] W [rpcsvc.c:1093:rpcsvc_error_reply] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x19f) [0x7f6df2d4f582] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_notify+0x200) [0x7f6df2d45ccb] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x5df) [0x7f6df2d4573e]))) 0-: sending a RPC error reply
[2011-12-29 01:00:33.279293] I [server-handshake.c:540:server_setvolume] 0-hosdu-server: accepted client from 10.1.10.24:1013 (version: 3.3.0qa18)
[2011-12-29 01:00:33.322992] E [rdma.c:4528:gf_rdma_event_handler] 0-rpc-transport/rdma: rdma.hosdu-server: pollin received on tcp socket (peer: 10.1.10.24:1013) after handshake is complete
[2011-12-29 01:00:33.323360] I [server.c:426:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.24:1013
[2011-12-29 01:00:33.323422] I [server-helpers.c:775:server_connection_destroy] 0-hosdu-server: destroyed connection of client4-28285-2011/12/28-23:56:21:897002-hosdu-client-1
[2011-12-29 01:00:36.663992] E [server-helpers.c:864:server_alloc_frame] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x5a6) [0x7f6df2d45705] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(server_stat+0x115) [0x7f6dee399361] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(get_frame_from_request+0xf4) [0x7f6dee38338e]))) 0-server: invalid argument: conn
[2011-12-29 01:00:36.664049] W [rpcsvc.c:1093:rpcsvc_error_reply] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x19f) [0x7f6df2d4f582] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_notify+0x200) [0x7f6df2d45ccb] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x5df) [0x7f6df2d4573e]))) 0-: sending a RPC error reply
[2011-12-29 01:00:36.667394] I [server-handshake.c:540:server_setvolume] 0-hosdu-server: accepted client from 10.1.10.24:1004 (version: 3.3.0qa18)
[2011-12-29 01:00:36.699056] I [server3_1-fops.c:649:server_opendir_cbk] 0-hosdu-server: 803152: OPENDIR /run1994/p9/d4 (--) ==> -1 (No such file or directory)
[2011-12-29 01:00:36.700490] I [server3_1-fops.c:649:server_opendir_cbk] 0-hosdu-server: 803153: OPENDIR /run1994/p9/d4 (--) ==> -1 (No such file or directory)
[2011-12-29 01:00:36.868929] E [rdma.c:4528:gf_rdma_event_handler] 0-rpc-transport/rdma: rdma.hosdu-server: pollin received on tcp socket (peer: 10.1.10.24:1004) after handshake is complete
[2011-12-29 01:00:36.869305] I [server-helpers.c:485:do_fd_cleanup] 0-hosdu-server: fd cleanup on /run1994/p11
[2011-12-29 01:00:36.869344] I [server-helpers.c:485:do_fd_cleanup] 0-hosdu-server: fd cleanup on /run1994/p8/d0/d3
[2011-12-29 01:00:36.869377] I [server-helpers.c:485:do_fd_cleanup] 0-hosdu-server: fd cleanup on /run1994/p1/f1
[2011-12-29 01:00:36.869406] I [server.c:426:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.24:1004
[2011-12-29 01:00:36.869792] E [rpcsvc.c:1060:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x803661x, Program: GlusterFS 3.3.0qa18, ProgVers: 310, Proc: 40) to rpc-transport (rdma.hosdu-server)
[2011-12-29 01:00:36.869867] E [server.c:142:server_submit_reply] (-->/usr/local/lib/libglusterfs.so.0(default_readdirp_cbk+0x1b0) [0x7f6df2f91f0f] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/debug/io-stats.so(io_stats_readdirp_cbk+0x475) [0x7f6dee5c7ce4] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(server_readdirp_cbk+0x2a9) [0x7f6dee390098]))) 0-: Reply submission failed
[2011-12-29 01:00:36.869908] I [server-helpers.c:775:server_connection_destroy] 0-hosdu-server: destroyed connection of client4-28285-2011/12/28-23:56:21:897002-hosdu-client-1
[2011-12-29 01:00:40.649185] E [server-helpers.c:864:server_alloc_frame] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x5a6) [0x7f6df2d45705] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(server_lookup+0x258) [0x7f6dee3a131d] (-->/usr/local/lib/glusterfs/3.3.0qa18/xlator/protocol/server.so(get_frame_from_request+0xf4) [0x7f6dee38338e]))) 0-server: invalid argument: conn
[2011-12-29 01:00:40.649244] W [rpcsvc.c:1093:rpcsvc_error_reply] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x19f) [0x7f6df2d4f582] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_notify+0x200) [0x7f6df2d45ccb] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x5df) [0x7f6df2d4573e]))) 0-: sending a RPC error reply
[2011-12-29 01:00:40.656170] I [server-handshake.c:540:server_setvolume] 0-hosdu-server: accepted client from 10.1.10.24:997 (version: 3.3.0qa18)
[2011-12-29 01:36:34.911305] E [rdma.c:4528:gf_rdma_event_handler] 0-rpc-transport/rdma: rdma.hosdu-server: pollin received on tcp socket (peer: 10.1.10.24:997) after handshake is complete
[2011-12-29 01:36:34.912457] I [server.c:426:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.24:997
[2011-12-29 01:36:34.912534] I [server-helpers.c:775:server_connection_destroy] 0-hosdu-server: destroyed connection of client4-28285-2011/12/28-23:56:21:897002-hosdu-client-1
[2011-12-29 01:40:17.192916] I [server-handshake.c:540:server_setvolume] 0-hosdu-server: accepted client from 10.1.10.24:1014 (version: 3.3.0qa18)
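
The brick log above shows the same client being accepted and disconnected several times in quick succession. One way to quantify this is to pair each "accepted client from" line with the next "disconnected connection from" line for the same peer address. The sketch below is illustrative only: the pattern is inferred from the lines quoted in this report, and the name connection_lifetimes is hypothetical.

```python
import re
from datetime import datetime

# Matches the two informational events seen in the excerpt and captures the peer.
EVENT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] I \[[^\]]+\] \S+ "
    r"(?P<kind>accepted client|disconnected connection) from (?P<peer>\S+)"
)


def connection_lifetimes(lines):
    """Return (peer, seconds) pairs for each accept/disconnect pair found."""
    opened = {}
    lifetimes = []
    for raw in lines:
        event = EVENT.match(raw.strip())
        if event is None:
            continue
        when = datetime.strptime(event.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        peer = event.group("peer")
        if event.group("kind") == "accepted client":
            opened[peer] = when
        else:
            # Disconnects whose accept is outside the excerpt are skipped.
            started = opened.pop(peer, None)
            if started is not None:
                lifetimes.append((peer, (when - started).total_seconds()))
    return lifetimes
```

Very short lifetimes for the same client host would be consistent with the connection flapping visible in the excerpt, although this says nothing by itself about the root cause.
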


I have archived all the logs.

--- Additional comment from amarts on 2012-02-27 05:35:47 EST ---

This is a priority for the immediate future (before the 3.3.0 GA release). Will bump the priority up once we take up RDMA-related tasks.

Comment 2 Amar Tumballi 2012-08-23 06:44:57 UTC
This bug is not seen in the current master branch (which will be branched as RHS 2.1.0 soon). Before considering it for a fix, we want to make sure the bug still exists on RHS servers. If it cannot be reproduced, we would like to close it.

Comment 4 Sachidananda Urs 2013-08-08 05:44:51 UTC
Moving out of Big Bend since RDMA support is not available in Big Bend (2.1).