Bug 763792 (GLUSTER-2060)

Summary: Infiniteband errors on large file find
Product: [Community] GlusterFS Reporter: Allen Lu <allen>
Component: ib-verbsAssignee: Raghavendra G <raghavendra>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: low Docs Contact:
Priority: medium    
Version: 3.1.0CC: anush, gluster-bugs, mrobbert, rahulcs, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTNR Mount Type: fuse
Documentation: DNR CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
Client logs from test case
none
Logs from brick1
none
Logs from brick2
none
Logs from brick3
none
Logs from brick4 none

Description Mike Robbert 2010-11-05 19:08:15 UTC
The initial report was based on a directory with over 1 million files which I realize is a little out of the ordinary, but we will need to handle that. In the mean time a more reasonable case to work with is an 'ls -R' on the source directory for OpenMPI, probably hundreds of files spread out among many subdirectories. I noticed that I was having some problems so I restarted all the servers and 1 client with log-level=DEBUG. I then started the 'ls -R' and it is still running after about 50 minutes of running. Every couple of minutes it outputs a handful of files. I'll attach log files shortly, but to summarize there is lots of log errors on the client, but only a handful of disconnect and reconnect messages on the servers.

The setup is that all servers are running on a build from git on or shortly before Oct 28 and all clients are running the binary from RPM 3.1.1qa1 released around the same time. There are 4 servers and the volume is distributed and using rdma. The backend storage is DDN 9550 controllers with disks in an 8+2 RAID.

Comment 1 Mike Robbert 2010-11-05 19:21:49 UTC
Created attachment 368

Comment 2 Mike Robbert 2010-11-05 19:24:54 UTC
Created attachment 369

Comment 3 Mike Robbert 2010-11-05 19:26:17 UTC
Created attachment 370

Comment 4 Mike Robbert 2010-11-05 19:26:39 UTC
Created attachment 371

Comment 5 Mike Robbert 2010-11-05 19:26:57 UTC
Created attachment 372

Comment 6 Mike Robbert 2010-11-05 19:47:47 UTC
One final comment before I leave for the weekend. At first I thought that the errors I was seeing may have been due to the fact that the directory I was looking at had been untouched on the servers since I was running 3.0.5 (which by the way ran quite well on the same hardware) and I've done lots of changes and tests to the server code and maybe there was some kind of hash change since then. so, on a whim I decided to try to untar a similar directory, openmpi-1.4.2.tar.bz2. I started the process and after a couple of files it appeared to hang for a minute or so and similar messages showed up in the logs before it did a few more files. so, writing new files to the volume works no better than reading.

Comment 7 Allen Lu 2010-11-05 21:37:49 UTC
Filing on behalf of Michael who will research more to get reproducible steps.

Today I did a test that failed and produced lots of log output. I tried to do a find on a large directory in order to produce a file list of that directory. I have a copy of this directory on Lustre and a find there took 22 seconds. When I did it on Gluster it sat there for a minute or two then exited with an error after producing no output.

[root@compute-8-27 imfergus]# find wy1980/ > /tmp/wy1980.filelist
find: wy1980/: Invalid argument

Here are the logs:
[2010-11-03 12:21:17.302496] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-0: Server 172.16.2.4:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:21:17.303459] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-0: pollin received on tcp socket (peer: 172.16.2.4:24009) after handshake is complete
[2010-11-03 12:21:17.304020] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(OPENDIR(20)) called at 2010-11-03 12:20:14.912201
[2010-11-03 12:21:17.304085] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:20:35.261201
[2010-11-03 12:21:17.315900] E [client3_1-fops.c:1857:client3_1_lookup_cbk] : error
[2010-11-03 12:21:17.315927] I [dht-common.c:369:dht_revalidate_cbk] gluster-test-dht: subvolume gluster-test-client-0 returned -1 (Invalid argument)
[2010-11-03 12:21:17.315971] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-0: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:21:17.316147] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-0: Connected to 172.16.2.4:24009, attached to remote volume '/mnt/brick1'.
[2010-11-03 12:22:41.389021] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-2: Server 172.16.2.6:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:22:41.389132] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-3: Server 172.16.2.7:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:22:41.389163] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-1: Server 172.16.2.5:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:22:41.389211] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-1: pollin received on tcp socket (peer: 172.16.2.5:24009) after handshake is complete
[2010-11-03 12:22:41.389565] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2010-11-03 12:21:17.315689
[2010-11-03 12:22:41.389665] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:21:59.346239
[2010-11-03 12:22:41.389688] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-3: pollin received on tcp socket (peer: 172.16.2.7:24009) after handshake is complete
[2010-11-03 12:22:41.389920] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2010-11-03 12:21:17.312567
[2010-11-03 12:22:41.389963] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:21:59.346230
[2010-11-03 12:22:41.389982] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-2: pollin received on tcp socket (peer: 172.16.2.6:24009) after handshake is complete
[2010-11-03 12:22:41.390209] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2010-11-03 12:21:17.312555
[2010-11-03 12:22:41.402188] I [dht-layout.c:588:dht_layout_normalize] gluster-test-dht: found anomalies in /imfergus/wy1980. holes=1 overlaps=0
[2010-11-03 12:22:41.402244] W [fuse-bridge.c:180:fuse_entry_cbk] glusterfs-fuse: 243: LOOKUP() /imfergus/wy1980 => -1 (Stale NFS file handle)
[2010-11-03 12:22:41.402385] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:21:59.346216
[2010-11-03 12:22:41.404245] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-2: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:22:41.404517] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-2: Connected to 172.16.2.6:24009, attached to remote volume '/mnt/brick3'.
[2010-11-03 12:22:41.404547] I [client-handshake.c:698:client_post_handshake] gluster-test-client-2: 1 fds open - Delaying child_up until they are re-opened
[2010-11-03 12:22:41.405754] E [client3_1-fops.c:1857:client3_1_lookup_cbk] : error
[2010-11-03 12:22:41.405996] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-1: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:22:41.406059] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-3: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:22:41.406131] E [client3_1-fops.c:1857:client3_1_lookup_cbk] : error
[2010-11-03 12:22:41.406152] W [fuse-bridge.c:180:fuse_entry_cbk] glusterfs-fuse: 245: LOOKUP() /imfergus/wy1980 => -1 (Invalid argument)
[2010-11-03 12:22:41.406211] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-1: Connected to 172.16.2.5:24009, attached to remote volume '/mnt/brick2'.
[2010-11-03 12:22:41.406244] I [client-handshake.c:698:client_post_handshake] gluster-test-client-1: 1 fds open - Delaying child_up until they are re-opened
[2010-11-03 12:22:41.406298] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-3: Connected to 172.16.2.7:24009, attached to remote volume '/mnt/brick4'.
[2010-11-03 12:22:41.406313] I [client-handshake.c:698:client_post_handshake] gluster-test-client-3: 1 fds open - Delaying child_up until they are re-opened

Any thoughts on what is going on? ls and cd on all directories above this are working fine today. There are no logs on the servers.

Mike

Comment 8 Raghavendra G 2010-11-09 02:02:43 UTC
*** Bug 2043 has been marked as a duplicate of this bug. ***

Comment 9 Anand Avati 2010-11-09 07:12:14 UTC
PATCH: http://patches.gluster.com/patch/5656 in master (rpc-transport/rdma: set the default quota of recv buffers to 128.)

Comment 10 Raghavendra G 2010-11-14 02:30:28 UTC
Hi Michael/Allen,

Can you check whether this patch fixes the issue? We are not able to reproduce the issue locally.

regards,
Raghavendra