Bug 763792 (GLUSTER-2060) - Infiniteband errors on large file find
Summary: Infiniteband errors on large file find
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-2060
Product: GlusterFS
Classification: Community
Component: ib-verbs
Version: 3.1.0
Hardware: x86_64
OS: Linux
medium
low
Target Milestone: ---
Assignee: Raghavendra G
QA Contact:
URL:
Whiteboard:
: 763775 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-11-05 21:37 UTC by Allen Lu
Modified: 2015-12-01 16:45 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTNR
Mount Type: fuse
Documentation: DNR
CRM:
Verified Versions:


Attachments (Terms of Use)
Client logs from test case (397.92 KB, text/plain)
2010-11-05 19:21 UTC, Mike Robbert
no flags Details
Logs from brick1 (82.76 KB, text/plain)
2010-11-05 19:24 UTC, Mike Robbert
no flags Details
Logs from brick2 (80.87 KB, text/plain)
2010-11-05 19:26 UTC, Mike Robbert
no flags Details
Logs from brick3 (77.64 KB, text/plain)
2010-11-05 19:26 UTC, Mike Robbert
no flags Details
Logs from brick4 (74.28 KB, text/plain)
2010-11-05 19:26 UTC, Mike Robbert
no flags Details

Description Mike Robbert 2010-11-05 19:08:15 UTC
The initial report was based on a directory with over 1 million files which I realize is a little out of the ordinary, but we will need to handle that. In the mean time a more reasonable case to work with is an 'ls -R' on the source directory for OpenMPI, probably hundreds of files spread out among many subdirectories. I noticed that I was having some problems so I restarted all the servers and 1 client with log-level=DEBUG. I then started the 'ls -R' and it is still running after about 50 minutes of running. Every couple of minutes it outputs a handful of files. I'll attach log files shortly, but to summarize there is lots of log errors on the client, but only a handful of disconnect and reconnect messages on the servers.

The setup is that all servers are running on a build from git on or shortly before Oct 28 and all clients are running the binary from RPM 3.1.1qa1 released around the same time. There are 4 servers and the volume is distributed and using rdma. The backend storage is DDN 9550 controllers with disks in an 8+2 RAID.

Comment 1 Mike Robbert 2010-11-05 19:21:49 UTC
Created attachment 368

Comment 2 Mike Robbert 2010-11-05 19:24:54 UTC
Created attachment 369

Comment 3 Mike Robbert 2010-11-05 19:26:17 UTC
Created attachment 370

Comment 4 Mike Robbert 2010-11-05 19:26:39 UTC
Created attachment 371

Comment 5 Mike Robbert 2010-11-05 19:26:57 UTC
Created attachment 372

Comment 6 Mike Robbert 2010-11-05 19:47:47 UTC
One final comment before I leave for the weekend. At first I thought that the errors I was seeing may have been due to the fact that the directory I was looking at had been untouched on the servers since I was running 3.0.5 (which by the way ran quite well on the same hardware) and I've done lots of changes and tests to the server code and maybe there was some kind of hash change since then. so, on a whim I decided to try to untar a similar directory, openmpi-1.4.2.tar.bz2. I started the process and after a couple of files it appeared to hang for a minute or so and similar messages showed up in the logs before it did a few more files. so, writing new files to the volume works no better than reading.

Comment 7 Allen Lu 2010-11-05 21:37:49 UTC
Filing on behalf of Michael who will research more to get reproducible steps.

Today I did a test that failed and produced lots of log output. I tried to do a find on a large directory in order to produce a file list of that directory. I have a copy of this directory on Lustre and a find there took 22 seconds. When I did it on Gluster it sat there for a minute or two then exited with an error after producing no output.

[root@compute-8-27 imfergus]# find wy1980/ > /tmp/wy1980.filelist
find: wy1980/: Invalid argument

Here are the logs:
[2010-11-03 12:21:17.302496] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-0: Server 172.16.2.4:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:21:17.303459] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-0: pollin received on tcp socket (peer: 172.16.2.4:24009) after handshake is complete
[2010-11-03 12:21:17.304020] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(OPENDIR(20)) called at 2010-11-03 12:20:14.912201
[2010-11-03 12:21:17.304085] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:20:35.261201
[2010-11-03 12:21:17.315900] E [client3_1-fops.c:1857:client3_1_lookup_cbk] : error
[2010-11-03 12:21:17.315927] I [dht-common.c:369:dht_revalidate_cbk] gluster-test-dht: subvolume gluster-test-client-0 returned -1 (Invalid argument)
[2010-11-03 12:21:17.315971] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-0: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:21:17.316147] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-0: Connected to 172.16.2.4:24009, attached to remote volume '/mnt/brick1'.
[2010-11-03 12:22:41.389021] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-2: Server 172.16.2.6:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:22:41.389132] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-3: Server 172.16.2.7:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:22:41.389163] E [client-handshake.c:116:rpc_client_ping_timer_expired] gluster-test-client-1: Server 172.16.2.5:24009 has not responded in the last 42 seconds, disconnecting.
[2010-11-03 12:22:41.389211] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-1: pollin received on tcp socket (peer: 172.16.2.5:24009) after handshake is complete
[2010-11-03 12:22:41.389565] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2010-11-03 12:21:17.315689
[2010-11-03 12:22:41.389665] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:21:59.346239
[2010-11-03 12:22:41.389688] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-3: pollin received on tcp socket (peer: 172.16.2.7:24009) after handshake is complete
[2010-11-03 12:22:41.389920] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2010-11-03 12:21:17.312567
[2010-11-03 12:22:41.389963] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:21:59.346230
[2010-11-03 12:22:41.389982] E [rdma.c:4443:rdma_event_handler] rpc-transport/rdma: gluster-test-client-2: pollin received on tcp socket (peer: 172.16.2.6:24009) after handshake is complete
[2010-11-03 12:22:41.390209] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2010-11-03 12:21:17.312555
[2010-11-03 12:22:41.402188] I [dht-layout.c:588:dht_layout_normalize] gluster-test-dht: found anomalies in /imfergus/wy1980. holes=1 overlaps=0
[2010-11-03 12:22:41.402244] W [fuse-bridge.c:180:fuse_entry_cbk] glusterfs-fuse: 243: LOOKUP() /imfergus/wy1980 => -1 (Stale NFS file handle)
[2010-11-03 12:22:41.402385] E [rpc-clnt.c:338:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb9) [0x2b42825bb689] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e) [0x2b42825bae2e] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b42825bad9e]))) rpc-clnt: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2010-11-03 12:21:59.346216
[2010-11-03 12:22:41.404245] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-2: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:22:41.404517] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-2: Connected to 172.16.2.6:24009, attached to remote volume '/mnt/brick3'.
[2010-11-03 12:22:41.404547] I [client-handshake.c:698:client_post_handshake] gluster-test-client-2: 1 fds open - Delaying child_up until they are re-opened
[2010-11-03 12:22:41.405754] E [client3_1-fops.c:1857:client3_1_lookup_cbk] : error
[2010-11-03 12:22:41.405996] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-1: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:22:41.406059] I [client-handshake.c:993:select_server_supported_programs] gluster-test-client-3: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2010-11-03 12:22:41.406131] E [client3_1-fops.c:1857:client3_1_lookup_cbk] : error
[2010-11-03 12:22:41.406152] W [fuse-bridge.c:180:fuse_entry_cbk] glusterfs-fuse: 245: LOOKUP() /imfergus/wy1980 => -1 (Invalid argument)
[2010-11-03 12:22:41.406211] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-1: Connected to 172.16.2.5:24009, attached to remote volume '/mnt/brick2'.
[2010-11-03 12:22:41.406244] I [client-handshake.c:698:client_post_handshake] gluster-test-client-1: 1 fds open - Delaying child_up until they are re-opened
[2010-11-03 12:22:41.406298] I [client-handshake.c:829:client_setvolume_cbk] gluster-test-client-3: Connected to 172.16.2.7:24009, attached to remote volume '/mnt/brick4'.
[2010-11-03 12:22:41.406313] I [client-handshake.c:698:client_post_handshake] gluster-test-client-3: 1 fds open - Delaying child_up until they are re-opened

Any thoughts on what is going on? ls and cd on all directories above this are working fine today. There are no logs on the servers.

Mike

Comment 8 Raghavendra G 2010-11-09 02:02:43 UTC
*** Bug 2043 has been marked as a duplicate of this bug. ***

Comment 9 Anand Avati 2010-11-09 07:12:14 UTC
PATCH: http://patches.gluster.com/patch/5656 in master (rpc-transport/rdma: set the default quota of recv buffers to 128.)

Comment 10 Raghavendra G 2010-11-14 02:30:28 UTC
Hi Michael/Allen,

Can you check whether this patch fixes the issue? We are not able to reproduce the issue locally.

regards,
Raghavendra


Note You need to log in before you can comment on or make changes to this bug.