+++ This bug was initially created as a clone of Bug #764924 +++

Hi, I see that the version here is '3.2.1'. Is upgrading an option? GlusterFS's
support for RDMA became more complete with version 3.2.2, which was released
last week. Please check the behavior with the new version and let us know.

--- Additional comment from jpenney on 2011-07-19 07:21:23 EDT ---

(In reply to comment #1)
> Hi, I see that the version here is '3.2.1'. Is upgrading an option? GlusterFS's
> support for RDMA became more complete with version 3.2.2, which was released
> last week. Please check the behavior with the new version and let us know.

The 3.2.2 upgrade did not change this behavior.

BTW, 3.2.2 is not a choice in the Bugzilla version menu.

--- Additional comment from amarts on 2011-07-19 07:49:03 EDT ---

> The 3.2.2 upgrade did not change this behavior.

Will take a look at this.

> BTW, 3.2.2 is not a choice in the Bugzilla version menu.

Added it to the Versions list now.

--- Additional comment from jpenney on 2011-07-19 10:14:14 EDT ---

===== Symptoms =====

- Error from the test script (included below):
  "Could not open write: out.node02.15: Invalid argument"
- Client write failures when multiple clients are reading and writing while
  using the RDMA transport.
- High memory usage on one Gluster server, in this case node06. Output is from
  the 'top' command:
    node06: 14850 root 16 0 23.1g 17g 1956 S 120.9 56.6 8:38.78 glusterfsd
    node05: 12633 root 16 0 418m 157m 1852 S 0.0 0.5 2:56.02 glusterfsd
    node04: 21066 root 15 0 355m 151m 1852 S 0.0 0.6 1:07.71 glusterfsd
- Temporary workaround: use IPoIB instead of RDMA.
- It may take 10-15 minutes for the first failure to appear.

===== Version information =====

- CentOS 5.6, kernel 2.6.18-238.9.1.el5
- OFED 1.5.3.1
- Gluster 3.2.1 RPMs
- ext3 filesystem

===== Roles of nodes =====

node04, node05, node06 - Gluster servers.
node01, node02, node03 - Clients. Each mounts node04:/gluster-vol01 on /gluster
and runs the test script in /gluster/test (a setup sketch is appended at the
end of this report).

===== Gluster volume info =====

Volume Name: gluster-vol01
Type: Distribute
Status: Started
Number of Bricks: 3
Transport-type: rdma
Bricks:
Brick1: node04:/gluster-raw-storage
Brick2: node05:/gluster-raw-storage
Brick3: node06:/gluster-raw-storage

===== Gluster peer status =====

Number of Peers: 2

Hostname: node06
Uuid: 2c5f66b3-ddc8-4811-bd45-f12c60a22891
State: Peer in Cluster (Connected)

Hostname: node05
Uuid: 00b8d063-8d74-4ffe-9a44-c50e46eca78c
State: Peer in Cluster (Connected)

===== Simple test script =====

Run in /gluster/test. The files read.1 through read.4 are 2 GB files created
with: dd if=/dev/zero of=read.4 bs=1024k count=2048

#!/usr/bin/perl
$| = 0;
$/ = undef;                   # slurp mode: each read pulls in the whole file
$hostname = $ENV{HOSTNAME};

my $i = 1;                    # index of the read.N file to read next
my $x = 1;                    # counter for the out.<host>.<x> files written
while ($i) {
    if ($i >= 5) { $i = 1 }   # cycle through read.1 .. read.4
    print "Read: read.$i\n";
    open(FILE, "read.$i") || die "Could not open read.$i: $!\n";
    my $string = <FILE>;
    close(FILE);
    open(OUT, ">out.$hostname.$x")
        || die "Could not open write: out.$hostname.$x: $!\n";
    print OUT "This was read.$i\n";
    close(OUT);
    $i++;
    $x++;
}

--- Additional comment from jpenney on 2011-08-25 08:46:29 EDT ---

Has there been any progress on this report? Has the problem been replicated?
Our workaround is to use IPoIB and the TCP transport, which works as expected.

--- Additional comment from amarts on 2012-02-27 05:35:45 EST ---

This is a priority for the immediate future (before the 3.3.0 GA release).
Will bump the priority up once we take up the RDMA-related tasks.
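
===== Reproduction and workaround sketch =====

A minimal sketch of how the test from the 2011-07-19 comment might be set up
on one of the client nodes, assuming the hostnames and paths given above; the
script filename test.pl is hypothetical, and the exact commands are not taken
from the reporter's environment:

# Mount the RDMA volume and prepare the shared test directory (e.g. on node01).
mount -t glusterfs node04:/gluster-vol01 /gluster
mkdir -p /gluster/test
cd /gluster/test

# Create the four 2 GB source files; they live on the shared volume, so this
# only needs to be done from a single client.
for i in 1 2 3 4; do
    dd if=/dev/zero of=read.$i bs=1024k count=2048
done

# Run the read/write loop from the test script on each of node01-node03; with
# all three clients active over RDMA, the first failure was reported to show
# up within 10-15 minutes.
perl test.pl

The TCP-over-IPoIB workaround mentioned in the 2011-08-25 comment would
correspond to a volume using the tcp transport instead of rdma, for example as
below, assuming the same brick paths and that the server hostnames resolve to
their IPoIB addresses; this is illustrative and not necessarily how the
reporter reconfigured the volume:

gluster volume create gluster-vol01 transport tcp \
    node04:/gluster-raw-storage \
    node05:/gluster-raw-storage \
    node06:/gluster-raw-storage
gluster volume start gluster-vol01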