Bug 852585 - [glusterfs-3.2.5qa2] - iozone fails in volume with tcp,rdma transport type
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rdma
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On: GLUSTER-3758
Blocks:
Reported: 2012-08-29 01:31 UTC by Vidya Sakar
Modified: 2015-02-13 10:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: GLUSTER-3758
Environment:
Last Closed: 2015-02-13 10:20:57 UTC
Embargoed:



Description Vidya Sakar 2012-08-29 01:31:40 UTC
+++ This bug was initially created as a clone of Bug #765490 +++

Created a replicate volume with the tcp,rdma transport type, mounted it via NFS, and started running iozone. The run failed with the following error:

5 2947000  2748949  5306340  1512364  1565562 3413030  5184636
            2048     512 1419165 1542232  3127245  5044577 5238385 2230685 3419824  2691246  5212953  1647536  1841829 3292669  5119743
            2048    1024 1407538 1558461  3302797  5059433 5277002 2178635 2756890  2482062  4643695  1802788  2231264 2880782  4676561
            2048    2048fsync: Input/output error

iozone: interrupted 

exiting iozone

The glusterd logs show that glusterd loaded the following volfile:

  volume management
      type mgmt/glusterd
      option working-directory /etc/glusterd
      option transport-type socket,rdma
      option transport.socket.keepalive-time 10
      option transport.socket.keepalive-interval 2
  end-volume

I'm not sure whether 'socket,rdma' is a valid transport type. I also see the following errors in the glusterd logs:

[2011-10-24 23:59:02.28709] I [glusterd-handler.c:2781:glusterd_op_commit_send_resp] 0-glusterd: Responded to commit, ret: 0
[2011-10-24 23:59:02.29857] I [glusterd-handler.c:2690:glusterd_handle_cluster_unlock] 0-glusterd: Received UNLOCK from uuid: 0a25570c-4342-4187-a57b-ce61dd4dd73e
[2011-10-24 23:59:02.29903] I [glusterd-handler.c:2668:glusterd_op_unlock_send_resp] 0-glusterd: Responded to unlock, ret: 0
[2011-10-24 23:59:02.30441] E [rdma.c:4468:rdma_event_handler] 0-rpc-transport/rdma: rdma.management: pollin received on tcp socket (peer: 10.1.10.21:1017) after handshake is complete
[2011-10-24 23:59:02.593025] E [rdma.c:4468:rdma_event_handler] 0-rpc-transport/rdma: rdma.management: pollin received on tcp socket (peer: 10.1.10.24:1020) after handshake is complete
[2011-10-25 00:10:18.940995] I [glusterd-handler.c:448:glusterd_handle_cluster_lock] 0-glusterd: Received LOCK from uuid: 0a25570c-4342-4187-a57b-ce61dd4dd73e
[2011-10-25 00:10:18.941078] I [glusterd-utils.c:243:glusterd_lock] 0-glusterd: Cluster lock held by 0a25570c-4342-4187-a57b-ce61dd4dd73e
[2011-10-25 00:10:18.941139] I [glusterd-handler.c:2648:glusterd_op_lock_send_resp] 0-glusterd: Responded, ret: 0


I see the following entries in the brick logs:

[2011-10-24 23:57:39.934419] I [server-handshake.c:542:server_setvolume] 0-hosdu-server: accepted client from 10.1.10.24:1019 (version: 3.2.5qa1)
[2011-10-24 23:57:43.190129] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1019
[2011-10-24 23:57:43.190184] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1019
[2011-10-24 23:57:46.193808] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:57:46.193856] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:57:55.996217] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:57:55.996286] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:05.606191] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:05.606234] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:14.617234] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:14.617278] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:23.628203] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:23.628248] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:32.639200] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:32.639242] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:41.650281] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:41.650326] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:50.661278] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:50.661396] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:58:59.672278] E [socket.c:1395:__socket_read_frag] 0-rpc: wrong MSG-TYPE (1598180427) received from 10.1.10.21:1022
[2011-10-24 23:58:59.672325] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.21:1022
[2011-10-24 23:59:00.959274] I [glusterfsd-mgmt.c:62:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2011-10-24 23:59:01.352547] E [rdma.c:4468:rdma_event_handler] 0-rpc-transport/rdma: rdma.hosdu-server: pollin received on tcp socket (peer: 10.1.10.24:1019) after handshake is complete
[2011-10-24 23:59:01.353181] I [server.c:438:server_rpc_notify] 0-hosdu-server: disconnected connection from 10.1.10.24:1019
[2011-10-24 23:59:01.353292] I [server-helpers.c:783:server_connection_destroy] 0-hosdu-server: destroyed connection of client4-5543-2011/10/24-23:57:35:247380-hosdu-client-1
[2011-10-24 23:59:02.30027] E [server.c:609:reconfigure] 0-hosdu-server: Reconfigure not found for transport
[2011-10-24 23:59:05.518636] I [server-handshake.c:542:server_setvolume] 0-hosdu-server: accepted client from 10.1.10.21:1020 (version: 3.2.5qa1)
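
One way to read the repeated "wrong MSG-TYPE (1598180427)" value in the brick log entries above: assuming the number is a 32-bit field taken from the wire in network (big-endian) byte order, its four bytes can be rendered as ASCII. The sketch below does only that conversion. The helper name decode_msg_type is hypothetical and is not part of glusterfs.

```python
import struct


def decode_msg_type(value: int) -> str:
    """Render a 32-bit value as the four bytes it occupies in big-endian order.

    Assumption: the logged MSG-TYPE is a 32-bit field read in network byte
    order. Non-printable bytes are shown as '.'.
    """
    raw = struct.pack(">I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # Value logged by __socket_read_frag in the brick log.
    print(decode_msg_type(1598180427))  # prints "_BLK"
```

The value decodes to the text "_BLK", whereas ONC RPC defines only 0 (CALL) and 1 (REPLY) as message types. This suggests the socket transport was handed text rather than an RPC header. Where that text originates is not established by this report; that it is RDMA connection setup data arriving on a TCP listener is only a guess.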

--- Additional comment from raghavendra on 2011-11-16 22:44:59 EST ---

On RHEL5, iozone on NFS completes without any issues with the rdma transport between the NFS server and the gluster server.

# uname -a
Linux client13 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:54:20 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

This was between client13 and client12.

@MS,
Can you please confirm whether iozone works fine on NFS mounts on SSA?

regards,
Raghavendra.

--- Additional comment from amarts on 2012-02-27 05:36:03 EST ---

This is a priority for the immediate future (before the 3.3.0 GA release). Will bump the priority up once we take up RDMA-related tasks.

Comment 3 Amar Tumballi 2012-12-06 05:21:56 UTC
RDMA-related issues have been reduced in priority for now. We will work on them in parallel, but at medium priority.

