Bug 907695

Summary: rdma does not select the correct port when mounting volume
Product: [Community] GlusterFS
Reporter: Andrei Mikhailovsky <andrei>
Component: rdma
Assignee: Raghavendra G <rgowdapp>
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Version: 3.3.0
CC: gluster-bugs, joe, mozes
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-02-06 02:54:21 UTC
Type: Bug

Description Andrei Mikhailovsky 2013-02-05 02:40:29 UTC
Description of problem:

The client does not receive the correct rdma transport port number when mounting a volume. After the volume file is received successfully, the following appears in the logs:


[2013-02-05 01:33:46.424512] D [glusterfsd-mgmt.c:2116:glusterfs_mgmt_pmap_signin] 0-fsd-mgmt: portmapper signin arguments not given
[2013-02-05 01:33:46.424564] E [rdma.c:4604:tcp_connect_finish] 0-cloudstack-primary-client-0: tcp connect to  failed (Connection refused)
[2013-02-05 01:33:46.424820] W [rdma.c:4187:gf_rdma_disconnect] (-->/usr/sbin/glusterfs(main+0x58a) [0x40741a] (-->/usr/lib64/libglusterfs.so.0() [0x339203ed14] (-->/usr/lib64/glusterfs/3.3.1/rpc-transport/rdma.so(+0x8210) [0x7f85949e5210]))) 0-cloudstack-primary-client-0: disconnect called (peer:)
[2013-02-05 01:33:46.424871] W [rdma.c:4521:gf_rdma_handshake_pollerr] (-->/usr/sbin/glusterfs(main+0x58a) [0x40741a] (-->/usr/lib64/libglusterfs.so.0() [0x339203ed14] (-->/usr/lib64/glusterfs/3.3.1/rpc-transport/rdma.so(+0x80f8) [0x7f85949e50f8]))) 0-rpc-transport/rdma: cloudstack-primary-client-0: peer () disconnected, cleaning up
[2013-02-05 01:33:46.424903] E [rdma.c:4604:tcp_connect_finish] 0-cloudstack-primary-client-1: tcp connect to  failed (Connection refused)
[2013-02-05 01:33:46.424942] W [rdma.c:4187:gf_rdma_disconnect] (-->/usr/sbin/glusterfs(main+0x58a) [0x40741a] (-->/usr/lib64/libglusterfs.so.0() [0x339203ed14] (-->/usr/lib64/glusterfs/3.3.1/rpc-transport/rdma.so(+0x8210) [0x7f85949e5210]))) 0-cloudstack-primary-client-1: disconnect called (peer:)
[2013-02-05 01:33:46.424981] W [rdma.c:4521:gf_rdma_handshake_pollerr] (-->/usr/sbin/glusterfs(main+0x58a) [0x40741a] (-->/usr/lib64/libglusterfs.so.0() [0x339203ed14] (-->/usr/lib64/glusterfs/3.3.1/rpc-transport/rdma.so(+0x80f8) [0x7f85949e50f8]))) 0-rpc-transport/rdma: cloudstack-primary-client-1: peer () disconnected, cleaning up
[2013-02-05 01:33:50.306058] D [name.c:137:client_fill_address_family] 0-cloudstack-primary-client-0: address-family not specified, guessing it to be inet/inet6
[2013-02-05 01:33:50.306130] D [name.c:208:af_inet_client_get_remote_sockaddr] 0-cloudstack-primary-client-0: option remote-port missing in volume cloudstack-primary-client-0. Defaulting to 24008
[2013-02-05 01:33:50.310049] D [common-utils.c:151:gf_resolve_ip6] 0-resolver: returning ip-192.168.168.200 (port-24008) for hostname: arh-ibstorage-ib and port: 24008
[2013-02-05 01:33:50.310198] D [name.c:137:client_fill_address_family] 0-cloudstack-primary-client-1: address-family not specified, guessing it to be inet/inet6
[2013-02-05 01:33:50.310213] D [name.c:208:af_inet_client_get_remote_sockaddr] 0-cloudstack-primary-client-1: option remote-port missing in volume cloudstack-primary-client-1. Defaulting to 24008
[2013-02-05 01:33:50.310332] E [rdma.c:4604:tcp_connect_finish] 0-cloudstack-primary-client-0: tcp connect to  failed (Connection refused)
[2013-02-05 01:33:50.310456] W [rdma.c:4187:gf_rdma_disconnect] (-->/usr/sbin/glusterfs(main+0x58a) [0x40741a] (-->/usr/lib64/libglusterfs.so.0() [0x339203ed14] (-->/usr/lib64/glusterfs/3.3.1/rpc-transport/rdma.so(+0x8210) [0x7f85949e5210]))) 0-cloudstack-primary-client-0: disconnect called (peer:)
[2013-02-05 01:33:50.310501] W [rdma.c:4521:gf_rdma_handshake_pollerr] (-->/usr/sbin/glusterfs(main+0x58a) [0x40741a] (-->/usr/lib64/libglusterfs.so.0() [0x339203ed14] (-->/usr/lib64/glusterfs/3.3.1/rpc-transport/rdma.so(+0x80f8) [0x7f85949e50f8]))) 0-rpc-transport/rdma: cloudstack-primary-client-0: peer () disconnected, cleaning up
[2013-02-05 01:33:50.314594] D [common-utils.c:151:gf_resolve_ip6] 0-resolver: returning ip-192.168.168.201 (port-24008) for hostname: arh-ibstorage2-ib and port: 24008
[2013-02-05 01:33:50.314743] E [rdma.c:4604:tcp_connect_finish] 0-cloudstack-primary-client-1: tcp connect to  failed (Connection refused)



Version-Release number of selected component (if applicable):
3.3.1

How reproducible:

Steps to Reproduce:
1. Create a volume using rdma transport.
2. Try to mount it from the client.
3. Not sure whether the tcp,rdma transport type produces the same problem.
  
Actual results:
Mounting hangs and never finishes. The client attempts to contact the server on port 24008, which is not open when the volume uses the rdma-only transport type.
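
One way to confirm from the client whether a given port on the server accepts connections is a plain TCP probe. The sketch below is only illustrative: it assumes the rdma transport in this version first opens a TCP connection to the port (as the tcp_connect_finish errors in the log suggest), and the helper name tcp_port_open is hypothetical, not part of GlusterFS.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability
    and says nothing about whether an rdma connection would succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, tcp_port_open("arh-ibstorage-ib", 24008) probes the host and port that appear in the log; a False result would be consistent with the "Connection refused" errors.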

Expected results:
The mount should finish successfully.

Additional info:
Clients are running CentOS 6.3 with the latest updates and glusterfs 3.3.1. Servers are running Ubuntu 12.04 with the latest updates and glusterfs 3.3.1 from the PPA.

Comment 1 Andrei Mikhailovsky 2013-02-05 02:43:29 UTC
I forgot to mention that version 3.3.0 does not have this problem; mounting over rdma transport works there.

Comment 2 Joe Julian 2013-02-05 18:14:54 UTC
By adding the remote-port option to the client translator in the vol file, we were able to get his clients operational.
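
The workaround can be sketched as a small script that adds the option to each protocol/client block of a vol file. This is a rough illustration, not the exact change that was made: the function name add_remote_port, the sample block, and the port value 24010 are all assumptions, and the port must be replaced with whatever port the brick actually listens on.

```python
def add_remote_port(volfile_text: str, port: int) -> str:
    """Add 'option remote-port <port>' to every protocol/client block lacking one.

    Hypothetical helper; the port is supplied by the caller and is not
    discovered from the server.
    """
    out = []
    block = []        # lines of the volume block currently being read
    in_block = False
    for line in volfile_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("volume "):
            in_block = True
            block = [line]
        elif in_block and stripped == "end-volume":
            is_client = any(l.split() == ["type", "protocol/client"] for l in block)
            has_port = any(l.split()[:2] == ["option", "remote-port"] for l in block)
            if is_client and not has_port:
                block.append(f"    option remote-port {port}")
            out.extend(block)
            out.append(line)
            in_block = False
        elif in_block:
            block.append(line)
        else:
            out.append(line)
    if in_block:
        out.extend(block)  # unterminated block: emit unchanged
    return "\n".join(out) + "\n"


# Sample client translator block; names are taken from the log above,
# the option layout is an assumed minimal example.
SAMPLE = """\
volume cloudstack-primary-client-0
    type protocol/client
    option remote-host arh-ibstorage-ib
    option transport-type rdma
end-volume
"""

EXAMPLE_PORT = 24010  # placeholder value, not taken from this report

if __name__ == "__main__":
    print(add_remote_port(SAMPLE, EXAMPLE_PORT), end="")
```

Blocks that already carry a remote-port option are left untouched, so running the function twice gives the same result as running it once.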

This is a severe issue that causes a total failure of the software and should be given urgent priority.

Please fix this for the release-3.3 branch. I don't have test hardware to determine if this is a problem in master.

Comment 3 Raghavendra G 2013-02-06 02:54:21 UTC

*** This bug has been marked as a duplicate of bug 849122 ***