The server is running on port 24010, but a client mount strangely keeps defaulting to 24008 for a while. During this time I forced remote-port to 24010 by writing a new volfile. After a umount and a remount, fetching the volfile from the server made the client start connecting on 24010. It seems the RDMA-internal RPC exchange causes a dummy port to be listened on, so the mount only starts working after the first RPC exchange.
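For illustration, forcing the client straight to the brick's port amounts to a hand-written client volfile along these lines (the hostname and brick path are hypothetical; remote-port 24010 matches the server port mentioned above):

```
volume client-brick
  type protocol/client
  option transport-type rdma
  option remote-host server1             # hypothetical hostname
  option remote-port 24010               # brick port from this report
  option remote-subvolume /data/brick1   # hypothetical brick path
end-volume
```

This bypasses the normal volfile fetch from glusterd, which is why it worked even while port 24008 was refusing connections.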
glusterd's rdma transport listens on 24008. Clients first connect to glusterd (on port 24008), fetch the volfile, and then connect to the server process exporting the appropriate brick.
(In reply to comment #1) > glusterd rdma transport listens on 24008. Clients first connect to glusterd > (through port 24008), fetch the volfile and then connect to the server > exporting appropriate brick. But I was receiving 'Connection refused'. Does this mean that glusterd failed to start its RDMA listener in some way? I have seen that many times now.
> But i was receiving 'Connection Refused' does this mean that the 'glusterd' > started over RDMA failed in some ways? > Was glusterd started before the ib modules were loaded?
(In reply to comment #3) > > But i was receiving 'Connection Refused' does this mean that the 'glusterd' > > started over RDMA failed in some ways? > > > > Was glusterd started before the ib modules were loaded? glusterd was installed about a day after InfiniBand was configured; this is CentOS 6.0.
If rdma is present on the machine when glusterd starts, it should be listening on 24008. Can you confirm the port is open for listening with 'netstat -ntlp'?
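Besides 'netstat -ntlp' on the server, the listener can also be probed from the client side. A minimal sketch using only the Python standard library (the helper name `is_listening` is made up for illustration):

```python
import socket

def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds.

    A refused connection, like the 'Connection refused' seen here
    when glusterd is not listening on 24008, returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: probe glusterd's rdma management port from the client.
# is_listening("server1", 24008)   # hypothetical hostname
```

Note this only checks TCP reachability of the management port; it does not verify that the RDMA transport itself negotiated correctly.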
(In reply to comment #5) > If rdma is present in the machine while starting glusterd, then it should be > listening on 24008. Can you confirm the port is open for listening by 'netstat > -ntlp' ? From what I remember, starting glusterd never showed 24008 in netstat, so I had to restart it 2-3 times. Then I wrote a new volfile just to connect the client to the server process by specifying remote-port. After that I umounted and fetched again from the server, and this time the client connected to the volume. We have seen this recur at a couple of customer sites. Hopefully it is reproducible in our labs.
Will try to reproduce in our labs and update you, but it would take some time as there is demand for machines with IB.
http://review.gluster.org/4323 should fix this...
I am having a very similar issue using rdma-only transport with 3.3.1. I've tried downgrading to 3.3.0 and the problem goes away. Using CentOS 6.3 on clients and Ubuntu 12.04 on file servers.