Description of problem:
I am consistently seeing the following errors when starting a volume for the first time:

[2013-01-11 18:18:48.679691] I [rpc-clnt.c:1659:rpc_clnt_reconfig] 0-DISTRIBUTED-client-0: changing port to 24009 (from 0)
[2013-01-11 18:18:49.258045] E [client-handshake.c:1695:client_query_portmap_cbk] 0-DISTRIBUTED-client-1: failed to get the port number for remote subvolume
[2013-01-11 18:18:49.258087] I [client.c:2098:client_rpc_notify] 0-DISTRIBUTED-client-1: disconnected

Version-Release number of selected component (if applicable):
glusterfs-3.3.0.5rhs-40.el6rhs.x86_64

How reproducible:
I consistently see this when starting a volume for the first time after creating a cluster.

Steps to Reproduce:
1. Create new cluster.
2. Peer Probe
3. Start Volume.
4. Mount over NFS.

Actual results:
Everything mounts and works properly, but the error message shows in logs.

Expected results:
No errors.

Additional info:
This can happen because of a race between the startup of the brick process and the startup of the NFS server process (they are started in parallel when 'gluster volume start' is issued). This race does not exist in RHS 2.1 (glusterfs-3.4.x and later versions). No proper workaround exists today.

Raghavendra Talur, please confirm the behavior on the 3.4.0 branch and update the bug.
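To illustrate the race described above, here is a minimal Python simulation. It is not GlusterFS code: the `PortMap` class, the function names, the port number 24010, and the retry timings are all hypothetical, chosen only to show how a client that queries for a brick port before the brick has registered can log a transient error and then connect on a later attempt.

```python
import threading
import time


class PortMap:
    """Toy stand-in for a brick-to-port registry (hypothetical, not the real API)."""

    def __init__(self):
        self._ports = {}
        self._lock = threading.Lock()

    def register(self, brick, port):
        with self._lock:
            self._ports[brick] = port

    def query(self, brick):
        with self._lock:
            # 0 means "no port registered yet"
            return self._ports.get(brick, 0)


def start_brick(portmap, brick, port, startup_delay):
    """Simulate a brick process that needs some time before it registers its port."""
    time.sleep(startup_delay)
    portmap.register(brick, port)


def connect_client(portmap, brick, log, retries=20, interval=0.05):
    """Simulate a client (such as the NFS server) polling for the brick port."""
    for _ in range(retries):
        port = portmap.query(brick)
        if port:
            log.append(f"{brick}: changing port to {port} (from 0)")
            return port
        log.append(f"{brick}: failed to get the port number for remote subvolume")
        time.sleep(interval)
    return 0


def simulate():
    """Start the brick and the client in parallel, as 'volume start' is said to do."""
    portmap = PortMap()
    log = []
    brick = threading.Thread(
        target=start_brick, args=(portmap, "client-1", 24010, 0.2)
    )
    brick.start()
    port = connect_client(portmap, "client-1", log)
    brick.join()
    return port, log


if __name__ == "__main__":
    port, log = simulate()
    for line in log:
        print(line)
```

In this sketch the client's first queries fail because the brick has not registered yet, and a later retry succeeds. That would be consistent with the report that the volume mounts and works even though an error appears in the log.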
I can confirm that this doesn't happen on the 3.4 branch. I tested with glusterfs-3.4.0qa5-1.el6rhs.x86_64.
Any thoughts on what the status of this issue should be? I guess we should close it as WORKSFORME, since we decided that 2.0.z would have only critical fixes after update 4.

Closing this as WORKSFORME. Please re-open if this is a serious issue that we should fix in the 2.0.z branch.