Bug 894556

Summary: "Failed to get the port number for remote subvolume" error when starting a volume for the first time.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Ben Turner <bturner>
Component: glusterfs
Assignee: Raghavendra Talur <rtalur>
Status: CLOSED WORKSFORME
QA Contact: Ben Turner <bturner>
Severity: medium
Priority: low
Version: 2.0
CC: amarts, grajaiya, jdarcy, rhs-bugs, sdharane, shaines, vbellur
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-01-31 06:43:37 UTC

Description Ben Turner 2013-01-12 06:21:08 UTC
Description of problem:

I am consistently seeing the following errors when starting a volume for the first time:

[2013-01-11 18:18:48.679691] I [rpc-clnt.c:1659:rpc_clnt_reconfig] 0-DISTRIBUTED-client-0: changing port to 24009 (from 0)
[2013-01-11 18:18:49.258045] E [client-handshake.c:1695:client_query_portmap_cbk] 0-DISTRIBUTED-client-1: failed to get the port number for remote subvolume
[2013-01-11 18:18:49.258087] I [client.c:2098:client_rpc_notify] 0-DISTRIBUTED-client-1: disconnected

Version-Release number of selected component (if applicable):

glusterfs-3.3.0.5rhs-40.el6rhs.x86_64

How reproducible:

I consistently see this when starting a volume for the first time after creating a cluster.

Steps to Reproduce:
1.  Create new cluster.
2.  Peer Probe
3.  Start Volume.
4.  Mount over NFS (see the command sketch below).
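
For reference, a minimal sketch of these steps as commands, assuming a hypothetical two-node setup (hostnames node1 and node2, bricks under /bricks/brick1) and the volume name DISTRIBUTED taken from the logs above:

gluster peer probe node2          # step 2, run from node1
gluster volume create DISTRIBUTED node1:/bricks/brick1 node2:/bricks/brick1
gluster volume start DISTRIBUTED  # step 3, where the errors get logged
mount -t nfs -o vers=3,tcp node1:/DISTRIBUTED /mnt/distributed   # step 4

The "failed to get the port number for remote subvolume" messages show up on the servers in the Gluster NFS server log, e.g. /var/log/glusterfs/nfs.log.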
  
Actual results:

Everything mounts and works properly, but the error messages still appear in the logs.

Expected results:

No errors.

Additional info:

Comment 2 Amar Tumballi 2013-01-15 07:29:14 UTC
This is most likely caused by a race between the startup of the brick process and the NFS server process (they are started in parallel when a 'gluster volume start' is issued): the NFS server's client translators query glusterd's portmapper for the brick port before the brick process has registered it, hence the client_query_portmap_cbk failure.

This race won't exist in RHS 2.1 (glusterfs-3.4.x and later versions). No proper workaround exists today.
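
As a minimal sketch, the race can be observed from the CLI (an observation aid, not a supported workaround; the volume name DISTRIBUTED is taken from the logs above):

# Immediately after 'gluster volume start', a brick's Port column can
# still read N/A while the brick process registers with glusterd; once
# it shows a real port, the portmap query succeeds and the errors stop.
gluster volume status DISTRIBUTED
# Re-run (or watch) until every brick reports a port:
watch -n 1 gluster volume status DISTRIBUTED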

Raghavendra Talur, please confirm the behavior on the 3.4.0 branch and update the bug.

Comment 3 Ben Turner 2013-01-16 17:00:48 UTC
I can confirm that this doesn't happen on the 3.4 branch.  I tested with glusterfs-3.4.0qa5-1.el6rhs.x86_64.

Comment 4 Amar Tumballi 2013-01-31 06:43:37 UTC
Any thoughts on what the status of this issue should be? I suggest we close it as WORKSFORME, as we decided that 2.0.z would receive only critical fixes after update 4.

I will close this as WORKSFORME; please re-open if this turns out to be a serious issue that should be included in the 2.0.z branch.