894556 – Failed to get the port number for remote subvolume error when starting a volume for the first time.

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterfs
Sub Component:
Version:	2.0
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Raghavendra Talur
QA Contact:	Ben Turner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-01-12 06:21 UTC by Ben Turner
Modified:	2013-05-09 15:14 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-01-31 06:43:37 UTC
Embargoed:
Dependent Products:


Description Ben Turner 2013-01-12 06:21:08 UTC

Description of problem:

I am consistently seeing the following errors when starting a volume for the first time:

[2013-01-11 18:18:48.679691] I [rpc-clnt.c:1659:rpc_clnt_reconfig] 0-DISTRIBUTED-client-0: changing port to 24009 (from 0)
[2013-01-11 18:18:49.258045] E [client-handshake.c:1695:client_query_portmap_cbk] 0-DISTRIBUTED-client-1: failed to get the port number for remote subvolume
[2013-01-11 18:18:49.258087] I [client.c:2098:client_rpc_notify] 0-DISTRIBUTED-client-1: disconnected
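
For anyone wanting to check their own logs for the same message, here is a minimal sketch that filters these entries. It assumes the line layout seen in the excerpt above (timestamp, level, file:line:function, translator name, message); the pattern and the helper name portmap_failures are illustrative and not part of glusterfs.

```python
import re

# Layout assumed from the excerpt above:
# [timestamp] LEVEL [file:line:function] translator: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)$"
)


def portmap_failures(lines):
    """Return (timestamp, translator) for each failed portmap query."""
    hits = []
    for raw in lines:
        m = LOG_LINE.match(raw.strip())
        # Keep only error-level entries logged by the portmap callback.
        if m and m["level"] == "E" and m["func"] == "client_query_portmap_cbk":
            hits.append((m["ts"], m["xlator"]))
    return hits


# The three log lines from this report, with the wrapped lines rejoined.
sample = [
    "[2013-01-11 18:18:48.679691] I [rpc-clnt.c:1659:rpc_clnt_reconfig] "
    "0-DISTRIBUTED-client-0: changing port to 24009 (from 0)",
    "[2013-01-11 18:18:49.258045] E [client-handshake.c:1695:client_query_portmap_cbk] "
    "0-DISTRIBUTED-client-1: failed to get the port number for remote subvolume",
    "[2013-01-11 18:18:49.258087] I [client.c:2098:client_rpc_notify] "
    "0-DISTRIBUTED-client-1: disconnected",
]

failures = portmap_failures(sample)
print(failures)
```

On the sample lines this reports a single failure, for 0-DISTRIBUTED-client-1.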

Version-Release number of selected component (if applicable):

glusterfs-3.3.0.5rhs-40.el6rhs.x86_64

How reproducible:

I consistently see this when starting a volume for the first time after creating a cluster.

Steps to Reproduce:
1.  Create new cluster.
2.  Peer Probe
3.  Start Volume.
4.  Mount over NFS.
  
Actual results:

Everything mounts and works properly, but the error message appears in the logs.

Expected results:

No errors.

Additional info:

Comment 2 Amar Tumballi 2013-01-15 07:29:14 UTC

This can happen because of a race between the startup of the brick process and the startup of the NFS server process (they are started in parallel when 'gluster volume start' is issued).

This race won't exist in RHS 2.1 (glusterfs-3.4.x and later versions). No proper workaround exists today.
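
The race can be illustrated with a toy model. This is only a sketch under stated assumptions, not glusterd's actual code: the Portmap class, the connect helper, and the brick name "brick-1" are all hypothetical, and it assumes the client simply retries after the brick has registered its port, which would be consistent with the mount working despite the logged error.

```python
# Toy model of the startup race; NOT glusterd's real implementation.
# Portmap, connect, and "brick-1" are hypothetical names for illustration.

class Portmap:
    """Maps brick names to listening ports; 0 means 'not registered yet'."""

    def __init__(self):
        self._ports = {}

    def register(self, brick, port):
        self._ports[brick] = port

    def query(self, brick):
        return self._ports.get(brick, 0)


def connect(portmap, brick, log):
    """One connection attempt by the NFS server's client side."""
    port = portmap.query(brick)
    if port == 0:
        log.append("E failed to get the port number for remote subvolume")
        return False
    log.append(f"I changing port to {port} (from 0)")
    return True


portmap = Portmap()
log = []

# The NFS server wins the race: it queries before the brick has registered.
first = connect(portmap, "brick-1", log)

# The brick process finishes starting and registers its port.
portmap.register("brick-1", 24009)

# Assumed behavior: the client retries, and this attempt succeeds.
second = connect(portmap, "brick-1", log)

print(first, second)
print("\n".join(log))
```

In this model the first attempt logs the error and the second succeeds, so the error is cosmetic as long as a retry happens after the brick is up.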

Raghavendra Talur, please confirm the behavior on the 3.4.0 branch and update the bug.

Comment 3 Ben Turner 2013-01-16 17:00:48 UTC

I can confirm that this doesn't happen on the 3.4 branch.  I tested with glusterfs-3.4.0qa5-1.el6rhs.x86_64.

Comment 4 Amar Tumballi 2013-01-31 06:43:37 UTC

Any thoughts on what the status of this issue should be? I guess we should close it as WORKSFORME, since we decided 2.0.z would have only critical fixes after update 4.

I will close this as WORKSFORME; please re-open it if this is a serious issue that we should fix in the 2.0.z branch.
