Bug 1182547 - Unable to connect to a brick when volume is recreated
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified   OS: Unspecified
Priority: medium   Severity: medium
Assigned To: Muthu Vigneshwaran
Keywords: Triaged
 
Reported: 2015-01-15 07:28 EST by Xavi Hernandez
Modified: 2017-08-10 03:45 EDT

Fixed In Version: glusterfs-3.9.0
Doc Type: Bug Fix
Last Closed: 2017-08-10 03:45:41 EDT
Type: Bug


Description Xavi Hernandez 2015-01-15 07:28:56 EST
Description of problem:

When a volume is destroyed and recreated while one of its bricks is down, the new volume is unable to connect to one of the bricks.

Version-Release number of selected component (if applicable): master


How reproducible:

Always

Steps to Reproduce:
1. glusterd
2. gluster volume create test replica 2 server:/bricks/test{1..2} force
3. gluster volume start test
4. mount -t glusterfs server:/test /gluster/test
5. kill -9 <pid of one brick>  (one way to find the pid is sketched after these steps)
6. umount /gluster/test
7. gluster volume stop test
8. gluster volume delete test
9. rm -rf /bricks/test*
10. gluster volume create test replica 2 server:/bricks/test{1..2} force
11. gluster volume start test
12. mount -t glusterfs server:/test /gluster/test
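
One way to obtain the brick pid needed in step 5 is to read the Pid column of 'gluster volume status' (a sketch, assuming the volume name and brick paths used above; adjust the path to match the brick you want to kill):

# kill one brick process (step 5) using the Pid column of the status output
kill -9 "$(gluster volume status test | awk '/\/bricks\/test2/ {print $NF}')"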

Actual results:

Everything seems OK, and even 'gluster volume status' shows all bricks as online; however, the logs report that one brick is not connected, and data written to the volume only reaches one brick.

# gluster volume status
Status of volume: test
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick server:/bricks/test_1                             49154   Y       695
Brick server:/bricks/test_2                             49155   Y       707
NFS Server on localhost                                 2049    Y       721
Self-heal Daemon on localhost                           N/A     Y       727
 
Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks

logs:

[2015-01-15 12:20:04.260781] I [rpc-clnt.c:1765:rpc_clnt_reconfig] 0-test-client-1: changing port to 49153 (from 0)
[2015-01-15 12:20:04.283114] E [socket.c:2276:socket_connect_finish] 0-test-client-1: connection to 192.168.200.61:49153 failed (Connection refused)

Note that the port shown in the log file does not correspond to the port shown in the 'gluster volume status' command. The port in the log is the port used by that brick in the previous volume.
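
To confirm the stale port, compare the port glusterd currently advertises for the brick with the port the client is still dialing (a sketch; the mount log name /var/log/glusterfs/gluster-test.log is an assumption derived from the /gluster/test mount point and the default log directory):

# ports currently advertised for the bricks
gluster volume status test | grep 'bricks/test'
# port the client is still trying to reach (stale, left over from the previous volume)
grep 'changing port' /var/log/glusterfs/gluster-test.log | tail -n 1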

Expected results:

The new volume should connect successfully to the new bricks.

Additional info:

Restarting glusterd solves the problem.
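
A minimal sketch of that workaround, assuming glusterd is managed by systemd (use 'service glusterd restart' on older init systems):

# restart the management daemon; per the note above, the client then connects to the correct brick port
systemctl restart glusterd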
Comment 1 Atin Mukherjee 2017-08-10 03:45:41 EDT
This is fixed through BZ 1334270.
