+++ This bug was initially created as a clone of Bug #1121015 +++

Description of problem:
==========================
After a successful "replace-brick commit force" on a brick in a 2 x 2 distribute-replicate volume, executed "gluster volume heal <volume_name> full". The command failed with "Commit failed on rhs-client12. Please check log file for details."

replace-brick commit force with the source and destination brick paths being the same had failed; refer to bug https://bugzilla.redhat.com/show_bug.cgi?id=1120991. Hence changed the destination brick path, after which replace-brick commit force passed. Later executed "gluster volume heal <volume_name> full".

NOTE: ping-timeout is set to 0 on all the nodes. Time on all the nodes is in sync.

Version-Release number of selected component (if applicable):
===============================================================
glusterfs 3.6.0.24 built on Jul 3 2014 11:03:38

How reproducible:
================
Executed the same command after the failure; the command execution failed once again.

Steps to Reproduce:
=======================
1. Create a 2 x 2 distribute-replicate volume. Start the volume.
2. Create a fuse mount. Create files/directories from the fuse mount.
3. Simulate disk replacement on brick1 (node1), i.e. kill the brick process and rm -rf <brick_path>.
4. Tried to execute the commands replace-brick start, status, and commit with the same src and dst path. The commands were not successful.
5. Tried to replace the brick using commit force with the same src and dst path multiple times. This failed (refer to bug https://bugzilla.redhat.com/show_bug.cgi?id=1120991).
6. Executed replace-brick commit force with different src and dst brick paths. The command execution was successful.
7. To trigger heal to the replaced brick, executed "gluster v heal <volume_name> full".

Actual results:
===================
root@mia [Jul-18-2014-12:07:43] >gluster v heal vol1 full
Commit failed on rhs-client12. Please check log file for details.

Expected results:
====================
Triggering heal shouldn't fail.
Additional info:
===================

root@mia [Jul-18-2014-11:33:05] >gluster v info

Volume Name: vol1
Type: Distributed-Replicate
Volume ID: 5cc2e193-af63-45b5-834f-9bd757cf4e84
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/device0/b1_replaced
Brick2: rhs-client12:/rhs/device0/b2
Brick3: rhs-client13:/rhs/device0/b3
Brick4: rhs-client14:/rhs/device0/b4
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

root@mia [Jul-18-2014-11:34:20] >gluster v replace-brick vol1 rhs-client11:/rhs/device0/b1_replaced rhs-client11:/rhs/device0/b1 commit force
volume replace-brick: success: replace-brick commit successful

root@mia [Jul-18-2014-11:34:48] >gluster v status vol1
Status of volume: vol1
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/device0/b1             49154   Y       11178
Brick rhs-client12:/rhs/device0/b2             49152   Y       14924
Brick rhs-client13:/rhs/device0/b3             49152   Y       22192
Brick rhs-client14:/rhs/device0/b4             49152   Y       28035
NFS Server on localhost                        2049    Y       6761
Self-heal Daemon on localhost                  N/A     Y       6766
NFS Server on rhs-client14                     2049    Y       10569
Self-heal Daemon on rhs-client14               N/A     Y       10574
NFS Server on rhs-client13                     2049    Y       4738
Self-heal Daemon on rhs-client13               N/A     Y       4743
NFS Server on rhs-client11                     2049    Y       11187
Self-heal Daemon on rhs-client11               N/A     Y       11192
NFS Server on rhs-client12                     2049    Y       29698
Self-heal Daemon on rhs-client12               N/A     Y       29703

Task Status of Volume vol1
------------------------------------------------------------------------------
There are no active volume tasks

root@mia [Jul-18-2014-11:34:56] >gluster v info vol1

Volume Name: vol1
Type: Distributed-Replicate
Volume ID: 5cc2e193-af63-45b5-834f-9bd757cf4e84
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/device0/b1
Brick2: rhs-client12:/rhs/device0/b2
Brick3: rhs-client13:/rhs/device0/b3
Brick4: rhs-client14:/rhs/device0/b4
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

root@mia [Jul-18-2014-11:38:44] >gluster v heal vol1 full
Commit failed on rhs-client12. Please check log file for details.
root@mia [Jul-18-2014-11:39:27] >gluster v status vol1 nfs
Status of volume: vol1
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
NFS Server on localhost                        2049    Y       6761
NFS Server on rhs-client14                     2049    Y       10569
NFS Server on rhs-client13                     2049    Y       4738
NFS Server on rhs-client12                     2049    Y       29698
NFS Server on rhs-client11                     2049    Y       11187

root@mia [Jul-18-2014-11:39:31] >less /var/log/glusterfs/etc-glusterfs-glusterd.vol.log

root@mia [Jul-18-2014-11:43:42] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: b43a950a-d945-47b4-8a6e-8f1c5063c9f5
State: Peer in Cluster (Connected)

Hostname: rhs-client14
Uuid: b71c223c-ab9c-4ecf-b935-128ec1f8b2f5
State: Peer in Cluster (Connected)

Hostname: rhs-client11
Uuid: 809e3daf-d43b-49e9-91e4-6eadff2875db
State: Peer in Cluster (Connected)

Hostname: rhs-client13
Uuid: ad6f2447-ed9d-4f78-ad50-503d6aec2a88
State: Peer in Cluster (Connected)

root@mia [Jul-18-2014-12:07:43] >gluster v heal vol1 full
Commit failed on rhs-client12. Please check log file for details.

--- Additional comment from on 2014-07-18 03:33:49 EDT ---

SOS Reports: http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1121015/

--- Additional comment from RHEL Product and Program Management on 2014-07-18 03:43:14 EDT ---

Since this issue was entered in Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Kaushal on 2014-07-21 06:20:33 EDT ---

The problem is that the self-heal daemons are trying to connect to the old brick's port. The brick client11:b1_replaced was replaced with client11:b1. b1_replaced had the port 49152, and b1 got a new port, 49154 (which can be observed in the volume status output).

Looking at the self-heal daemon logs, we know that the self-heal daemons were notified of the graph change and did change their graphs. But the client xlator, client-0, was attempting to connect to client11:49152 (the old brick) instead of client11:49154 (the new brick). This meant that only one AFR child was up. As heal requires 2 or more children to be up, the heal was failing.

--- Additional comment from Ravishankar N on 2014-07-21 07:26:14 EDT ---

On attaching gdb to the self-heal daemon of rhs-client12, it was observed that the shd was querying glusterd for the port number of "/rhs/device0/b1", which is the currently alive brick process, but the answer it got was 49152 instead of 49154.

--------------------------------
(gdb) b client_query_portmap
Breakpoint 2 at 0x7ff9da3314d0: file client-handshake.c, line 1528.
(gdb) c
Continuing.

Breakpoint 2, client_query_portmap (this=0xe6c4c0, rpc=0xf477f0) at client-handshake.c:1528
(gdb) n
(gdb) p remote_subvol
$2 = 0xe66c10 "/rhs/device0/b1"
(gdb) c
Continuing.

Breakpoint 1, client_query_portmap_cbk (req=0x7ff9d8f94298, iov=0x7ff9d8f942d8, count=1, myframe=0x7ff9e361ea90) at client-handshake.c:1463
(gdb) n
(gdb) p rsp.port
$3 = 49152
----------------

What needs to be figured out is why glusterd sends a stale port number for the portmap query.

--- Additional comment from Kaushal on 2014-07-22 02:11:17 EDT ---

Checked glusterd on rhs-client11. It turns out the pmap signout hadn't happened for the old bricks, which had led to the pmap registry containing stale entries. Because of the way the pmap registry is searched, glusterd is returning the stale, old port entry for the brick b1.
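To make the consequence of the stale port concrete, here is a small, self-contained C sketch. It is hypothetical code, not the actual AFR/self-heal daemon implementation; the names (child_is_up, MIN_CHILDREN_FOR_HEAL) and the check itself are only illustrations of the rule stated above, i.e. a child counts as up only if the daemon can reach the port glusterd reported, and a full heal is launched only when at least two children of the replica pair are up.

----
/* heal_children_check.c
 *
 * Illustrative sketch (hypothetical, not glusterfs source) of the failure
 * mode described above: the child whose portmap answer is stale never comes
 * up, so the "need 2 or more children up" rule rejects the full heal.
 */
#include <stdio.h>

#define AFR_CHILDREN          2   /* replica count of one AFR subvolume      */
#define MIN_CHILDREN_FOR_HEAL 2   /* rule quoted in the comment above        */

/* Hypothetical stand-in for "query glusterd's portmap, then connect".
 * A child is up only if the reported port is the port the brick listens on. */
static int child_is_up(int reported_port, int actual_brick_port)
{
    return reported_port == actual_brick_port;
}

int main(void)
{
    /* Child 0: glusterd reports the stale port 49152, but the replaced
     * brick actually listens on 49154, so the connection never succeeds.   */
    int reported[AFR_CHILDREN] = { 49152, 49152 };
    int actual[AFR_CHILDREN]   = { 49154, 49152 };

    int up = 0;
    for (int i = 0; i < AFR_CHILDREN; i++)
        up += child_is_up(reported[i], actual[i]);

    if (up < MIN_CHILDREN_FOR_HEAL)
        printf("full heal not launched: only %d of %d children up\n",
               up, AFR_CHILDREN);
    else
        printf("full heal launched\n");
    return 0;
}
----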
---- gdb output ----
(gdb) p $1.pmap
$2 = (struct pmap_registry *) 0x7f4168358010
(gdb) p *$2
$3 = {base_port = 49152, last_alloc = 49154, ports = {
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 22 times>,
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 85 times>,
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 908 times>,
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 4642 times>,
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 18340 times>,
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 20559 times>,
    {type = GF_PMAP_PORT_FOREIGN, brickname = 0x0, xprt = 0x0},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 4584 times>,
    {type = GF_PMAP_PORT_BRICKSERVER, brickname = 0x1fe4880 "/rhs/device0/b1", xprt = 0x1fd2e80},
    {type = GF_PMAP_PORT_BRICKSERVER, brickname = 0x1fdc3c0 "/rhs/device0/b1_replaced", xprt = 0x1fd4400},
    {type = GF_PMAP_PORT_BRICKSERVER, brickname = 0x1fceae0 "/rhs/device0/b1", xprt = 0x1fd4400},
    {type = GF_PMAP_PORT_FREE, brickname = 0x0, xprt = 0x0} <repeats 16381 times>}}
(gdb)
----

As can be seen in the gdb output above, the pmap.ports table contains 2 entries for '/rhs/device0/b1'. The sequence of replace-brick commands, b1 -> b1_replaced -> b1, and the apparent failure of pmap_signout to happen have led to this situation.
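The gdb dump suggests a simple model of the failure. The sketch below is illustrative C, not the actual glusterd pmap code; the search direction is an assumption chosen only because it reproduces the observed answer. It shows how a port-indexed registry that still holds un-signed-out entries can answer a brickname query with the stale port 49152 instead of 49154.

----
/* pmap_stale_lookup.c
 *
 * Simplified illustration (hypothetical, not glusterd source) of why a
 * brickname search over the registry shown in the gdb dump returns 49152:
 * two ports still map to "/rhs/device0/b1" because the old entry was never
 * signed out, and a scan in ascending port order hits the stale one first.
 */
#include <stdio.h>
#include <string.h>

enum pmap_port_type { PORT_FREE, PORT_BRICKSERVER };

struct pmap_port {
    enum pmap_port_type type;
    const char *brickname;
};

#define BASE_PORT  49152
#define LAST_ALLOC 49154
#define NPORTS     (LAST_ALLOC - BASE_PORT + 1)

/* Return the first port whose registered brickname matches, or -1. */
static int registry_search(const struct pmap_port *ports, const char *brick)
{
    for (int p = 0; p < NPORTS; p++) {
        if (ports[p].type == PORT_BRICKSERVER &&
            strcmp(ports[p].brickname, brick) == 0)
            return BASE_PORT + p;
    }
    return -1;
}

int main(void)
{
    /* State matching the gdb dump: 49152 still claims b1 (stale, never
     * signed out), 49153 claims b1_replaced, 49154 is the live b1 brick. */
    struct pmap_port ports[NPORTS] = {
        { PORT_BRICKSERVER, "/rhs/device0/b1" },          /* 49152: stale */
        { PORT_BRICKSERVER, "/rhs/device0/b1_replaced" }, /* 49153: stale */
        { PORT_BRICKSERVER, "/rhs/device0/b1" },          /* 49154: alive */
    };

    printf("portmap answer for /rhs/device0/b1: %d\n",
           registry_search(ports, "/rhs/device0/b1"));    /* prints 49152 */
    return 0;
}
----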
I suspect that this bug could also be avoided if we caught a replace-brick whose source is the same as its destination in the CLI and simply suggested "start ... force" instead.
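A rough sketch of such a CLI-side guard follows. It is purely hypothetical: the function name, the message text, and the reading of "start ... force" as 'gluster volume start <volname> force' are assumptions for illustration, not existing gluster CLI behavior.

----
/* replace_brick_guard.c
 *
 * Hypothetical CLI check: reject a replace-brick whose source and
 * destination bricks are identical and point the user at "volume start
 * ... force", which restarts the dead brick process on the same path.
 */
#include <stdio.h>
#include <string.h>

/* Returns 0 if the request may proceed, -1 if it was rejected with a hint. */
static int check_replace_brick_args(const char *volname,
                                    const char *src_brick,
                                    const char *dst_brick)
{
    if (strcmp(src_brick, dst_brick) == 0) {
        fprintf(stderr,
                "volume replace-brick: failed: source and destination brick "
                "are the same.\nTo reuse the same brick path after replacing "
                "the disk, run 'gluster volume start %s force' instead.\n",
                volname);
        return -1;
    }
    return 0;
}

int main(void)
{
    /* Example using the bricks from this report. */
    int ret = check_replace_brick_args("vol1",
                                       "rhs-client11:/rhs/device0/b1",
                                       "rhs-client11:/rhs/device0/b1");
    return ret == 0 ? 0 : 1;
}
----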
This bug is getting closed because the 3.5 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.