|Summary:||[Brick-Mux] gluster vol stop fails with Error : Request timed out.|
|Product:||Red Hat Gluster Storage||Reporter:||Kshithij Iyer <kiyer>|
|Component:||glusterd||Assignee:||Milind Changire <mchangir>|
|Status:||CLOSED ERRATA||QA Contact:||Bala Konda Reddy M <bmekala>|
|Version:||rhgs-3.4||CC:||kaushal, kiyer, mchangir, rcyriac, rhs-bugs, sanandpa, sankarshan, sheggodu, srakonde, storage-qa-internal, ubansal, vbellur|
|Target Milestone:||---||Keywords:||Regression, ZStream|
|Target Release:||RHGS 3.4.z Batch Update 1|
|Fixed In Version:||glusterfs-3.12.2-24||Doc Type:||If docs needed, set a value|
|Doc Text:||Story Points:||---|
|Last Closed:||2018-10-31 08:46:58 UTC||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
|Bug Depends On:|
Description Kshithij Iyer 2018-10-22 06:55:20 UTC
Description of problem:
On a three node cluster (N1, N2, N3), set cluster.brick-multiplex to enable and set storage.fips-mode-rchecksum to enable for three 1x3 volumes (vol1, vol2, vol3). Now start all volumes and then stop them one by one. The first volume (vol1) stops normally, but every subsequent volume stop fails with "Error : Request timed out". The cluster then goes into an inconsistent state where the volumes (vol2 and vol3) appear to be stopped from one node and running from another node.
--------------------------------------------------------------------------------
# gluster vol set vol1 storage.fips-mode-rchecksum enable
volume set: success
# gluster vol set vol2 storage.fips-mode-rchecksum enable
volume set: success
# gluster vol set vol3 storage.fips-mode-rchecksum enable
volume set: success
# for number in `seq 1 3`; do gluster vol start vol$number; done
volume start: vol1: success
volume start: vol2: success
volume start: vol3: success
# gluster vol stop vol1
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: vol1: success
# gluster vol stop vol2
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
Error : Request timed out
# gluster vol stop vol3
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
Error : Request timed out
--------------------------------------------------------------------------------

Version-Release number of selected component (if applicable): 22.214.171.124
OS Version: Red Hat Enterprise Linux 7.6

How reproducible:
Always (with and without FIPS)

Steps to Reproduce:
1. On a three node cluster, set cluster.brick-multiplex to enable:
   gluster vol set all cluster.brick-multiplex enable
2. Create 3 volumes of 1x3 type, say vol1, vol2 and vol3.
3. Enable storage.fips-mode-rchecksum for all three volumes, e.g.:
   gluster vol set vol3 storage.fips-mode-rchecksum enable
4. Start all three volumes:
   for number in `seq 1 3`; do gluster vol start vol$number; done
5. Stop all three volumes one by one.

Actual results:
Volume stop fails with "Error : Request timed out" and the cluster goes into an inconsistent state.

Expected results:
All volumes should stop successfully.

Additional info:
This bug was not found in builds 126.96.36.199 and 188.8.131.52. I had started and stopped 100+ volumes on build 184.108.40.206 and did not observe this bug.
Comment 10 Sanju 2018-10-23 06:30:32 UTC
I am able to reproduce this issue in my local setup. Steps I've followed to reproduce it:
1. Enable brick-mux.
2. Create 3 volumes.
3. Start all 3 volumes.
4. Stop the 1st volume; it succeeds.
5. Stop the 2nd volume; it times out.

While debugging, I've observed that glusterd is hanging in gd_lock_op_phase() during gd_synctask_barrier_wait(). In the glusterd log we do not see any messages regarding the 2nd volume stop transaction, since glusterd is hung before adding any logs. In the brick logs we see the below errors:

[2018-10-22 18:26:16.395583] I [glusterfsd-mgmt.c:367:glusterfs_handle_terminate] 0-glusterfs: detaching not-only child /bricks/brick1/test_1
[2018-10-22 18:26:16.395931] I [server.c:480:server_call_xlator_mem_cleanup] 0-test_1-server: Create graph janitor thread for brick /bricks/brick1/test_1
[2018-10-22 18:26:16.396138] E [xlator.c:1223:glusterfs_delete_volfile_checksum] 0-test_1-server: failed to get volfile checksum for volfile id test_1.server3.bricks-brick1-test_1.
[2018-10-22 18:26:16.396190] I [index.c:2587:notify] 0-test_1-index: Notify GF_EVENT_PARENT_DOWN for brick /bricks/brick1/test_1
[2018-10-22 18:26:16.396233] I [io-threads.c:1102:notify] 0-test_1-io-threads: Notify GF_EVENT_PARENT_DOWN for brick /bricks/brick1/test_1
[2018-10-22 18:26:16.396276] I [posix.c:7131:notify] 0-test_1-posix: Sending CHILD_DOWN for brick /bricks/brick1/test_1
[2018-10-22 18:26:16.396306] I [server.c:1652:notify] 0-test_1-server: Getting CHILD_DOWN event for brick /bricks/brick1/test_1
[2018-10-22 18:26:16.396335] I [server.c:756:server_graph_janitor_threads] 0-test_1-server: Start call fini for brick /bricks/brick1/test_1 stack
[2018-10-22 18:26:16.398957] E [rpcsvc.c:1424:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
[2018-10-22 18:26:16.399597] E [rpcsvc.c:1551:rpcsvc_program_unregister] 0-rpc-service: portmap unregistration of program failed
[2018-10-22 18:26:16.399631] E [rpcsvc.c:261:rpcsvc_program_unregister] 0-rpc-service: Program unregistration failed: GlusterFS Changelog, Num: 1885957735, Ver: 1, Port: 0
[2018-10-22 18:26:16.405603] I [barrier.c:662:fini] 0-test_1-barrier: Disabling barriering and dequeuing all the queued fops
[2018-10-22 18:26:16.405889] I [io-stats.c:4100:fini] 0-test_1-io-stats: io-stats translator unloaded

These errors are observed in the brick logs while stopping the 1st volume. I believe the further volume stop transactions are affected by the 1st volume stop transaction. I'm still working on this to figure out the RCA; I will update the BZ with my findings.

Thanks,
Sanju
Comment 11 Bala Konda Reddy M 2018-10-23 06:48:57 UTC
(In reply to Kshithij Iyer from comment #0)
> # for number in `seq 1 3`; do gluster vol start vol$number; done
> volume start: vol1: success
> volume start: vol2: success
> volume start: vol3: success

Tried with enough sleep (3 seconds) after each volume start.

> # gluster vol stop vol1
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> volume stop: vol1: success
> # gluster vol stop vol2
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> Error : Request timed out
> # gluster vol stop vol3
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> Error : Request timed out

Also slept long enough after each volume stop before attempting the next operation.
Comment 12 Kaushal 2018-10-23 07:29:06 UTC
Milind and I were able to figure out the root cause. TL;DR: a backporting error caused a stale lock to be held.

The brick processes were deadlocked on `ctx->volfile_lock`. Detaching a graph requires that this lock be held. xlator_mem_cleanup, which frees up memory held by xlators, also requires this lock. Commit fad234b5a (https://code.engineering.redhat.com/gerrit/#/c/152908/), a backport from upstream, required manual patching to xlator_mem_cleanup (the code lives in a different location than upstream), and the UNLOCK on ctx->volfile_lock was missed during the backport.

The first brick detach request would go through successfully and trigger the first xlator_mem_cleanup. The subsequent detach requests would time out waiting for ctx->volfile_lock, which was never released by xlator_mem_cleanup.
Comment 17 errata-xmlrpc 2018-10-31 08:46:58 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:3432