Description of problem:
When there is a large number of volumes in a brick-multiplexed setup, glusterd takes a long time to process a brick disconnect. Because of that, other gluster requests get queued up behind it.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Create a gluster cluster.
2. Enable brick multiplexing.
3. Create 1500 volumes (1*3).
4. Stop a single brick process.
5. Execute gluster CLI commands such as peer status.

Actual results:
peer status failed.

Expected results:
peer status should show the correct peer status.

Additional info:
RCA done by Atin:

<snippet>
When we kill a brick and glusterd gets a disconnect event, we get into a code snippet which turns out to be a very costly loop with too many iterations. At such a scale, where ~1300 volumes are configured, this thread takes minutes, which causes the other requests to queue up.

    if (is_brick_mx_enabled()) {
        cds_list_for_each_entry (brick_proc, &conf->brick_procs,
                                 brick_proc_list) {
            cds_list_for_each_entry (brickinfo_tmp, &brick_proc->bricks,
                                     brick_list) {
                if (strcmp (brickinfo_tmp->path, brickinfo->path) == 0) {
                    ret = glusterd_mark_bricks_stopped_by_proc (brick_proc);
                    if (ret) {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0,
                                GD_MSG_BRICK_STOP_FAIL,
                                "Unable to stop bricks of process to which "
                                "brick(%s) belongs", brickinfo->path);
                        goto out;
                    }
                    temp = 1;
                    break;
                }
            }
            if (temp == 1)
                break;
        }
    }
</snippet>
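The costly part is the nested scan over conf->brick_procs on every disconnect, which is O(processes * bricks). Below is a minimal, self-contained sketch of one way the lookup could be made O(1): keep a back-pointer from each brick to the multiplexed process that hosts it, so the disconnect handler can mark the affected process directly. The struct and field names here are simplified stand-ins for the real glusterd_brickinfo_t / glusterd_brick_proc_t types, and this is an illustration of the idea only, not necessarily what the patch under review implements.

<snippet>
/*
 * Sketch: O(1) brick-disconnect handling via a back-pointer from a brick
 * to its hosting (multiplexed) brick process. Simplified types, not the
 * actual glusterd code.
 */
#include <stdio.h>

typedef struct brick_proc brick_proc_t;

typedef struct brickinfo {
    char path[256];
    int started;
    brick_proc_t *proc;      /* back-pointer set when the brick is attached */
} brickinfo_t;

struct brick_proc {
    int port;
    size_t nbricks;
    brickinfo_t *bricks[16]; /* simplified: fixed array instead of a list */
};

/* Mark every brick hosted by the same multiplexed process as stopped. */
static int mark_bricks_stopped_by_proc(brick_proc_t *proc)
{
    for (size_t i = 0; i < proc->nbricks; i++)
        proc->bricks[i]->started = 0;
    return 0;
}

/*
 * Disconnect handler: with the back-pointer there is no need to iterate
 * over all brick processes and compare brick paths.
 */
static int handle_brick_disconnect(brickinfo_t *brickinfo)
{
    if (!brickinfo->proc) {
        fprintf(stderr, "brick %s has no associated process\n",
                brickinfo->path);
        return -1;
    }
    return mark_bricks_stopped_by_proc(brickinfo->proc);
}

int main(void)
{
    brick_proc_t proc = { .port = 49152, .nbricks = 2 };
    brickinfo_t b1 = { .path = "/bricks/b1", .started = 1, .proc = &proc };
    brickinfo_t b2 = { .path = "/bricks/b2", .started = 1, .proc = &proc };
    proc.bricks[0] = &b1;
    proc.bricks[1] = &b2;

    handle_brick_disconnect(&b1);
    printf("b1 started=%d, b2 started=%d\n", b1.started, b2.started);
    return 0;
}
</snippet>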
REVIEW: https://review.gluster.org/21651 (glusterd/mux: Optimize brick cleanup code) posted (#1) for review on master by mohammed rafi kc
REVIEW: https://review.gluster.org/21651 (glusterd/mux: Optimize brick disconnect handler code) posted (#6) for review on master by Atin Mukherjee
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/