Description of problem:

On a 3 node gluster cluster with 1000+ volumes `pidof glusterfsd` returns only one single pid instead of returning 4 pids even when the default value of cluster.max-bricks-per-process is set to 250.

# gluster v get all all
Option                                  Value
------                                  -----
cluster.server-quorum-ratio             51
cluster.enable-shared-storage           disable
cluster.op-version                      31304
cluster.max-op-version                  31304
cluster.brick-multiplex                 enable
cluster.max-bricks-per-process          250
cluster.daemon-log-level                INFO

# pidof glusterfsd
31746

Version-Release number of selected component (if applicable):

glusterfs-3.12.2-31

How reproducible:

1/1

Steps to Reproduce:

1. Set cluster.brick-multiplex to enable.

# gluster v get all all
Option                                  Value
------                                  -----
cluster.server-quorum-ratio             51
cluster.enable-shared-storage           disable
cluster.op-version                      31304
cluster.max-op-version                  31304
cluster.brick-multiplex                 enable
cluster.max-bricks-per-process          250
cluster.daemon-log-level                INFO

2. Create 1000+ volumes of type replica 1x3.

# gluster v info

Volume Name: vol_1
Type: Replicate
Volume ID: 1c8ed4f9-247e-4c96-afbc-18363077ee0a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp35-34.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Brick2: dhcp35-113.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Brick3: dhcp35-229.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: enable
..........
..........

3. Execute `pidof glusterfsd` command on any node. (It'll return only a single pid)

# pidof glusterfsd
31746

Actual results:

`pidof glusterfsd` returns only a single pid. (Observed across all nodes.)

# pidof glusterfsd
31746

Expected results:

`pidof glusterfsd` should return 4 pids which means each process will have 250 volumes attached to it.

# pidof glusterfsd
31746 <xxxxx> <xxxxx> <xxxxx>

Additional info:

Even after stopping and starting all volumes `pidof glusterfsd` was returning only one pid instead of returning four pids.
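
For reference, the expected count of four pids follows from simple arithmetic: with 1x3 replica volumes on three nodes, each node hosts one brick per volume, so roughly 1000 bricks per node divided by a limit of 250. The sketch below only illustrates that calculation; the function name is hypothetical, and treating a limit of 0 as "no limit" is an assumption rather than something stated in this report.

```python
import math


def expected_brick_processes(bricks_on_node: int, max_bricks_per_process: int) -> int:
    """Estimate how many glusterfsd processes a node should run under brick multiplexing.

    Hypothetical helper for illustration only; not part of gluster.
    """
    if bricks_on_node <= 0:
        return 0
    if max_bricks_per_process <= 0:
        # Assumption: a limit of 0 means every brick is multiplexed into one process.
        return 1
    # Each process holds at most max_bricks_per_process bricks, so round up.
    return math.ceil(bricks_on_node / max_bricks_per_process)


# 1000 volumes of type 1x3 on 3 nodes: one brick per volume on each node.
print(expected_brick_processes(1000, 250))
```

With more than 1000 volumes the rounded-up result would be higher than four, so the number of pids to expect depends on the exact brick count on the node.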
Root cause: In get_mux_limit_per_process(), glusterd looks up the cluster.max-bricks-per-process value in the global option dictionary but never falls back to the default from the global option table when the dictionary does not contain it. That is the usual case, because the option is absent from the dictionary unless it has been explicitly reconfigured.
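
The missing fallback can be illustrated with a small model. This is a sketch in Python, not glusterd's C code: the dictionaries and both function names are hypothetical stand-ins, and returning 0 (taken here to mean "no limit") when the lookup fails is an assumption chosen to match the single pid observed.

```python
KEY = "cluster.max-bricks-per-process"

# Hypothetical stand-in for the global option table that carries defaults.
OPTION_TABLE_DEFAULTS = {KEY: "250"}


def mux_limit_without_fallback(global_opts: dict) -> int:
    """Model of the faulty lookup: only the global option dictionary is consulted."""
    value = global_opts.get(KEY)
    if value is None:
        # Assumption: a missing value ends up as 0, i.e. no per-process limit.
        return 0
    return int(value)


def mux_limit_with_fallback(global_opts: dict) -> int:
    """Model of the intended lookup: use the table default when the key is not set."""
    value = global_opts.get(KEY, OPTION_TABLE_DEFAULTS[KEY])
    return int(value)


# The option was never reconfigured, so the global dictionary does not hold it.
unset_opts = {}
print(mux_limit_without_fallback(unset_opts), mux_limit_with_fallback(unset_opts))
```

Under these assumptions the first lookup yields no effective limit even though `gluster v get all all` reports 250, while the second honours the default and still respects an explicitly configured value.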
Upstream patch: https://review.gluster.org/#/c/glusterfs/+/21819
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3827