Bug 1656924

Summary: cluster.max-bricks-per-process 250 not working as expected
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Kshithij Iyer <kiyer>
Component: glusterd Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA QA Contact: Bala Konda Reddy M <bmekala>
Severity: high Docs Contact:
Priority: high    
Version: rhgs-3.4 CC: abhishku, apaladug, mchangir, rhs-bugs, sanandpa, sankarshan, sheggodu, srakonde, storage-qa-internal, vbellur
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.4.z Batch Update 2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.12.2-32 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1656951 Environment:
Last Closed: 2018-12-17 17:07:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1656951    
Bug Blocks:    

Description Kshithij Iyer 2018-12-06 16:38:32 UTC
Description of problem:
On a 3-node gluster cluster with 1000+ volumes, `pidof glusterfsd` returns only a single pid instead of 4 pids, even though cluster.max-bricks-per-process is at its default value of 250.

# gluster v get all all
Option                                  Value                                  
------                                  -----                                  
cluster.server-quorum-ratio             51                                     
cluster.enable-shared-storage           disable                                
cluster.op-version                      31304                                  
cluster.max-op-version                  31304                                  
cluster.brick-multiplex                 enable                                 
cluster.max-bricks-per-process          250                                    
cluster.daemon-log-level                INFO                        

# pidof glusterfsd
31746

Version-Release number of selected component (if applicable):
glusterfs-3.12.2-31

How reproducible:
1/1

Steps to Reproduce:
1. Set cluster.brick-multiplex to enable.
# gluster v get all all
Option                                  Value                                  
------                                  -----                                  
cluster.server-quorum-ratio             51                                     
cluster.enable-shared-storage           disable                                
cluster.op-version                      31304                                  
cluster.max-op-version                  31304                                  
cluster.brick-multiplex                 enable                                 
cluster.max-bricks-per-process          250                                    
cluster.daemon-log-level                INFO                                

2. Create 1000+ volumes of type replica 1x3.
# gluster v info
Volume Name: vol_1
Type: Replicate
Volume ID: 1c8ed4f9-247e-4c96-afbc-18363077ee0a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp35-34.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Brick2: dhcp35-113.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Brick3: dhcp35-229.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: enable
..........
..........

3. Execute the `pidof glusterfsd` command on any node. (It returns only a single pid.)
# pidof glusterfsd
31746

Actual results:
`pidof glusterfsd` returns only a single pid. (Observed across all nodes.)
# pidof glusterfsd
31746

Expected results:
`pidof glusterfsd` should return 4 pids, meaning each process has the bricks of 250 volumes attached to it.
# pidof glusterfsd
31746 <xxxxx> <xxxxx> <xxxxx>
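
The expected count is simple arithmetic: with brick multiplexing, each glusterfsd process should host at most cluster.max-bricks-per-process bricks, and in this setup each node holds one brick per volume. A minimal Python sketch of that calculation follows. The function names are hypothetical and are not part of gluster. Treating a limit of 0 as "no limit" is an assumption of this sketch.

```python
import math


def expected_brick_processes(bricks_on_node: int, max_bricks_per_process: int) -> int:
    """Return the minimum number of glusterfsd processes needed on one node.

    Assumption: a limit of 0 (or less) means "no limit", so every brick
    shares a single process.
    """
    if bricks_on_node <= 0:
        return 0
    if max_bricks_per_process <= 0:
        return 1
    return math.ceil(bricks_on_node / max_bricks_per_process)


def count_pids(pidof_output: str) -> int:
    """Count the pids in the whitespace-separated output of `pidof`."""
    return len(pidof_output.split())


expected = expected_brick_processes(1000, 250)
observed = count_pids("31746")  # the output reported in this bug
print(f"expected {expected} glusterfsd processes, observed {observed}")
```

With exactly 1000 bricks per node the sketch gives 4 processes. Any node holding more than 1000 bricks would need a fifth.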

Additional info:
Even after stopping and starting all volumes, `pidof glusterfsd` still returned only one pid instead of four.

Comment 2 Atin Mukherjee 2018-12-06 17:17:33 UTC
Root cause:

In get_mux_limit_per_process(), glusterd looks up the cluster.max-bricks-per-process value in the global option dictionary but never falls back to the default from the global option table when the dictionary doesn't have it, which is the usual case until the option is explicitly reconfigured.
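
The missing fallback can be illustrated with a small Python sketch. This is not the glusterd C code and not the upstream patch. The dictionary, table, and function names are hypothetical stand-ins for the global option dictionary and the global option table described above, and the sketch assumes that a missing value ends up as 0 and is treated as "no limit".

```python
# Hypothetical stand-in for the global option table (compiled-in defaults).
GLOBAL_OPTION_DEFAULTS = {"cluster.max-bricks-per-process": "250"}

KEY = "cluster.max-bricks-per-process"


def mux_limit_buggy(global_opts: dict) -> int:
    """Consult only the dictionary of explicitly set options.

    If the option was never reconfigured, the key is absent and the limit
    becomes 0, which this sketch assumes is treated as "no limit".
    """
    return int(global_opts.get(KEY, 0))


def mux_limit_fixed(global_opts: dict) -> int:
    """Fall back to the default from the option table when the key is absent."""
    return int(global_opts.get(KEY, GLOBAL_OPTION_DEFAULTS[KEY]))


never_reconfigured = {}  # the option was not set explicitly
print(mux_limit_buggy(never_reconfigured))
print(mux_limit_fixed(never_reconfigured))
```

Under these assumptions the buggy lookup yields no limit for an untouched cluster, which matches the single glusterfsd pid observed, while the lookup with a fallback yields 250.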

Comment 3 Atin Mukherjee 2018-12-06 17:49:01 UTC
Upstream patch: https://review.gluster.org/#/c/glusterfs/+/21819

Comment 8 errata-xmlrpc 2018-12-17 17:07:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3827