Bug 1656924

Summary:	cluster.max-bricks-per-process 250 not working as expected
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Kshithij Iyer <kiyer>
Component:	glusterd	Assignee:	Atin Mukherjee <amukherj>
Status:	CLOSED ERRATA	QA Contact:	Bala Konda Reddy M <bmekala>
Severity:	high	Docs Contact:
Priority:	high
Version:	rhgs-3.4	CC:	abhishku, apaladug, mchangir, rhs-bugs, sanandpa, sankarshan, sheggodu, srakonde, storage-qa-internal, vbellur
Target Milestone:	---	Keywords:	ZStream
Target Release:	RHGS 3.4.z Batch Update 2
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	glusterfs-3.12.2-32	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	1656951		Environment:
Last Closed:	2018-12-17 17:07:27 UTC	Type:	Bug
Bug Depends On:	1656951
Bug Blocks:

Description Kshithij Iyer 2018-12-06 16:38:32 UTC

Description of problem:
On a 3-node gluster cluster with 1000+ volumes, `pidof glusterfsd` returns a single pid instead of 4 pids, even though cluster.max-bricks-per-process is at its default value of 250.

# gluster v get all all
Option                                  Value                                  
------                                  -----                                  
cluster.server-quorum-ratio             51                                     
cluster.enable-shared-storage           disable                                
cluster.op-version                      31304                                  
cluster.max-op-version                  31304                                  
cluster.brick-multiplex                 enable                                 
cluster.max-bricks-per-process          250                                    
cluster.daemon-log-level                INFO                        

# pidof glusterfsd
31746

Version-Release number of selected component (if applicable):
glusterfs-3.12.2-31

How reproducible:
1/1

Steps to Reproduce:
1. Set cluster.brick-multiplex to enable.
# gluster v get all all
Option                                  Value                                  
------                                  -----                                  
cluster.server-quorum-ratio             51                                     
cluster.enable-shared-storage           disable                                
cluster.op-version                      31304                                  
cluster.max-op-version                  31304                                  
cluster.brick-multiplex                 enable                                 
cluster.max-bricks-per-process          250                                    
cluster.daemon-log-level                INFO                                

2. Create 1000+ volumes of type replica 1x3.
# gluster v info
Volume Name: vol_1
Type: Replicate
Volume ID: 1c8ed4f9-247e-4c96-afbc-18363077ee0a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp35-34.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Brick2: dhcp35-113.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Brick3: dhcp35-229.lab.eng.blr.redhat.com:/bricks/brick1/vol_1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: enable
..........
..........

3. Execute `pidof glusterfsd` on any node. It returns only a single pid.
# pidof glusterfsd
31746

Actual results:
`pidof glusterfsd` returns only a single pid. (Observed across all nodes.)
# pidof glusterfsd
31746

Expected results:
`pidof glusterfsd` should return 4 pids, meaning each process has at most 250 bricks (one per volume on the node) attached to it.
# pidof glusterfsd
31746 <xxxxx> <xxxxx> <xxxxx>
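
The expected count follows from simple arithmetic. The sketch below assumes each node hosts exactly one brick per volume (as with the 1x3 replica volumes on this 3-node cluster) and that bricks are packed into processes up to the configured limit. `expected_brick_processes` is a hypothetical helper name used for illustration, not part of gluster.

```python
def expected_brick_processes(bricks_on_node: int, max_bricks_per_process: int) -> int:
    """Return how many glusterfsd processes one node needs (hypothetical helper)."""
    if max_bricks_per_process <= 0:
        # This sketch only models a positive limit.
        raise ValueError("max_bricks_per_process must be positive")
    # Ceiling division using integers only.
    return -(-bricks_on_node // max_bricks_per_process)


if __name__ == "__main__":
    print(expected_brick_processes(1000, 250))
    print(expected_brick_processes(1001, 250))
```

With exactly 1000 volumes the result is 4; any volume beyond 1000 would need a fifth process.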

Additional info:
Even after stopping and starting all volumes, `pidof glusterfsd` still returned only one pid instead of four.

Comment 2 Atin Mukherjee 2018-12-06 17:17:33 UTC

Root cause:

In get_mux_limit_per_process(), glusterd looks up the cluster.max-bricks-per-process value in the global option dictionary but never falls back to the default from the global option table when the dictionary doesn't have it, which is the usual case until the option is explicitly reconfigured.
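
The lookup pattern described above can be illustrated with a minimal Python sketch. This is not the glusterd C code. The dictionaries and function names are hypothetical stand-ins, and treating a missing value as "no limit" (0) is an assumption chosen to match the observed single process.

```python
KEY = "cluster.max-bricks-per-process"

# Hypothetical stand-in for the global option table holding defaults.
GLOBAL_OPTION_DEFAULTS = {KEY: "250"}

# Assumption: a limit of 0 is interpreted as "no limit".
NO_LIMIT = 0


def mux_limit_without_fallback(global_opts: dict) -> int:
    """Mimic the reported behaviour: only the reconfigured options are consulted."""
    value = global_opts.get(KEY)
    if value is None:
        return NO_LIMIT  # default is never applied
    return int(value)


def mux_limit_with_fallback(global_opts: dict) -> int:
    """Fall back to the default table when the option was never reconfigured."""
    value = global_opts.get(KEY, GLOBAL_OPTION_DEFAULTS[KEY])
    return int(value)


if __name__ == "__main__":
    never_reconfigured = {}  # the usual state of the option dictionary
    print(mux_limit_without_fallback(never_reconfigured))
    print(mux_limit_with_fallback(never_reconfigured))
```

In this sketch both functions agree once the option has been set explicitly; they differ only while the dictionary lacks the key.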

Comment 3 Atin Mukherjee 2018-12-06 17:49:01 UTC

Upstream patch: https://review.gluster.org/#/c/glusterfs/+/21819

Comment 8 errata-xmlrpc 2018-12-17 17:07:27 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3827