Bug 1537423

Summary: glusterd consumes 100% CPU utilization during reboot with quota and snapshot
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: hari gowtham <hgowtham>
Component: quota
Assignee: hari gowtham <hgowtham>
Status: CLOSED WONTFIX
QA Contact: Rahul Hinduja <rhinduja>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.3
CC: amukherj, jstrunk, rhinduja, rhs-bugs, sheggodu, srangana, storage-qa-internal
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: RHGS-3.4.0-to-be-deferred
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1558831 (view as bug list)
Environment:
Last Closed: 2018-11-19 09:12:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1558831

Description hari gowtham 2018-01-23 07:40:05 UTC
Description of problem:
On an OSIO Gluster volume with a large number of quota limits set, and with snapshots created and deleted, the reboot takes a long time and glusterd's CPU usage climbs to 100%.

Version-Release number of selected component (if applicable):
3.3.1

How reproducible:
1/1

Steps to Reproduce:
1. Create a replica 3 volume.
2. Enable the snapshot scheduler.
3. Create 5k quota limits.
4. Create 7 snapshots.
5. Take a node down gracefully.
6. Delete 1 snapshot while the node is down.
7. Bring the node back up.
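The steps above can be sketched with the stock gluster CLI. This is a hedged reproducer outline, not taken from the report: the volume name, hostnames, brick paths, mount point, and limit size are illustrative assumptions, and it must be run against a real 3-node cluster.

```shell
# Illustrative reproducer sketch -- names and paths are assumptions.
gluster volume create testvol replica 3 \
    host1:/bricks/b1 host2:/bricks/b1 host3:/bricks/b1 force
gluster volume start testvol
snap_scheduler.py init && snap_scheduler.py enable    # step 2
gluster volume quota testvol enable
mount -t glusterfs host1:/testvol /mnt/testvol
for i in $(seq 1 5000); do                            # step 3: 5k limits
    mkdir -p /mnt/testvol/dir$i
    gluster volume quota testvol limit-usage /dir$i 10MB
done
for i in $(seq 1 7); do                               # step 4: 7 snapshots
    gluster snapshot create snap$i testvol no-timestamp
done
systemctl stop glusterd        # step 5: on one node (graceful)
echo y | gluster snapshot delete snap1   # step 6: from a surviving node
systemctl start glusterd       # step 7: watch glusterd CPU on handshake
```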

Actual results:
The reboot takes an unusually long time, with glusterd at 100% CPU.

Expected results:
The reboot should complete in a reasonable amount of time.

Additional info:
After deleting all snapshots, the reboot time was normal again.

Comment 3 hari gowtham 2018-01-23 08:00:15 UTC
#5  0x00007f2b03c44dbf in glusterd_vol_add_quota_conf_to_dict () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so
#6  0x00007f2b03d07b63 in glusterd_add_snap_to_dict () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so
#7  0x00007f2b03d07fa1 in glusterd_add_snapshots_to_export_dict () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so
#8  0x00007f2b03c66852 in glusterd_rpc_friend_add () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so


The function glusterd_vol_add_quota_conf_to_dict() is called once per snapshot/volume. It reads one entry at a time and populates the dictionary, so the total number of dictionary populate operations is:
==> number of limits * number of snapshots/volumes.

It is not clear why the file must be read per snapshot, as there is just one quota.conf file per volume. At first glance we appear to be redundantly populating the same values in the dictionary multiple times.

It is also not clear why quota.conf must be populated one entry at a time. We could instead read larger chunks of the file and write them to quota.conf on the friend node, avoiding multiple read/write calls on both nodes.
Looking further into this.
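The cost model above can be sketched numerically. This is a hedged illustration, not GlusterD code: the 17-byte on-disk entry size and the 128 KiB chunk size are assumptions chosen for the example.

```python
# Illustrative cost model (not GlusterD code): why per-entry population
# across snapshots scales as limits * snapshots, while chunked reads do not.

def per_entry_ops(num_limits, num_snapshots):
    """Each snapshot re-reads quota.conf one entry at a time."""
    return num_limits * num_snapshots

def chunked_ops(num_limits, num_snapshots, entry_size=17, chunk_size=128 * 1024):
    """Reading the whole file in large chunks needs far fewer read calls."""
    file_size = num_limits * entry_size
    reads_per_snapshot = -(-file_size // chunk_size)  # ceiling division
    return reads_per_snapshot * num_snapshots

# With 5k limits and 7 snapshots, as in the reproducer above:
print(per_entry_ops(5000, 7))   # 35000 per-entry dictionary populates
print(chunked_ops(5000, 7))     # 7 chunked reads (one 85 KB read per snapshot)
```

With the reproducer's numbers the per-entry approach performs 35,000 populate operations during the friend handshake, versus 7 bulk reads if the file were transferred in large chunks.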

Comment 14 hari gowtham 2018-11-19 09:12:25 UTC
Hi,

I'm closing this bug as we are not actively working on Quota.

-Hari.