Description of problem:
On an OSIO gluster volume with a large number of quota limits set and snapshots created and deleted, rebooting a node takes a very long time and CPU usage goes to 100%.

Version-Release number of selected component (if applicable):
3.3.1

How reproducible:
1/1

Steps to Reproduce:
1. Create a replica 3 volume.
2. Enable the snap scheduler.
3. Create 5k quota limits.
4. Create 7 snapshots.
5. Take a node down gracefully.
6. Delete 1 snapshot while the node is down.
7. Bring the node back up.

Actual results:
The reboot takes a very long time.

Expected results:
The reboot should take much less time.

Additional info:
After deleting all snapshots, the reboot was fine.
#5 0x00007f2b03c44dbf in glusterd_vol_add_quota_conf_to_dict () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so
#6 0x00007f2b03d07b63 in glusterd_add_snap_to_dict () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so
#7 0x00007f2b03d07fa1 in glusterd_add_snapshots_to_export_dict () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so
#8 0x00007f2b03c66852 in glusterd_rpc_friend_add () from /usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so

The function glusterd_vol_add_quota_conf_to_dict() is called once per snapshot/volume. It reads one quota.conf entry at a time and populates the dictionary, so the total number of dictionary populate operations is:

    number of limits * number of snapshots/volumes

With the reproduction above (5k limits and 7 snapshots), that is roughly 5,000 x 7 = 35,000 dictionary operations during a single friend-add exchange. It is not clear why the file must be read once per snapshot, as there is just one quota.conf file per volume; at first look it seems we are redundantly populating the same values in the dictionary multiple times. It is also not clear why quota.conf must be populated one entry at a time. We could instead read larger chunks of the file and dump them into quota.conf on the friend node, avoiding multiple read/write calls on both nodes. Looking further into this.
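To make the cost difference concrete, below is a minimal standalone C sketch, not the actual glusterd code, contrasting the two I/O patterns discussed above: (a) one small read per quota entry, as in the per-entry dictionary population, versus (b) a single bulk read of the whole file handed over as one blob. The 16-byte record size, the "quota.conf" path, and the helper names are assumptions made purely for illustration.

/*
 * Illustrative sketch only -- not the glusterd implementation.
 * (a) read_per_entry: one read per quota limit, repeated per snapshot.
 * (b) read_bulk: one read for the whole file, one value to populate.
 * ENTRY_SIZE and the file layout are assumed for the demo.
 */
#include <stdio.h>
#include <stdlib.h>

#define ENTRY_SIZE 16   /* assumed size of one quota.conf record */

/* (a) per-entry: one read (and, in glusterd, one dict set) per limit */
static long read_per_entry(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return -1;

    unsigned char entry[ENTRY_SIZE];
    long ops = 0;
    while (fread(entry, ENTRY_SIZE, 1, fp) == 1)
        ops++;                  /* each iteration is a separate populate op */

    fclose(fp);
    return ops;
}

/* (b) bulk: read the whole file once and hand it over as a single value */
static long read_bulk(const char *path, unsigned char **out, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return -1;

    fseek(fp, 0, SEEK_END);
    long len = ftell(fp);
    rewind(fp);

    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf || fread(buf, 1, (size_t)len, fp) != (size_t)len) {
        free(buf);
        fclose(fp);
        return -1;
    }
    fclose(fp);

    *out = buf;                 /* one allocation, one key in the exported dict */
    *out_len = (size_t)len;
    return 1;                   /* a single read/populate operation */
}

int main(void)
{
    const char *path = "quota.conf";   /* hypothetical path for the demo */
    int snapshots = 7;                 /* count taken from the repro steps */

    long per_entry_ops = read_per_entry(path);

    unsigned char *blob = NULL;
    size_t blob_len = 0;
    long bulk_ops = read_bulk(path, &blob, &blob_len);

    if (per_entry_ops >= 0 && bulk_ops >= 0)
        printf("per-entry: %ld ops x %d snapshots = %ld; bulk: %ld op x %d snapshots = %ld\n",
               per_entry_ops, snapshots, per_entry_ops * snapshots,
               bulk_ops, snapshots, bulk_ops * snapshots);

    free(blob);
    return 0;
}

With 5k limits, approach (a) performs tens of thousands of small reads and populate calls per friend-add, while approach (b) stays proportional to the number of snapshots only; this is the shape of the optimization suggested above, not a claim about how the fix was implemented.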
Hi, I'm closing this bug as we are not actively working on Quota. -Hari.