Description of problem:
High CPU utilisation when one of the bricks is killed in an EC volume.

Version-Release number of selected component (if applicable):
3.8.4-52

How reproducible:
Tested only once

Steps to Reproduce:
1. Create an EC volume 24*(4+2).
2. Start the volume and run a CCTV workload from 4 Windows clients, e.g. Milestone's X-protect.
3. Kill one brick from the volume.
4. Monitor the CPU utilisation with the top command.

Actual results:
1) 500%-600% CPU utilisation was seen.
Second observation: when all the bricks are up, glusterfsd takes 150% CPU utilisation.

Expected results:
This amount of CPU utilisation shouldn't be observed.

Additional info:
The software populates the volume with medium-sized (16MB) files from 4 Windows clients.

Volume options:
performance.parallel-readdir on
performance.readdir-ahead on
performance.quick-read off
performance.io-cache off
nfs.disable on
transport.address-family inet
features.cache-invalidation on
features.cache-invalidation-timeout 600
performance.stat-prefetch on
performance.cache-invalidation on
performance.md-cache-timeout 600
network.inode-lru-limit 200000
performance.nl-cache on
performance.nl-cache-timeout 600
cluster.lookup-optimize on
server.event-threads 4
client.event-threads 6
performance.cache-samba-metadata on
performance.client-io-threads on
cluster.readdir-optimize on
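For anyone repeating step 4, the per-process CPU figure can be pulled out of top's batch output instead of being read off the screen. This is only a minimal sketch: the function name cpu_by_command is hypothetical, and it assumes top's default column layout (PID in column 1, %CPU in column 9, COMMAND in column 12), which matches the header shown in the transcripts on this bug.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `top -b` output on stdin and print "PID %CPU"
# for every process whose COMMAND column exactly equals the given name.
# Assumes the default top layout: $1 = PID, $9 = %CPU, $12 = COMMAND.
cpu_by_command() {
    awk -v name="$1" '$12 == name { print $1, $9 }'
}

# Example usage (not run here, since it depends on the live system):
#   top -n 1 -b | cpu_by_command glusterfsd
```

Matching on the exact COMMAND value keeps glusterfsd (brick processes) separate from glusterfs (client and self-heal daemon processes), which a plain substring grep would mix together.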
Using the CPU control script, I was able to limit the CPU consumption of shd. Note, however, that this is a workaround and not an actual fix, as already detailed above. Moving the BZ to VERIFIED.

Test version: 3.12.2-11

[root@dhcp35-97 scripts]# 30
-bash: 30: command not found
[root@dhcp35-97 scripts]# top -n 1 -b|egrep "glusterfs$|RES"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14882 root      20   0 3089444 157524   3712 S 288.2  2.0   7:12.13 glusterfs
14872 root      20   0  538516   9612   3592 S   0.0  0.1   0:00.17 glusterfs
[root@dhcp35-97 scripts]# top -n 1 -b|egrep "glusterfs$|RES"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14882 root      20   0 3089436 153312   3712 S 244.4  1.9   7:25.34 glusterfs
14872 root      20   0  538516   9612   3592 S   0.0  0.1   0:00.17 glusterfs
[root@dhcp35-97 scripts]# ./control-cpu-load.sh
Enter gluster daemon pid for which you want to control CPU.
^C
[root@dhcp35-97 scripts]# ./control-cpu-load.sh
Enter gluster daemon pid for which you want to control CPU.
Entered daemon_pid is not numeric so Rerun the script.
[root@dhcp35-97 scripts]# ./control-cpu-load.sh
Enter gluster daemon pid for which you want to control CPU.
14882
If you want to continue the script to attach 14882 with new cgroup_gluster_14882 cgroup Press (y/n)?
invalid
[root@dhcp35-97 scripts]# ./control-cpu-load.sh
Enter gluster daemon pid for which you want to control CPU.
Entered daemon_pid is not numeric so Rerun the script.
[root@dhcp35-97 scripts]# ./control-cpu-load.sh
Enter gluster daemon pid for which you want to control CPU.
14882
If you want to continue the script to attach 14882 with new cgroup_gluster_14882 cgroup Press (y/n)?y
yes
Creating child cgroup directory 'cgroup_gluster_14882 cgroup' for glusterd.service.
Enter quota value in range [10,100]:
50
Entered quota value is 50
Setting 50000 to cpu.cfs_quota_us for gluster_cgroup.
Tasks are attached successfully specific to 14882 to cgroup_gluster_14882.
[root@dhcp35-97 scripts]# top -n 1 -b|egrep "glusterfs$|RES"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14882 root      20   0 3089488 157564   3712 S  58.8  2.0   8:27.61 glusterfs
14872 root      20   0  538516   9612   3592 S   0.0  0.1   0:00.17 glusterfs
[root@dhcp35-97 scripts]# 14882
-bash: 14882: command not found
[root@dhcp35-97 scripts]# top -n 1 -b|egrep "glusterfs$|RES"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14882 root      20   0 3089456 159556   3712 S  50.0  2.0   9:01.75 glusterfs
14872 root      20   0  538516   9612   3592 S   0.0  0.1   0:00.18 glusterfs
[root@dhcp35-97 scripts]# 14882
-bash: 14882: command not found
[root@dhcp35-97 scripts]# 14882
-bash: 14882: command not found
[root@dhcp35-97 scripts]# top -n 1 -b|egrep "glusterfs$|RES"
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14882 root      20   0 3089480 153672   3712 S  33.3  1.9   9:04.91 glusterfs
14872 root      20   0  538516   9612   3592 S   0.0  0.1   0:00.18 glusterfs
[root@dhcp35-97 scripts]# pwd
/usr/share/glusterfs/scripts
[root@dhcp35-97 scripts]# ^C
[root@dhcp35-97 scripts]#
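For reference, the transcript shows a quota of 50 being written to cpu.cfs_quota_us as 50000. That matches quota percent x period / 100 if the CFS period is the kernel default of 100000 us. The sketch below only illustrates that arithmetic and the [10,100] range check seen in the prompt; it is not the code of control-cpu-load.sh. The function name quota_us and the default period are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from control-cpu-load.sh.
# Converts a quota percentage into a cpu.cfs_quota_us value.
# Assumption: cpu.cfs_period_us is the kernel default of 100000 us
# unless a different period is passed as the second argument.
quota_us() {
    local percent=$1
    local period_us=${2:-100000}

    # Reject empty or non-numeric input.
    case $percent in
        ''|*[!0-9]*) echo "quota must be numeric" >&2; return 1 ;;
    esac

    # The script's prompt advertises the range [10,100].
    if [ "$percent" -lt 10 ] || [ "$percent" -gt 100 ]; then
        echo "quota must be in [10,100]" >&2
        return 1
    fi

    echo $(( percent * period_us / 100 ))
}

quota_us 50   # prints 50000, the value seen in the transcript
```

With this mapping a quota of 50 allows the cgroup half of one CPU's time per period, which is consistent with the self-heal daemon dropping from roughly 240-290% to around 50% in the top output above.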
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607