I'm seeing high CPU utilization with glusterfsd; CPU utilization goes beyond 700%. Please let me know if it's somehow misconfigured on my end.

Volume Name: dist-vol
Type: Distribute
Volume ID: 330157c4-49a7-4982-9013-86b5f2e65fc6
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: storage-1:/tank/gshare
Brick2: storage-2:/tank/gshare
Brick3: storage-3:/tank/gshare
Options Reconfigured:
cluster.min-free-disk: 1%
diagnostics.brick-log-level: ERROR
diagnostics.client-log-level: ERROR
performance.cache-size: 2GB
server.allow-insecure: on
performance.io-thread-count: 64
performance.quick-read: on
performance.io-cache: on
performance.write-behind: on
performance.read-ahead: on
performance.write-behind-window-size: 1GB
performance.cache-max-file-size: 2GB
performance.stat-prefetch: on
performance.flush-behind: on
cluster.data-self-heal: off
cluster.entry-self-heal: off
cluster.metadata-self-heal: off
cluster.lookup-optimize: on
server.outstanding-rpc-limit: 128
server.event-threads: 3
client.event-threads: 3
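To help narrow down where the CPU time is going, a sketch of the kind of diagnostics that could be gathered on an affected node (assuming the volume name `dist-vol` from the output above; the 60-second sample window is arbitrary):

```shell
# Show per-thread CPU usage of the brick processes (-H lists threads),
# which reveals whether io-threads, event threads, or something else is hot
top -b -n 1 -H -p "$(pgrep -d, glusterfsd)" | head -n 40

# Enable per-FOP latency profiling on the volume, let the workload run,
# then dump the accumulated statistics
gluster volume profile dist-vol start
sleep 60
gluster volume profile dist-vol info
```

The `profile info` output breaks down call counts and latencies per file operation, which is useful for a small-file-heavy workload like this one.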
Are you using Red Hat Gluster Storage or the community version of GlusterFS? This BZ has been filed against the former. If you meant the community version, please change the product to GlusterFS. A few more details would also help us debug further: the exact version, the workload type, and the client and brick logs. The information provided so far looks inadequate.
Thanks Atin, I have corrected the info. The interesting thing is that even with the diagnostics settings at ERROR level, no logs are being recorded under /var/log/glusterfs/bricks. The load is heavy SMB usage: we have thousands of small (KB-sized) and large (several-GB) files, and the cluster's high load and high glusterfsd CPU utilization tend to shift from node to node. Is there a known bug for this type of behavior? I wasn't able to find an answer online, nor was anyone on IRC able to shed light on it; they encouraged filing a bug for this behavior.
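Two things worth checking given the missing brick logs: the effective option values and a statedump of the brick processes. A sketch, again assuming the volume name `dist-vol`:

```shell
# Confirm the log-level options actually in effect on the volume
# (reconfigured values can differ from what was intended)
gluster volume get dist-vol diagnostics.brick-log-level
gluster volume get dist-vol diagnostics.client-log-level

# Capture a statedump of the brick processes for offline analysis;
# the dump files land under /var/run/gluster by default
gluster volume statedump dist-vol
```

The statedump includes per-translator memory and call-stack information, which is often requested when debugging runaway glusterfsd CPU usage.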
I am moving this BZ to the gluster-smb component since, as per comment 3, the workload is on SMB.
top - 10:50:42 up 3 days, 4:06, 1 user, load average: 51.29, 47.47, 44.14
Tasks: 645 total, 1 running, 644 sleeping, 0 stopped, 0 zombie
%Cpu0 : 50.0 us, 45.3 sy, 0.0 ni, 3.4 id, 0.3 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu1 : 45.9 us, 49.3 sy, 0.0 ni, 4.4 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 46.1 us, 48.5 sy, 0.0 ni, 3.7 id, 1.0 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu3 : 48.1 us, 47.8 sy, 0.0 ni, 3.0 id, 0.3 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu4 : 31.5 us, 60.6 sy, 0.0 ni, 4.1 id, 1.7 wa, 0.0 hi, 2.1 si, 0.0 st
%Cpu5 : 38.0 us, 55.8 sy, 0.0 ni, 2.1 id, 3.1 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu6 : 40.2 us, 52.0 sy, 0.0 ni, 4.7 id, 1.0 wa, 0.0 hi, 2.0 si, 0.0 st
%Cpu7 : 39.0 us, 52.9 sy, 0.0 ni, 3.4 id, 2.4 wa, 0.0 hi, 2.4 si, 0.0 st
KiB Mem:  13200491+total, 13131869+used,   686212 free,    11228 buffers
KiB Swap: 62465916 total,   256332 used, 62209584 free. 11889336+cached Mem

  PID USER  PR NI    VIRT    RES  SHR S  %CPU %MEM   TIME+   COMMAND
 6267 root  20  0 8935784 6.642g 3972 S 676.8  5.3 4243:49   glusterfsd
 7156 root  20  0 2224496 1.547g 3896 S  23.9  1.2 1490:24   glusterfs
This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.