Bug 1852736

Summary: High CPU usage on EC volume after inservice upgrade of one node in 3 node cluster
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Leela Venkaiah Gangavarapu <lgangava>
Component: disperse Assignee: Xavi Hernandez <jahernan>
Status: CLOSED ERRATA QA Contact: Manisha Saini <msaini>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.5 CC: aspandey, dwalveka, jahernan, pprakash, puebele, rhs-bugs, rkothiya, sajmoham, sheggodu, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.5.z Batch Update 3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-6.0-41 Doc Type: Enhancement
Doc Text:
Previously, Gluster kept retrying heals on files that had failed to heal and remained unhealed, which consumed a significant amount of CPU. With this enhancement, Gluster better detects when continuous healing is necessary and reduces CPU utilization when pending heals cannot be completed immediately.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-12-17 04:51:50 UTC Type: Bug
Bug Depends On: 1853594    
Bug Blocks:    
Attachments:
Description Flags
CPU usage on server and client none

Description Leela Venkaiah Gangavarapu 2020-07-01 08:06:20 UTC
Created attachment 1699439
CPU usage on server and client

Description of problem:
High CPU usage is observed after an in-service upgrade of one node in a 3-node cluster.

Version-Release number of selected component (if applicable):
glusterfs-server-6.0-37.1.el7rhgs.x86_64

How reproducible:
Consistent

Steps to Reproduce:
1. Set up a cluster with 3 nodes hosting a 4x(4+2) distributed-dispersed volume and a 3x3 replicated volume
2. Upgrade one of the nodes when the distributed-dispersed volume is ~5% full and the replicated volume is ~35% full
3. Monitor CPU during the heal process and after it completes
4. Observe a sudden spike in CPU that continues even after the heal is complete
5. Observe CPU spikes for the gluster processes on the servers
6. Command used on client and server: "$ top -c -p $(pgrep -d',' -f gluster)"
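
The monitoring in step 6 can also be run non-interactively. The following is a minimal sketch, not the reporter's actual procedure: the function names `sum_top_cpu` and `watch_gluster_cpu` are hypothetical, the parser assumes procps `top` batch output with a `%CPU` column, and `top -p` accepts at most 20 PIDs.

```shell
#!/usr/bin/env bash

# Sum the %CPU column of the LAST iteration found in `top -b` output (stdin).
# The column position is taken from the header line, not hard-coded.
sum_top_cpu() {
    awk '
        $1 == "PID" {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%CPU") col = i
            total = 0; in_rows = 1; next      # new iteration: restart the sum
        }
        NF == 0 { in_rows = 0; next }         # blank line ends the process rows
        in_rows && col && NF >= col { total += $col }
        END { printf "%.1f\n", total }
    '
}

# Hypothetical helper: take two samples and report the total from the second,
# so the figure reflects the interval between the samples.
watch_gluster_cpu() {
    local interval=${1:-5} pids
    pids=$(pgrep -d',' -f gluster) || { echo "no gluster processes found" >&2; return 1; }
    LC_ALL=C top -b -n 2 -d "$interval" -c -p "$pids" | sum_top_cpu
}
```

Calling `watch_gluster_cpu 5` on a server or client would print a single combined %CPU figure for all matching gluster processes over a 5-second interval.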

Actual results:
CPU usage spikes to ~700-800%

Expected results:
CPU usage should be moderate (<100%)

Additional info:
- No heals are pending
- The other two nodes are on version `glusterfs-6.0-37.el7rhgs.x86_64`
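
Both observations above can be checked from the command line. A hedged sketch: `gluster volume heal <VOLNAME> info` and `rpm -q` are existing commands, but the helper names and the volume name `testvol` are hypothetical, and the parser assumes each brick section of the heal output contains a `Number of entries: N` line.

```shell
#!/usr/bin/env bash

# Add up the per-brick "Number of entries: N" lines from heal-info output (stdin).
# A brick reported as "Number of entries: -" (not reachable) contributes 0.
count_pending_heals() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Hypothetical helper: total pending heal entries for one volume.
pending_heals() {
    gluster volume heal "$1" info | count_pending_heals
}

# Hypothetical helper: installed server package version on the local node.
gluster_server_version() {
    rpm -q glusterfs-server
}
```

For example, `pending_heals testvol` would be expected to print `0` when no heals are pending, and running `gluster_server_version` on each node would show which nodes are still on the older package.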

Comment 29 errata-xmlrpc 2020-12-17 04:51:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603