Bug 1852736

Summary:

High CPU usage on EC volume after inservice upgrade of one node in 3 node cluster

Product:

[Red Hat Storage] Red Hat Gluster Storage

Reporter:

Leela Venkaiah Gangavarapu <lgangava>

Component:

disperse

Assignee:

Xavi Hernandez <jahernan>

Status:

CLOSED ERRATA

QA Contact:

Manisha Saini <msaini>

Severity:

high

Docs Contact:

Priority:

unspecified

Version:

rhgs-3.5

CC:

aspandey, dwalveka, jahernan, pprakash, puebele, rhs-bugs, rkothiya, sajmoham, sheggodu, storage-qa-internal

Target Milestone:

---

Keywords:

ZStream

Target Release:

RHGS 3.5.z Batch Update 3

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

glusterfs-6.0-41

Doc Type:

Enhancement

Doc Text:

Previously, Gluster kept retrying heals on files that had failed to heal and remained unhealed, which consumed a significant amount of CPU. With this enhancement, Gluster detects more accurately when continuous healing is necessary and reduces CPU utilization when pending heals cannot be completed immediately.

Story Points:

---

Clone Of:

Environment:

Last Closed:

2020-12-17 04:51:50 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

1853594

Bug Blocks:

Attachments:

Description	Flags
CPU usage on server and client	none

Description Leela Venkaiah Gangavarapu 2020-07-01 08:06:20 UTC

Created attachment 1699439
CPU usage on server and client

Description of problem:
High CPU usage is observed after an in-service upgrade of one node in a 3-node cluster.

Version-Release number of selected component (if applicable):
glusterfs-server-6.0-37.1.el7rhgs.x86_64

How reproducible:
Consistent

Steps to Reproduce:
1. Set up a 3-node cluster hosting a 4X(4+2) distributed-dispersed volume and a 3X3 replicated volume
2. Upgrade one of the nodes while the distributed-dispersed volume is ~5% full and the replicated volume is ~35% full
3. Monitor CPU usage during the heal process and after it completes
4. Note the sudden spike in CPU usage, which continues even after the heal is complete
5. Observe the CPU spikes for the gluster processes on the servers
6. Command used on client and server: "$ top -c -p $(pgrep -d',' -f gluster)"
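
For unattended monitoring, the interactive `top` command in the steps above could be complemented by a small script. The following is only a sketch, not what was run for this report: the function names `over_threshold` and `gluster_cpu` are hypothetical, and it assumes procps-ng `ps`, whose %CPU is averaged over the process lifetime rather than sampled live as in `top`.

```shell
# Keep only the "PID %CPU" lines on stdin whose %CPU exceeds the limit.
# The default limit of 100 mirrors the "<100%" expectation in this report.
over_threshold() {
    awk -v limit="${1:-100}" '($2 + 0) > (limit + 0) { print $1, $2 }'
}

# Print PID and %CPU for every process matched by `pgrep -f gluster`,
# the same match used by the `top` command in the steps above.
# Assumption: procps-ng `ps`; its %CPU is a lifetime average, so it
# reacts more slowly than the live figure shown by `top`.
gluster_cpu() {
    pids=$(pgrep -d',' -f gluster) || return 1
    ps -o pid=,pcpu= -p "$pids"
}
```

Running `gluster_cpu | over_threshold 100` on a server or client would then list only the gluster processes above 100% CPU, which is easier to log over time than an interactive `top` session.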

Actual results:
CPU Usage spikes reaching ~700-800%

Expected results:
CPU usage should be moderate (<100%)

Additional info:
- No heals are pending
- The other two nodes are on version `glusterfs-6.0-37.el7rhgs.x86_64`
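
One way the "no heals are pending" observation could be checked is by summing the per-brick entry counts reported by `gluster volume heal <VOLNAME> info`. This is a sketch under assumptions: `count_pending_heals` is a hypothetical helper name, `<VOLNAME>` is a placeholder, and the parsing relies on the `Number of entries: N` lines that the command prints for each brick.

```shell
# Sum the "Number of entries: N" lines from `gluster volume heal ... info`
# output read on stdin. A non-numeric count (for example "-" for a brick
# that is not connected) is treated as 0 by awk.
count_pending_heals() {
    awk -F': ' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}
```

For example, `gluster volume heal <VOLNAME> info | count_pending_heals` would print 0 when no entries are waiting to be healed on any brick of that volume.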

Comment 29 errata-xmlrpc 2020-12-17 04:51:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603