Bug 1273847

Summary: starting vdsmd service sometimes leads to 1 or more ioprocess with aggressive resource utilization
Product: Red Hat Enterprise Virtualization Manager    Reporter: mlehrer
Component: vdsm    Assignee: Oved Ourfali <oourfali>
Status: CLOSED CURRENTRELEASE    QA Contact: Jiri Belka <jbelka>
Severity: high    Docs Contact:
Priority: unspecified
Version: 3.6.0    CC: bazulay, gklein, lsurette, masayag, mlehrer, oourfali, srevivo, ycui, ykaul
Target Milestone: ovirt-3.6.1   
Target Release: 3.6.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-20 01:28:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
pictures of reproduction, vdsm,mom,supervdsm,logs none

Description mlehrer 2015-10-21 11:48:21 UTC
Created attachment 1085099 [details]
pictures of reproduction, vdsm,mom,supervdsm,logs

Description of problem:

Issuing "service vdsmd start" can sometimes leave one or more vdsm ioprocess child processes [*] consuming heavy CPU with unbounded memory growth. This occurs after the service starts and is unaffected by host state (activation or maintenance); left alone, memory continues to grow over time. The issue was originally discovered during a performance test, but has since been isolated to restarting the VDSM service itself.

[*] /usr/libexec/ioprocess --read-pipe-fd 22 --write-pipe-fd 21 --max-threads 10 --max-queued-requests 10   
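As a rough illustration (not part of the report), a minimal sketch of how such runaway ioprocess children could be located programmatically. It assumes a Linux /proc filesystem; the helper names are invented for this example:

```python
import os
import re

def rss_kib(status_text):
    """Extract the VmRSS value (in KiB) from /proc/<pid>/status content."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else 0

def find_ioprocess_pids(proc_root="/proc"):
    """Return PIDs whose argv[0] is /usr/libexec/ioprocess."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            # /proc/<pid>/cmdline is NUL-separated; argv[0] is the first field
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                argv0 = f.read().split(b"\0", 1)[0]
        except OSError:
            continue  # process exited while we were scanning
        if argv0 == b"/usr/libexec/ioprocess":
            pids.append(int(entry))
    return sorted(pids)
```

One could then poll rss_kib() for each returned PID over time to confirm the unbounded growth described above.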

Environment Details:
1 Engine ver 3.6
1 Host ver 3.6 with 100 VMs
11 "data" Storage Domains (NFS ver 3)
1  "iso"  Storage Domain  (NFS ver 1)

Version-Release number of selected component (if applicable):

vdsm-4.17.9-1.el7ev.noarch

How reproducible:
Reproducible, but not every time.

Steps to Reproduce:
1. Stop ovirt-engine, and run "service vdsmd stop".
2. Start ovirt-engine, and run "service vdsmd start".
3. Watch the host's processes.
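The "watch the processes" step can be sketched as a one-shot shell loop (the sampling interval and iteration count are illustrative, not from the report):

```shell
# After restarting the services (service vdsmd stop/start as above),
# sample CPU and resident memory of any ioprocess children a few times.
for i in 1 2 3; do
    # -C matches by command name; '=' after a column suppresses its header.
    ps -o pid=,pcpu=,rss=,args= -C ioprocess 2>/dev/null \
        || echo "no ioprocess running"
    sleep 1
done
```

A healthy host shows ioprocess entries with low %CPU and stable RSS; the bug manifests as entries pinned near 99% CPU whose RSS keeps climbing between samples.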

Actual results:
See attachment: (pictures of reproduction, vdsm,mom,supervdsm,logs)

Expected results:
ioprocess should not sustain ~99% CPU utilization or grow memory without bound, and it should be possible to stop the VDSM service without a file-descriptor exception.

Additional info:

Comment 1 Yeela Kaplan 2015-10-22 09:03:48 UTC
First of all, the ioprocess traceback on vdsm restart is a known issue; a fix is ready and will go in once vdsm supports only EL7.2 and up (details in BZ#1189200).

I have access to the host now, so I will investigate the origin of the bug.

Comment 2 Yeela Kaplan 2015-10-22 14:06:56 UTC
Mordechai,
A new ioprocess version with the patch Nir and I added is installed on your machine now.

Can you do some more testing and let me know if it does not reproduce?

Thanks!

Comment 4 mlehrer 2015-11-04 09:10:18 UTC
(In reply to Yeela Kaplan from comment #2)
> Mordechai,
> A new ioprocess version with the patch Nir and I added is installed on your
> machine now.
> 
> Can you do some more testing and let me know if it does not reproduce?
> 
> Thanks!

I have not seen the issue reproduce since you applied this fix.

Comment 5 Jiri Belka 2016-01-15 18:02:09 UTC
ok, vdsm-4.17.17-0.el7ev.noarch

no high CPU load from ioprocess observed, and no traceback as seen in vdsm-status-ouput.png