Created attachment 1085099
pictures of reproduction, vdsm,mom,supervdsm,logs

Description of problem:
Issuing "service vdsmd start" can sometimes lead to one or more vdsm ioprocesses[*] using heavy CPU with unbounded memory growth. This occurs after the service starts and is not affected by host state (activation or maintenance). Left running, memory continues to grow. The issue was originally discovered during a performance test but has since been isolated to restarting the VDSM service itself.

[*] /usr/libexec/ioprocess --read-pipe-fd 22 --write-pipe-fd 21 --max-threads 10 --max-queued-requests 10

Environment Details:
1 Engine ver 3.6
1 Host ver 3.6 with 100 VMs
11 "data" Storage Domains (NFS ver 3)
1 "iso" Storage Domain (NFS ver 1)

Version-Release number of selected component (if applicable):
vdsm-4.17.9-1.el7ev.noarch

How reproducible:
Reproducible, but not every time.

Steps to Reproduce:
1. Stop ovirt-engine, then run service vdsmd stop
2. Start ovirt-engine, then run service vdsmd start
3. Watch processes

Actual results:
See attachment (pictures of reproduction, vdsm,mom,supervdsm,logs)

Expected results:
No sustained CPU utilization at 99%, no unbounded memory growth, and the ability to stop the VDSM service without an FD exception.

Additional info:
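For anyone repeating the "watch processes" step, here is a minimal sketch of one way to sample CPU and resident memory of the ioprocess helpers from /proc. It is an assumption about how the symptom could be observed, not how the attached screenshots were produced. The binary path is the one quoted in the description; the function names, the 5-second interval and the round count are hypothetical, and the CPU figure is approximate because it uses the nominal sleep interval.

```python
import os
import time

# Path quoted in the bug description; adjust if the package installs elsewhere.
IOPROCESS_PATH = "/usr/libexec/ioprocess"


def parse_rss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # e.g. kernel threads have no VmRSS line


def parse_cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from the text of /proc/<pid>/stat."""
    # The command name is parenthesised and may contain spaces,
    # so split on the last ')' before tokenising the rest.
    fields = stat_text.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3); utime is field 14, stime is field 15.
    return int(fields[11]) + int(fields[12])


def find_ioprocess_pids(proc_root="/proc"):
    """Return sorted PIDs whose argv[0] equals IOPROCESS_PATH."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                argv0 = f.read().split(b"\0")[0].decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if argv0 == IOPROCESS_PATH:
            pids.append(int(entry))
    return sorted(pids)


def sample(pid, proc_root="/proc"):
    """Return (cpu_ticks, rss_kb) for one process."""
    base = os.path.join(proc_root, str(pid))
    with open(os.path.join(base, "stat")) as f:
        ticks = parse_cpu_ticks(f.read())
    with open(os.path.join(base, "status")) as f:
        rss = parse_rss_kb(f.read())
    return ticks, rss


def watch(interval=5.0, rounds=12):
    """Print approximate CPU% and RSS for each ioprocess once per interval."""
    hz = os.sysconf("SC_CLK_TCK")
    prev = {}
    for _ in range(rounds):
        for pid in find_ioprocess_pids():
            try:
                ticks, rss = sample(pid)
            except OSError:
                continue  # process went away between listing and sampling
            if pid in prev:
                cpu_pct = 100.0 * (ticks - prev[pid]) / hz / interval
                print(f"pid={pid} cpu={cpu_pct:.1f}% rss_kb={rss}")
            prev[pid] = ticks
        time.sleep(interval)
```

Calling watch() on the host after "service vdsmd start" prints one line per ioprocess per interval. A CPU value that stays near 100% together with an rss_kb value that rises on every round would match the behaviour described above.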
First of all, the ioprocess traceback on vdsm restart is a known issue, and there is a ready fix that will go in once vdsm supports only EL7.2 and up (details in BZ#1189200). I have access to the host now, so I will investigate the origin of the bug.
Mordechai,
A new ioprocess version with the patch Nir and I added is now installed on your machine.

Can you do some more testing and let me know whether the issue still reproduces?

Thanks!
(In reply to Yeela Kaplan from comment #2)
> Mordechai,
> A new ioprocess version with the patch I and Nir added is installed on your
> machine now.
>
> Can you do some more testing and let me know it does not reproduce?
>
> Thanks!

I have not seen the issue reproduce since you applied this fix.
OK on vdsm-4.17.17-0.el7ev.noarch: no high CPU load from ioprocesses observed, and no traceback as seen in vdsm-status-ouput.png.