Bug 1102701
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | [Scale] - there no correlation about the vm status between the engine to vdsm | | |
| Product | Red Hat Enterprise Virtualization Manager | Reporter | Eldad Marciano <emarcian> |
| Component | vdsm | Assignee | Nir Soffer <nsoffer> |
| Status | CLOSED CURRENTRELEASE | QA Contact | Yuri Obshansky <yobshans> |
| Severity | urgent | Docs Contact | |
| Priority | unspecified | | |
| Version | 3.3.0 | CC | amureini, bazulay, fsimonce, gklein, iheim, lpeer, michal.skrivanek, nsoffer, scohen, tnisan, ybronhei, yeylon |
| Target Milestone | --- | | |
| Target Release | 3.5.0 | | |
| Hardware | x86_64 | | |
| OS | Linux | | |
| Whiteboard | storage | | |
| Fixed In Version | vt1.3, 4.16.0-1.el6_5 | Doc Type | Bug Fix |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2015-02-16 13:41:11 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | Storage | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | 1083771 | | |
| Bug Blocks | 1142923, 1156165 | | |
Description
Eldad Marciano
2014-05-29 12:39:54 UTC
WaitForLaunch means that VDSM was asked to start the VM, but no report about the VM has come back yet. So it is expected that the numbers do not agree.

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3085, in getVolumeSize
    apparentsize = str(volClass.getVSize(dom, imgUUID, volUUID, bs=1))
  File "/usr/share/vdsm/storage/fileVolume.py", line 418, in getVSize
    return int(sdobj.oop.os.stat(volPath).st_size / bs)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 312, in callCrabRPCFunction
    raise Exception("No free file handlers in pool")
Exception: No free file handlers in pool

I very much suspect this (repeated) exception to be the reason for the VMs to stay in WaitForLaunch for such a long time. I'm CCing storage people as this is outside the scope of the SLA team.

Fede/Nir, weren't we working on removing this?

Reproduced 100% of the time when moving a host with 100+ VMs to maintenance. Still under investigation. Adding logs and findings ASAP.

Created attachment 901478
vdsm.log
grep for 'No free file handlers in pool'
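
For anyone repeating the grep on their own host, here is a minimal sketch that counts how often the exception shows up in a log. The helper names (`count_matches`, `count_in_file`) are made up for illustration and are not part of vdsm, and the log path is whatever your installation uses.

```python
# Hypothetical helpers, not part of vdsm: count log lines that carry
# the exception message quoted in the traceback above.
NEEDLE = "No free file handlers in pool"


def count_matches(lines, needle=NEEDLE):
    """Return the number of lines that contain the needle."""
    return sum(1 for line in lines if needle in line)


def count_in_file(path, needle=NEEDLE):
    """Count matching lines in a log file, tolerating odd bytes."""
    with open(path, errors="replace") as log:
        return count_matches(log, needle)
```

A high count over a short time window would be consistent with the suspicion that the exception repeats while the VMs sit in WaitForLaunch.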
(In reply to Allon Mureinik from comment #4)
> Fede/Nir, weren't we working on removing this?

I was not working on removing this. This is a limit of the current implementation: there are up to 10 file handlers per domain, and if you try to run too many operations concurrently you will get this failure.

Will be fixed by bug 1083771.

Bug verified on:
- RHEV-M 3.5.0-0.22.el6ev
- RHEL - 6Server - 6.6.0.2.el6
- libvirt-0.10.2-46.el6_6.1
- vdsm-4.16.7.6-1.el6ev

It didn't reproduce. Moved to verified.
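
To illustrate the limit described in the reply above (up to 10 file handlers per domain, with a failure when too many operations run concurrently), here is a simplified sketch of a fixed-size pool that fails fast instead of waiting. It is not the vdsm code: `HandlerPool` and `demo` are hypothetical names, and only the pool size of 10 and the exception message are taken from this report.

```python
import threading


class HandlerPool:
    """Hypothetical fixed-size pool that raises when every handler is busy."""

    def __init__(self, size=10):
        # Handlers are represented by plain integers in this sketch.
        self._free = list(range(size))
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if not self._free:
                # Same message as in the traceback; no queuing, no retry.
                raise Exception("No free file handlers in pool")
            return self._free.pop()

    def release(self, handler):
        with self._lock:
            self._free.append(handler)


def demo(requests=12, size=10):
    """Hold every acquired handler and count the requests that fail."""
    pool = HandlerPool(size)
    held, failures = [], 0
    for _ in range(requests):
        try:
            held.append(pool.acquire())
        except Exception:
            failures += 1
    for handler in held:
        pool.release(handler)
    return len(held), failures
```

With 100+ VMs each asking for volume sizes at about the same time, a pool of this shape would reject most requests right away, which matches the repeated exception in the log.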