Bug 616055
Summary: | [vdsm] [libvirt] (scale) OSError: [Errno 11] Resource temporarily unavailable (python leak?) | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Haim <hateya>
Component: | vdsm | Assignee: | Dan Kenigsberg <danken>
Status: | CLOSED DUPLICATE | QA Contact: | yeylon <yeylon>
Severity: | medium | Docs Contact: |
Priority: | low | |
Version: | 6.1 | CC: | abaron, bazulay, hateya, iheim, mgoldboi, Rhev-m-bugs, smizrahi, srevivo, yeylon, ykaul
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2011-01-02 10:40:27 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 650588 | |
Bug Blocks: | | |
Created attachment 432907 [details]
lsof.oserror.vdsm
It depends on bug 650588. Added conditional nak on capacity; once the Python bug is fixed, the issue will be re-examined.
*** This bug has been marked as a duplicate of bug 650588 ***
Created attachment 432905 [details]
oserror.vdsm.log

Description of problem:
The following issue has happened twice so far on a particular setup (scale of 180 VMs, 3 per host): I start to get the following message in the vdsm log:

OSError: [Errno 11] Resource temporarily unavailable

According to lsof, vdsm has 827 open files (see attachment). When this occurs, the system stops functioning and the host goes non-operational. There is nothing to do except, perhaps, kill the vdsm service and try again.

Thread-6432::ERROR::2010-07-19 16:09:11,523::misc::58::irs::[Errno 11] Resource temporarily unavailable
Thread-6494::ERROR::2010-07-19 16:09:11,523::misc::59::irs::Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 973, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1440, in public_getVolumeSize
    apparentsize = str(volume.Volume.getVSize(sdUUID, spUUID, imgUUID, volUUID, bs=1))
  File "/usr/share/vdsm/storage/volume.py", line 249, in getVSize
    return mysd.getVolumeClass().getVSize(mysd, imgUUID, volUUID, bs)
  File "/usr/share/vdsm/storage/blockVolume.py", line 45, in getVSize
    return int(int(sdobj.vg.getLVInfo(volUUID))/bs)
  File "/usr/share/vdsm/storage/vg.py", line 727, in getLVInfo
    return self.lvSize(name)
  File "/usr/share/vdsm/storage/vg.py", line 720, in lvSize
    (rc, out, err) = self.syncExecCmd(name, cmd, exclusive=True)
  File "/usr/share/vdsm/storage/vg.py", line 89, in syncExecCmd
    return misc.execCmd(cmd)
  File "/usr/share/vdsm/storage/misc.py", line 102, in execCmd
    stdin=infile, stdout=outfile, stderr=subprocess.PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 595, in __init__
    if startupinfo is not None:
  File "/usr/lib64/python2.6/subprocess.py", line 1009, in _execute_child
    fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
OSError: [Errno 11] Resource temporarily unavailable

We are not sure whether this is a leak in vdsm or in Python; it requires further investigation.
Steps to reproduce (might be hard to reproduce, but it happened twice, so I decided to open this bug):
1) Make sure the setup consists of 2 or more hosts running 60 VMs over iSCSI.
2) Restart the libvirtd service.
3) Kill some of the VMs with kill -9.
4) Start more VMs.
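
The description gives an open-file count for vdsm taken with lsof. To help tell a descriptor leak apart from other causes, that count could be sampled over time and compared against the per-process limit. The following is only a rough sketch, not part of vdsm: the function names `count_open_fds` and `fd_headroom` and the `proc_root` parameter are made up for illustration, and it assumes a Linux host where `/proc/<pid>/fd` is readable by the caller.

```python
import os
import resource


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of open file descriptors of process `pid`.

    Each entry in the Linux-specific <proc_root>/<pid>/fd directory is one
    open descriptor. Reading another user's process needs enough privileges.
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def fd_headroom(pid, soft_limit=None, proc_root="/proc"):
    """Return (open_fds, soft_limit, remaining) for process `pid`.

    Assumption: when soft_limit is not given, the RLIMIT_NOFILE soft limit
    of the *calling* process is used. That matches the target only if both
    processes run with the same limits.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    used = count_open_fds(pid, proc_root)
    if soft_limit == resource.RLIM_INFINITY:
        # No finite limit, so "remaining" is not meaningful.
        return used, soft_limit, None
    return used, soft_limit, soft_limit - used
```

Calling `fd_headroom()` with the vdsm PID at intervals while following the steps to reproduce would show whether the descriptor count keeps growing or stays flat when the error appears. A steadily rising count would point toward a leak, though it would not by itself show whether the leak is in vdsm or in Python.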