Bug 616055 - [vdsm] [libvirt] (scale) OSError: [Errno 11] Resource temporarily unavailable (python leak?)
Status: CLOSED DUPLICATE of bug 650588
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
6.1
All Linux
low Severity medium
: rc
: ---
Assigned To: Dan Kenigsberg
yeylon@redhat.com
Depends On: 650588
Blocks:
Reported: 2010-07-19 10:33 EDT by Haim
Modified: 2016-04-18 02:33 EDT (History)
10 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-01-02 05:40:27 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
oserror.vdsm.log (584.61 KB, application/x-gzip)
2010-07-19 10:33 EDT, Haim
lsof.oserror.vdsm (6.33 KB, application/x-gzip)
2010-07-19 10:36 EDT, Haim

Description Haim 2010-07-19 10:33:21 EDT
Created attachment 432905 [details]
oserror.vdsm.log

Description of problem:

the following issue has happened twice so far, running on a particular setup (a scale of 180 VMs - 3 per host), where I start to get the following messages in the vdsm log:
 
OSError: [Errno 11] Resource temporarily unavailable

it seems like vdsm has 827 open files (per lsof) - see attachment.

when it occurs, the system stops functioning, the host goes non-operational, and there is nothing to do (except maybe kill the vdsm service and try again).

Thread-6432::ERROR::2010-07-19 16:09:11,523::misc::58::irs::[Errno 11] Resource temporarily unavailable
Thread-6494::ERROR::2010-07-19 16:09:11,523::misc::59::irs::Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 973, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1440, in public_getVolumeSize
    apparentsize = str(volume.Volume.getVSize(sdUUID, spUUID, imgUUID, volUUID, bs=1))
  File "/usr/share/vdsm/storage/volume.py", line 249, in getVSize
    return mysd.getVolumeClass().getVSize(mysd, imgUUID, volUUID, bs)
  File "/usr/share/vdsm/storage/blockVolume.py", line 45, in getVSize
    return int(int(sdobj.vg.getLVInfo(volUUID))/bs)
  File "/usr/share/vdsm/storage/vg.py", line 727, in getLVInfo
    return self.lvSize(name)
  File "/usr/share/vdsm/storage/vg.py", line 720, in lvSize
    (rc, out, err) = self.syncExecCmd(name, cmd, exclusive=True)
  File "/usr/share/vdsm/storage/vg.py", line 89, in syncExecCmd
    return misc.execCmd(cmd)
  File "/usr/share/vdsm/storage/misc.py", line 102, in execCmd
    stdin=infile, stdout=outfile, stderr=subprocess.PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 595, in __init__
    if startupinfo is not None:
  File "/usr/lib64/python2.6/subprocess.py", line 1009, in _execute_child
    fcntl.fcntl(fd, fcntl.F_SETFD, old | cloexec_flag)
OSError: [Errno 11] Resource temporarily unavailable

we are not sure whether it is a vdsm leak or Python's; it requires further investigation.
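One way to narrow down whether descriptors are leaking is to sample vdsm's open-fd count over time and compare it with the figure lsof reports. The helper below is a sketch, not part of vdsm; the `/proc/<pid>/fd` approach is Linux-specific and the function name is made up for illustration:

```python
import os

def count_open_fds(pid):
    """Count the open file descriptors of `pid` by listing
    /proc/<pid>/fd (Linux only). This is the same number that
    `lsof -p <pid> | wc -l` roughly reports."""
    return len(os.listdir("/proc/%d/fd" % pid))

# Example: sample our own process.
print(count_open_fds(os.getpid()))
```

If the count grows steadily and never returns to a baseline between the spawned lvm commands, something is holding descriptors open rather than the load simply being high at that instant.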

repro steps (might be hard to reproduce, though it has happened twice, so I decided to open this bug):

1) make sure the setup consists of 2 or more hosts running 60 VMs over iSCSI
2) restart the libvirtd service
3) kill some of the VMs with kill -9
4) start more VMs
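A note for the investigation: on Linux, errno 11 is EAGAIN, while running out of file descriptors normally raises EMFILE (errno 24), so the 827 open files may be a symptom rather than the direct cause. The sketch below (standard-library calls only, Linux-specific `/proc` path) compares the current fd usage against the process's RLIMIT_NOFILE soft limit to show how much headroom remains:

```python
import os
import resource

# Soft/hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Current usage, via /proc (Linux only).
in_use = len(os.listdir("/proc/%d/fd" % os.getpid()))

print("fds in use: %d, soft limit: %d" % (in_use, soft))
```

If `in_use` is nowhere near `soft` when the OSError fires, the EAGAIN more likely comes from fork-time resource exhaustion (threads/processes), which would fit the suspected Python subprocess issue.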
Comment 1 Haim 2010-07-19 10:36:25 EDT
Created attachment 432907 [details]
lsof.oserror.vdsm
Comment 2 Barak 2010-11-28 11:11:42 EST
It depends on bug 650588.
Added a conditional NAK on capacity; once the Python bug is fixed, the issue will be re-examined.
Comment 3 Dan Kenigsberg 2011-01-02 05:40:27 EST

*** This bug has been marked as a duplicate of bug 650588 ***
