Bug 702906

Summary: [vdsm] VDSM exhausts its process pool in several NFS scenarios (starting several VMs at once / blocking the ISO domain)
Product: Red Hat Enterprise Linux 6
Reporter: Haim <hateya>
Component: vdsm
Assignee: Igor Lvovsky <ilvovsky>
Status: CLOSED ERRATA
QA Contact: Daniel Paikov <dpaikov>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.2
CC: abaron, bazulay, danken, hateya, iheim, lpeer, mgoldboi, yeylon, ykaul
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version: vdsm-4.9-76.el6
Doc Type: Bug Fix
Last Closed: 2011-12-06 07:17:45 UTC
Attachments:
Description Flags
vdsm logs none

Description Haim 2011-05-08 09:07:44 UTC
Created attachment 497618
vdsm logs

Description of problem:

VDSM exhausts its process pool when the ISO domain goes down (fails to respond).
After blocking the ISO NFS domain, the following messages start to appear in the log:


Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 105, in run
    stats, code = self._statsfunc(self._domain)
  File "/usr/share/vdsm/storage/sp.py", line 1391, in _repostats
    stats['masterValidate'] = domain.validateMaster()
  File "/usr/share/vdsm/storage/sd.py", line 438, in validateMaster
    if not oop.fileUtils.pathExists(pdir):
  File "/usr/share/vdsm/storage/processPool.py", line 35, in wrapper
    return self.runExternally(func, *args, **kwds)
  File "/usr/share/vdsm/storage/processPool.py", line 49, in runExternally
    raise NoFreeHelpersError("No free processes")
NoFreeHelpersError: No free processes
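
The traceback above ends in NoFreeHelpersError, raised when no helper in the pool is idle. A minimal sketch of that failure mode follows, assuming a fixed-size pool whose helpers all block on an unresponsive mount. BoundedPool, run_externally, the pool size, and demo are hypothetical names for illustration; this is not vdsm's actual processPool code.

```python
import threading


class NoFreeHelpersError(Exception):
    """Raised when every helper slot is already busy."""


class BoundedPool:
    """Hypothetical fixed-size pool: at most max_helpers concurrent calls."""

    def __init__(self, max_helpers):
        self._slots = threading.Semaphore(max_helpers)

    def run_externally(self, func, *args, **kwargs):
        # Fail fast instead of queueing when no helper is free.
        if not self._slots.acquire(blocking=False):
            raise NoFreeHelpersError("No free processes")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


def demo():
    pool = BoundedPool(max_helpers=2)
    release = threading.Event()       # stands in for the NFS server answering again
    started = threading.Semaphore(0)  # counts helpers that have begun their call

    def stuck_call():
        started.release()
        release.wait()                # simulates a call hanging on a blocked mount
        return True

    threads = [threading.Thread(target=pool.run_externally, args=(stuck_call,))
               for _ in range(2)]
    for t in threads:
        t.start()
    for _ in threads:
        started.acquire()             # wait until both helpers are occupied

    try:
        pool.run_externally(lambda: True)
        exhausted = False
    except NoFreeHelpersError:
        exhausted = True

    release.set()                     # "unblock" the domain
    for t in threads:
        t.join()
    recovered = pool.run_externally(lambda: True)
    return exhausted, recovered
```

In this sketch the pool recovers as soon as the stuck calls return. If they never return, every later caller keeps failing with the same error, which matches the symptom reported here.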

Thread-420::ERROR::2011-05-08 11:56:54,613::sp::110::Storage.StatsThread::(run) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 104, in run
    self._domain = SDF.produce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdf.py", line 30, in produce
    newSD = cls.__sdc.lookup(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 91, in lookup
    self._refreshDomains()
  File "/usr/share/vdsm/storage/misc.py", line 1068, in helper
    return sm(*args, **kwargs)
  File "/usr/share/vdsm/storage/misc.py", line 1055, in __call__
    self.__lastResult = self.__func(*args, **kwargs)
  File "/usr/share/vdsm/storage/sdc.py", line 133, in _refreshDomains
    nfsSD.getFileStorageDomainList() +
  File "/usr/share/vdsm/storage/nfsSD.py", line 152, in getFileStorageDomainList
    misc.tmap(collectMetaFiles, domlist)
  File "/usr/share/vdsm/storage/misc.py", line 1126, in tmap
    results[i] = result
IndexError: list assignment index out of range
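
The second traceback fails inside misc.tmap while storing a result by index. The log shows neither tmap's implementation nor the cause of the IndexError, so the following is only a sketch of a threaded map that takes a snapshot of its input and preallocates one result slot per item, which keeps every index write in range. tmap_sketch is a hypothetical name, not the vdsm function.

```python
import threading


def tmap_sketch(func, iterable):
    """Apply func to each item in its own thread; return results in input order."""
    items = list(iterable)             # snapshot: the length is fixed from here on
    results = [None] * len(items)      # one preallocated slot per input item
    errors = [None] * len(items)

    def worker(i, item):
        try:
            results[i] = func(item)
        except Exception as exc:       # remember the failure for the caller
            errors[i] = exc

    threads = [threading.Thread(target=worker, args=(i, item))
               for i, item in enumerate(items)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Re-raise the first worker failure, if any, in the calling thread.
    for exc in errors:
        if exc is not None:
            raise exc
    return results
```

Note that join() here waits without a timeout, so a worker stuck on a blocked NFS mount would still stall the caller; the sketch addresses only the indexing, not the hang.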

Steps to reproduce:

1) Set up a data domain over iSCSI.
2) Set up an ISO domain over NFS.
3) Block the ISO domain using iptables.

Comment 2 RHEL Program Management 2011-05-09 06:00:23 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 9 Igor Lvovsky 2011-06-05 14:48:41 UTC
http://gerrit.usersys.redhat.com/#change,538

Comment 11 Igor Lvovsky 2011-06-15 13:16:40 UTC
We solved this issue with
http://gerrit.usersys.redhat.com/#change,538

but we still cannot fully verify it because of libvirt bug 692663.

Comment 13 Daniel Paikov 2011-08-21 14:53:39 UTC
Couldn't reproduce on 4.9-91. Closing as Verified.

Comment 14 errata-xmlrpc 2011-12-06 07:17:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html