Bug 713659

Summary: [vdsm] When libvirt is non-responsive, restarting libvirtd makes prepareForShutdown hang forever.
Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.1
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Target Milestone: rc
Reporter: David Naori <dnaori>
Assignee: Federico Simoncelli <fsimonce>
QA Contact: David Naori <dnaori>
CC: abaron, bazulay, danken, dnaori, hateya, iheim, mgoldboi, ohochman, ykaul
Fixed In Version: vdsm-4.9-77
Doc Type: Bug Fix
Last Closed: 2011-12-06 07:22:51 UTC
Attachments: vdsm log (flags: none)

Description David Naori 2011-06-16 06:54:22 UTC
Created attachment 504983: vdsm log

Description of problem:

If libvirtd is non-responsive, restarting libvirtd triggers vdsm's prepareForShutdown, which hangs forever on:

Thread-264::ERROR::2011-06-16 09:37:43,020::clientIF::63::vds::(wrapper) Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 59, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/clientIF.py", line 892, in getVdsCapabilities
    self.machineCapabilities = self._getCapabilities()
  File "/usr/share/vdsm/clientIF.py", line 862, in _getCapabilities
    ',' + ','.join(getCompatibleCpuModels())
  File "/usr/share/vdsm/clientIF.py", line 823, in getCompatibleCpuModels
    in allModels if compatible(model) ]
  File "/usr/share/vdsm/clientIF.py", line 819, in compatible
    return c.compareCPU(xml, 0) in (
  File "/usr/share/vdsm/libvirtconnection.py", line 59, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1446, in compareCPU
    if ret == -1: raise libvirtError ('virConnectCompareCPU() failed', conn=self)
libvirtError: cannot recv data: : Connection reset by peer

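The traceback above does not show where prepareForShutdown itself blocks. As a generic illustration only (this is not vdsm code and not the fix shipped in vdsm-4.9-77), the Python 3 sketch below shows one way a shutdown routine can avoid waiting forever on a thread stuck in a call that never returns: join every worker against a shared deadline and report the ones still alive. The function name bounded_shutdown and the never-set Event that stands in for a frozen libvirt call are hypothetical.

```python
import threading
import time


def bounded_shutdown(workers, timeout=2.0):
    """Join all workers against one shared deadline.

    Returns the list of threads that are still alive when the
    deadline expires, so the caller can log them and move on
    instead of blocking forever.
    """
    deadline = time.monotonic() + timeout
    for thread in workers:
        # Give each thread only the time left until the shared deadline.
        remaining = max(0.0, deadline - time.monotonic())
        thread.join(remaining)
    return [thread for thread in workers if thread.is_alive()]


if __name__ == "__main__":
    # Hypothetical stand-in for a call into a frozen daemon: the event
    # is never set, so the worker thread never returns on its own.
    never_set = threading.Event()
    worker = threading.Thread(target=never_set.wait, daemon=True)
    worker.start()

    leftover = bounded_shutdown([worker], timeout=0.5)
    print("threads still blocked:", len(leftover))
```

The worker is created as a daemon thread so that the interpreter can still exit while it remains blocked. Whether abandoning such a thread is acceptable depends on what state it holds, which this sketch does not address.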

Version-Release number of selected component (if applicable):
libvirt-0.9.1-1.el6.x86_64
vdsm-4.9-75.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
On a host that is connected to a storage pool and has no VMs running:
  1  kill -19 `pgrep libvirt`
  2  vdsClient -s 0 getVdsCaps &
  3  /etc/init.d/libvirtd restart

Additional info:
vdsm log attached.

Comment 2 Dan Kenigsberg 2011-06-27 20:35:24 UTC
In a private communication, Erez told me that the issue no longer reproduces now that Eduardo's pool cleanup patchset is in.

David, please reopen if you find refuting evidence.

Comment 3 Omri Hochman 2011-07-03 11:44:28 UTC
Unable to reproduce with vdsm-4.9-79.el6.x86_64.

Comment 4 errata-xmlrpc 2011-12-06 07:22:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html