Bug 659351 - [vdsm] [service] vdsm doesn't attempt to restart as required when connection to libvirt is broken
Keywords:
Status: CLOSED DUPLICATE of bug 591506
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-12-02 15:47 UTC by Haim
Modified: 2016-04-18 06:35 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-06 12:09:26 UTC
Target Upstream Version:
Embargoed:


Attachments
vdsm log. (500.98 KB, application/x-gzip)
2010-12-02 15:52 UTC, Haim
no flags

Description Haim 2010-12-02 15:47:45 UTC
Description of problem:

The vdsm service enters a deadlock after the libvirt service deadlocked and was restarted.

vdsm fails to respond to both getVdsCaps and getVdsStats, as well as to basic commands such as list table.

The scenario was multiple concurrent migrations, which caused libvirt to enter a deadlock. After I manually restarted libvirt, vdsm entered a deadlock.
gdb output attached.

When examining the vdsm log, I see the following output:

Thread-10093::DEBUG::2010-12-02 15:51:23,531::libvirtvm::892::vds.vmlog.d2bd1c7a-d0b9-411c-b47f-56deae673db3::(destroy) destroy Called
Thread-9976::ERROR::2010-12-02 15:51:23,532::clientIF::48::vds::(wrapper) Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 44, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/clientIF.py", line 439, in destroy
    v.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 900, in destroy
    self._dom.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 146, in f
    raise e
libvirtError: cannot send data: Broken pipe

Thread-10093::ERROR::2010-12-02 15:51:23,536::libvirtvm::1071::vds::(wrapper) connection to libvirt broken. taking vdsm down.
Thread-10093::DEBUG::2010-12-02 15:51:23,537::clientIF::119::vds::(prepareForShutdown) cannot run prepareForShutdown twice
Thread-10128::DEBUG::2010-12-02 15:51:23,538::clientIF::45::vds::(wrapper) return destroy with {'status': {'message': 'Virtual machine does not exist', 'code': 1}}
Thread-10093::ERROR::2010-12-02 15:51:23,540::clientIF::48::vds::(wrapper) Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 44, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/clientIF.py", line 439, in destroy
    v.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 900, in destroy
    self._dom.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 146, in f
    raise e
libvirtError: cannot send data: Broken pipe

vdsm-4.9-28.el6.x86_64
libvirt-0.8.1-28.el6.x86_64
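
For triage, log lines like the ones above can be split into fields on the `::` separator. The following is a minimal sketch, assuming the seven-field layout seen in this excerpt (thread, level, timestamp, module, line number, logger, message). The names `LogRecord` and `parse_vdsm_line` are hypothetical and are not part of vdsm.

```python
from collections import namedtuple

# Hypothetical record type; field names are inferred from the log excerpt.
LogRecord = namedtuple(
    "LogRecord", "thread level timestamp module lineno logger message"
)


def parse_vdsm_line(line):
    """Split one log line on '::'.

    Returns None for lines that do not match the assumed layout,
    for example traceback frames or the final exception line.
    """
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7 or not parts[4].isdigit():
        return None
    thread, level, timestamp, module, lineno, logger, message = parts
    return LogRecord(thread, level, timestamp, module, int(lineno), logger, message)


sample = (
    "Thread-10093::ERROR::2010-12-02 15:51:23,536::libvirtvm::1071::vds::"
    "(wrapper) connection to libvirt broken. taking vdsm down."
)
record = parse_vdsm_line(sample)
print(record.level, record.module, record.lineno)
```

Filtering the parsed records on `level == "ERROR"` is one way to pull out the "connection to libvirt broken" entry and the surrounding tracebacks from the attached log.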

Comment 1 Haim 2010-12-02 15:50:23 UTC
Please note that libvirt was fully responsive at the time vdsm entered the deadlock.

Comment 2 Haim 2010-12-02 15:52:26 UTC
Created attachment 464284
vdsm log.

Comment 5 Haim 2010-12-06 09:34:49 UTC
Dan, the real problem is that vdsm doesn't try to kill itself when the connection to libvirt is broken.
This is a regression.


[root@nott-vds1 ~]# ps -o etime `pgrep libvirt`
    ELAPSED
      07:46


Thread-3170::ERROR::2010-12-06 11:17:44,503::utils::424::vds.vmlog.d2bd1c7a-d0b9-411c-b47f-56deae673db3::(run) Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 416, in run
    self._samples.append(self.sample())
  File "/usr/share/vdsm/vm.py", line 132, in sample
    s = self.VmSampleClass(self._pid, self._ifids, self._vm)
  File "/usr/share/vdsm/libvirtvm.py", line 75, in __init__
    raise e
libvirtError: cannot send data: Broken pipe

Easy to reproduce.
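
The expected behaviour (take the service down once when the libvirt connection breaks, so that it can be restarted) could look roughly like the sketch below. This is not vdsm's actual code: `ConnectionBrokenError`, `ShutdownGuard` and `take_down_on_broken_connection` are hypothetical names, and the exception is a stand-in for the `libvirtError: cannot send data: Broken pipe` seen in the log, so the example runs without libvirt installed.

```python
import functools
import threading


class ConnectionBrokenError(Exception):
    """Stand-in (assumption) for a libvirt 'Broken pipe' error."""


class ShutdownGuard:
    """Runs a shutdown callback at most once, even with concurrent callers."""

    def __init__(self, shutdown):
        self._shutdown = shutdown
        self._lock = threading.Lock()
        self._done = False

    def trigger(self):
        # Only the first caller flips the flag and runs the callback.
        with self._lock:
            if self._done:
                return False
            self._done = True
        self._shutdown()
        return True


def take_down_on_broken_connection(guard):
    """Decorator: on a broken connection, trigger shutdown and re-raise."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except ConnectionBrokenError:
                guard.trigger()
                raise

        return wrapper

    return decorator


calls = []
guard = ShutdownGuard(lambda: calls.append("shutdown"))


@take_down_on_broken_connection(guard)
def destroy_vm():
    # Simulates a call that hits a dead libvirt connection.
    raise ConnectionBrokenError("cannot send data: Broken pipe")


for _ in range(2):
    try:
        destroy_vm()
    except ConnectionBrokenError:
        pass

print(calls)  # prints ['shutdown']: the callback ran once for two failures
```

In a real daemon the callback would presumably end the process so that whatever supervises the service can start it again; that part is an assumption and is left out here.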

Comment 6 Dan Kenigsberg 2010-12-06 12:09:13 UTC
I believe this is a duplicate of the now-reopened bug 591506.

Comment 7 Dan Kenigsberg 2010-12-06 12:09:26 UTC

*** This bug has been marked as a duplicate of bug 591506 ***

