Bug 629625 - [vdsm] [libvirt] vdsm loses connection with 'kvm' process during concurrent migration
Summary: [vdsm] [libvirt] vdsm loses connection with 'kvm' process during concurrent migration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.1
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Assignee: Dan Kenigsberg
QA Contact: Haim
URL:
Whiteboard:
Depends On: 659310
Blocks:
 
Reported: 2010-09-02 14:40 UTC by Haim
Modified: 2014-01-13 00:47 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-21 14:17:32 UTC
Target Upstream Version:
Embargoed:


Attachments
vdsm log files of both pele and silver (1.18 MB, application/x-gzip)
2010-09-02 15:01 UTC, Haim

Description Haim 2010-09-02 14:40:42 UTC
Description of problem:

vdsm loses connection with the kvm process during concurrent migration of at least 2 guests per host (at least 2 per host is a must).
it seems like the vms actually go down in rhevm, but the qemu processes are live and active (also seen by virsh).
there are some disturbing errors in the log which might have caused this state:

libvirtEventLoop::ERROR::2010-09-02 17:47:59,771::libvirtvm::933::vds::Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 912, in __eventCallback
    v._onLibvirtLifecycleEvent(event, detail, None)
  File "/usr/share/vdsm/libvirtvm.py", line 878, in _onLibvirtLifecycleEvent
    hooks.after_vm_start(self._dom.XMLDesc(0), self.conf)
AttributeError: 'NoneType' object has no attribute 'XMLDesc'
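
The AttributeError above means the STARTED lifecycle event reached a VM object whose _dom was still None, i.e. the libvirt domain handle was not yet bound (plausible on the migration destination while the VM is still being wired up). The second traceback below is a separate symptom. A minimal sketch of a defensive guard, assuming a Vm-like object shaped like the one in the traceback; the 'hooks' module and logger are passed in only to keep the sketch self-contained, and this is illustrative, not vdsm's actual fix:

import libvirt

def on_lifecycle_event(vm, event, detail, hooks, log):
    # vm is assumed to expose _dom (a libvirt.virDomain or None) and conf,
    # mirroring the traceback above
    if event == libvirt.VIR_DOMAIN_EVENT_STARTED:
        if vm._dom is None:
            # domain handle not bound yet; skip the hook instead of letting
            # the event-loop thread crash on NoneType.XMLDesc
            log.warning('STARTED event arrived before _dom was set for vm %s',
                        vm.conf.get('vmId'))
            return
        hooks.after_vm_start(vm._dom.XMLDesc(0), vm.conf)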

libvirtEventLoop::ERROR::2010-09-02 17:41:33,692::libvirtvm::933::vds::Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 912, in __eventCallback
    v._onLibvirtLifecycleEvent(event, detail, None)
  File "/usr/share/vdsm/libvirtvm.py", line 872, in _onLibvirtLifecycleEvent
    self._onQemuDeath()
  File "/usr/share/vdsm/vm.py", line 1006, in _onQemuDeath
    "Lost connection with kvm process")
  File "/usr/share/vdsm/vm.py", line 1753, in setDownStatus
    self.saveState()
  File "/usr/share/vdsm/libvirtvm.py", line 772, in saveState
    vm.Vm.saveState(self)
  File "/usr/share/vdsm/vm.py", line 1228, in saveState
    os.rename(tmpFile, self._recoveryFile)
OSError: [Errno 2] No such file or directory
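
The OSError in saveState means os.rename lost a race: the temp file or the directory holding the recovery file was gone by the time the rename ran (plausible while VMs are being torn down concurrently during migration). A minimal sketch of an atomic state save that tolerates this, using illustrative names rather than vdsm's actual code:

import os
import pickle
import tempfile

def save_state(state, recovery_file):
    # write to a temp file in the same directory and rename it into place,
    # so readers never see a half-written recovery file
    target_dir = os.path.dirname(recovery_file)
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)              # avoid ENOENT from the rename
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, 'wb') as f:
            pickle.dump(state, f)
        os.rename(tmp_path, recovery_file)   # atomic within one filesystem
    except OSError:
        # the VM may have been destroyed concurrently and its directory
        # removed; clean up the temp file and let the caller log the failure
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise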

this bug reproduces consistently on the latest vdsm version, 4-9.14.

repro steps: 

1) make sure you have 2 hosts 
2) make sure you have 4 running vms, 2 per host 
3) run concurrent migration so that vms running on server X are migrated to server Y, and vms running on server Y are migrated to server X (one way to drive this directly through libvirt is sketched below the steps).

open the log.
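
For reference, a minimal sketch of driving the same cross-migration directly through libvirt-python (the reproduction above goes through RHEV-M; the host URIs and guest names below are placeholders, not values from this setup):

import threading
import libvirt

# placeholder URIs and guest names -- substitute the two hosts and the four
# running vms from steps 1-2 above
HOST_X = 'qemu+tcp://host-x.example.com/system'
HOST_Y = 'qemu+tcp://host-y.example.com/system'

def migrate(src_uri, dst_uri, vm_name):
    src = libvirt.open(src_uri)
    dst = libvirt.open(dst_uri)
    try:
        dom = src.lookupByName(vm_name)
        # virDomain.migrate(dconn, flags, dname, uri, bandwidth)
        dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)
    finally:
        dst.close()
        src.close()

# cross-migrate two guests in each direction, all at the same time
threads = [
    threading.Thread(target=migrate, args=(HOST_X, HOST_Y, 'vm1')),
    threading.Thread(target=migrate, args=(HOST_X, HOST_Y, 'vm2')),
    threading.Thread(target=migrate, args=(HOST_Y, HOST_X, 'vm3')),
    threading.Thread(target=migrate, args=(HOST_Y, HOST_X, 'vm4')),
]
for t in threads:
    t.start()
for t in threads:
    t.join()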

Comment 2 Haim 2010-09-02 15:01:29 UTC
Created attachment 442648 [details]
vdsm log files of both pele and silver

Comment 3 Haim 2010-09-02 15:03:11 UTC
note that during migration 3 out of 4 vms died (from vdsmd's perspective).

Comment 5 Barak 2010-11-21 16:17:18 UTC
Haim,

This probably happened due to one of the hook issues post-migration.
It was fixed a long time ago.
Please try to reproduce or close.

Comment 6 Haim 2010-11-29 07:13:56 UTC
(In reply to comment #5)
> Haim,
> 
> This probably happened due to one of the hook issues post-migration.
> It was fixed a long time ago.
> Please try to reproduce or close.

sorry Barak, due to a libvirt bug that blocks migration (crash upon migration), I can't reproduce this bug; once I get a proper build that includes their fix, I will try to reproduce.

Comment 8 Haim 2010-12-05 15:26:45 UTC
sorry - can't reproduce due to a deadlock in libvirt on concurrent migration; I have set the bug dependencies accordingly.

Comment 9 Itamar Heim 2010-12-19 14:48:02 UTC
please try again with the latest libvirt. thanks.

Comment 10 Haim 2010-12-21 13:30:09 UTC
(In reply to comment #9)
> please try again with the latest libvirt. thanks.

no repro on the latest libvirt; I guess it was solved along the way. we can either move to ON_QA or close as CURRENT_RELEASE.

Comment 11 Dan Kenigsberg 2010-12-21 14:17:32 UTC
why wait? closing after verification at comment 10.

