Bug 673434

Summary: Fix problems with p2p migrations
Product: Red Hat Enterprise Linux 6
Reporter: Daniel Veillard <veillard>
Component: libvirt
Assignee: Daniel Veillard <veillard>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.1
CC: dyuan, eblake, jyang, llim, weizhan, xen-maint
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: libvirt-0.8.7-5.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 13:26:32 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Daniel Veillard 2011-01-28 08:09:12 UTC
Description of problem:
When doing concurrent P2P migrations with the QEMU/KVM driver,
the absence of locking can deadlock the daemon. The problem and
a patch were reported by Fujitsu:

https://www.redhat.com/archives/libvir-list/2011-January/msg00884.html


Version-Release number of selected component (if applicable):
libvirt-0.8.7-3.el6 and previous

How reproducible:


Steps to Reproduce:
1. run concurrent P2P migrations of KVM domains
2.
3.
  
Actual results:


Expected results:


Additional info:

Patch committed upstream as bda57661b8086b4d5858328afdfc28fe1b58f112.
We may also want to cherry-pick a couple of other patches from them
that fix desturi error handling in P2P migration and the related documentation:

https://www.redhat.com/archives/libvir-list/2011-January/msg00439.html
Patch 59d13aae329ce7d4153e5f8a7d7ec94b779a610b

https://www.redhat.com/archives/libvir-list/2011-January/msg00438.html
Patch 2fd1a2525b78adb3c2d73cd55c278462f74f4953

Comment 1 Daniel Veillard 2011-01-28 08:10:32 UTC
I think at least the first patch should really go in; this may impact RHEV too,
as they use p2p migrations.

Daniel

Comment 3 weizhang 2011-02-14 10:43:58 UTC
Verification passed on:
libvirt-0.8.7-6.el6.x86_64
qemu-kvm-0.12.1.2-2.144.el6.x86_64
kernel-2.6.32-113.el6.x86_64

Reproduced the bug with:
libvirt-0.8.7-3.el6.x86_64
qemu-kvm-0.12.1.2-2.144.el6.x86_64
kernel-2.6.32-113.el6.x86_64

Steps:
1. Prepare 2 hosts and run
#setsebool -P virt_use_nfs 1
on both of them
2. #iptables -F
3. Mount NFS on both hosts and prepare 20 guests named mig{0..19}
4. To work around another bug, add the hostname and IP of each host to /etc/hosts on the other host
5. on source host
  a. on one console do
  #while(true); do for i in {0..19}; do virsh domblkinfo mig$i /mnt/mig$i; done;done
  b. on another console do
  #sh migrate.sh
  where migrate.sh is
  #!/bin/sh
  for i in {0..19}
  do
        virsh migrate --p2p --live mig$i qemu+ssh://{dest ip}/system &
  done
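
The migration step above can also be scripted so that the caller learns how many of the concurrent migrations failed. This is only a sketch: the function name migrate_all, the VIRSH and DEST_URI variables and the DEST_IP placeholder are assumptions for illustration and are not part of this report. Only the "virsh migrate --p2p --live" invocation and the mig$i guest names are taken from migrate.sh.

```shell
# Hypothetical variant of migrate.sh (names below are assumptions).
# VIRSH can be overridden to dry-run the loop without a hypervisor.
VIRSH=${VIRSH:-virsh}
# DEST_IP is a placeholder; replace it with the destination host.
DEST_URI=${DEST_URI:-qemu+ssh://DEST_IP/system}

# migrate_all COUNT
# Start COUNT live P2P migrations (mig0 .. mig<COUNT-1>) in parallel,
# wait for every one of them, and print the number that failed.
migrate_all() {
    count=$1
    pids=""
    failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        "$VIRSH" migrate --p2p --live "mig$i" "$DEST_URI" &
        pids="$pids $!"
        i=$((i + 1))
    done
    # Collect the exit status of each background migration.
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}
```

For the setup described in this comment it would be called as "migrate_all 20" on the source host, with DEST_URI set to the real destination URI.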

With libvirt-0.8.7-3.el6.x86_64 this sometimes ends in an error:
# virsh list
error: unable to connect to '/var/run/libvirt/libvirt-sock', libvirtd may need to be started: Connection refused
error: failed to connect to the hypervisor
# service libvirtd status
libvirtd dead but pid file exists

With libvirt-0.8.7-6.el6.x86_64, migration always succeeds and libvirtd is still running afterwards, so verification passes.
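
The "libvirtd dead but pid file exists" condition can be checked from a script after the migrations finish. A minimal sketch, assuming the daemon writes its PID to a pid file; the function name daemon_state and the path used in the usage note are assumptions, not taken from this report:

```shell
# daemon_state PIDFILE
# Print "running" if the PID stored in PIDFILE refers to a live process,
# "dead but pid file exists" if the file is there but the process is not,
# and "stopped" if there is no pid file at all.
# Note: kill -0 only probes the process and sends no signal; it can also
# fail for lack of permission, so run this as root for a system daemon.
daemon_state() {
    pidfile=$1
    if [ ! -f "$pidfile" ]; then
        echo "stopped"
        return
    fi
    pid=$(cat "$pidfile")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "dead but pid file exists"
    fi
}
```

For example, "daemon_state /var/run/libvirtd.pid" (path assumed) could be run on the source host once all migrations have returned.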

Comment 6 errata-xmlrpc 2011-05-19 13:26:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0596.html