Bug 836209 - Can not migrate several guests concurrently
Status: CLOSED DUPLICATE of bug 827050
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: x86_64 Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Jiri Denemark
QA Contact: Virtualization Bugs
Keywords: Reopened
Reported: 2012-06-28 07:20 EDT by EricLee
Modified: 2012-07-04 03:45 EDT (History)
Doc Type: Bug Fix
Last Closed: 2012-07-04 03:45:22 EDT
Type: Bug

Attachments
/var/log/libvirt/qemu/mig-0.log (3.26 KB, text/x-log)
2012-06-28 07:20 EDT, EricLee
libvirtd-source.log (2.06 KB, text/x-log)
2012-06-28 07:21 EDT, EricLee
libvirtd-target.log (1.16 KB, text/x-log)
2012-06-28 07:21 EDT, EricLee

Description EricLee 2012-06-28 07:20:40 EDT
Created attachment 594993
/var/log/libvirt/qemu/mig-0.log

Description of problem:
Cannot migrate several guests concurrently.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.295.el6.x86_64
libvirt-0.9.10-21.el6.x86_64

How reproducible:
60%

Setup:
1. Prepare two hosts with an NFS share mounted on both, and set the virt_use_nfs SELinux boolean on both sides:

  # setsebool -P virt_use_nfs 1

   Then flush the iptables rules on both sides:

  # iptables -F

2. Prepare 10 guests on each side (20 in total), with their images on the shared NFS storage: mig-0 to mig-9 on the source and mig-10 to mig-19 on the target.

3. On both sides, set up the TCP connection environment:

a. Edit /etc/sysconfig/libvirtd
       LIBVIRTD_ARGS="--listen"

b. Edit /etc/libvirt/libvirtd.conf
       listen_tls = 0
       listen_tcp=1
       auth_tcp="none"

c. # service libvirtd restart

4. Prepare the p2p migration environment on the source and target hosts.
	On source:
	  # ssh-copy-id -i ~/.ssh/id_rsa.pub root@source-ip
	On target:
	  # ssh-copy-id -i ~/.ssh/id_rsa.pub root@target-ip
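The three settings from step 3b can be checked before a test run. The following is only a sketch: the helper names `conf_value` and `check_tcp_conf` are hypothetical and are not part of libvirt. It inspects the file only; it does not confirm that libvirtd was started with --listen or that it is actually accepting TCP connections.

```shell
# Sketch only: helper names are hypothetical, not libvirt tooling.

# Print the value of KEY ($1) from a libvirtd.conf-style FILE ($2), with
# surrounding whitespace and double quotes removed. Lines that start with
# '#' never match, and the last matching assignment wins.
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" |
        tail -n 1 | sed 's/[[:space:]]*$//; s/^"\(.*\)"$/\1/'
}

# Succeed only if FILE ($1) carries the three settings used in this report.
check_tcp_conf() {
    [ "$(conf_value listen_tls "$1")" = "0" ] &&
    [ "$(conf_value listen_tcp "$1")" = "1" ] &&
    [ "$(conf_value auth_tcp "$1")" = "none" ]
}

# Usage on either host:
#   check_tcp_conf /etc/libvirt/libvirtd.conf || echo "TCP settings differ"
```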

Actions:
1. Start all the guests on both sides.

   On source:

   # for i in {0..9}; do virsh start mig-$i; done

   On target:

   # for i in {10..19}; do virsh start mig-$i; done

2. Start the migrations in both directions at once.

   On source:
   # for i in {0..9}; do virsh migrate --live mig-$i qemu+tcp://10.66.6.38/system --verbose & echo mig-$i; done

   As close to simultaneously as possible, run the following command on target:
   # for i in {0..9}; do virsh migrate --live mig-1$i qemu+tcp://10.66.5.143/system --verbose & echo mig-1$i; done
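The loops in step 2 put each virsh migrate in the background and return immediately, so per-guest failures are easy to lose in the interleaved output. The following sketch starts the same migrations concurrently, waits for every one, and prints one result line per guest. The function name `migrate_all` and the `MIGRATE_CMD` variable are hypothetical, introduced here only so the loop can be dry-run without libvirt; guest names are assumed to contain no whitespace.

```shell
# Sketch only: migrate_all and MIGRATE_CMD are hypothetical names.
# MIGRATE_CMD defaults to the virsh invocation used in this report.
MIGRATE_CMD=${MIGRATE_CMD:-"virsh migrate --live --verbose"}

# migrate_all DEST_URI GUEST...
# Starts one migration per guest in the background, waits for each, and
# returns the number of failed migrations as its exit status.
migrate_all() {
    dest_uri=$1; shift
    job_list=""
    failed=0

    for guest in "$@"; do
        # Left unquoted on purpose so the command string splits into words.
        $MIGRATE_CMD "$guest" "$dest_uri" &
        job_list="$job_list $!:$guest"
    done

    for job in $job_list; do
        pid=${job%%:*}
        guest=${job#*:}
        if wait "$pid"; then
            echo "OK   $guest"
        else
            echo "FAIL $guest"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

# Usage on source (bash, for the brace expansion):
#   migrate_all qemu+tcp://10.66.6.38/system mig-{0..9}
```

Waiting on each PID individually keeps the exit status of every migration, which a bare `wait` with no arguments would discard.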

Actual results:
	Some guests fail to migrate to the target, with errors like:
	error: internal error Invalid file descriptor while waiting for monitor

	error: internal error Unable to set monitor close-on-exec flag

	error: Unable to write to monitor: Bad file descriptor

	error: An error occurred, but the cause is unknown

	error: internal error Error while processing monitor IO

	error: error: internal error Unknown JSON reply '{"execute":"qmp_capabilities","id":"libvirt-1"}'
	Unable to read from monitor: Bad file descriptor

	After migration:
	On target:
	# virsh list 
	 Id    Name                           State
	----------------------------------------------------
	 1     mig-10                         running
	 2     mig-11                         running
	 4     mig-13                         running
	 5     mig-14                         running
	 6     mig-15                         running
	 8     mig-17                         running
	 9     mig-18                         running
	 10    mig-19                         running
	 11    mig-0                          running
	 12    mig-8                          running
	 18    mig-9                          running
	 19    mig-1                          running
	 20    mig-4                          running

	On source:
	# virsh list 
	 Id    Name                           State
	----------------------------------------------------
	 3     mig-2                          running
	 4     mig-3                          running
	 6     mig-5                          running
	 7     mig-6                          running
	 8     mig-7                          running
	 12    mig-12                         running
	 20    mig-16                         running
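Comparing the two listings by eye is error-prone with 20 guests. The following sketch reads `virsh list` output on standard input and prints which of the expected guests are not running on that host. The helper names `running_guests` and `missing_guests` are hypothetical. Fed the target listing above with mig-0 through mig-9 as the expected names, it prints mig-2, mig-3, mig-5, mig-6 and mig-7, which are the guests still shown on the source.

```shell
# Sketch only: helper names are hypothetical, not part of virsh.

# Print the names of running guests from `virsh list` output on stdin.
# Only rows with a numeric Id and the state "running" are considered, so
# the header, the separator line, and blank lines are skipped.
running_guests() {
    awk '$1 ~ /^[0-9]+$/ && $3 == "running" { print $2 }'
}

# missing_guests GUEST...
# Reads `virsh list` output on stdin and prints each expected guest that
# is not listed as running.
missing_guests() {
    running=" $(running_guests | tr '\n' ' ')"
    for guest in "$@"; do
        case "$running" in
            *" $guest "*) ;;
            *) echo "$guest" ;;
        esac
    done
}

# Usage on target after the migrations finish (bash, for the brace expansion):
#   virsh list | missing_guests mig-{0..9}
```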

Expected results:
	Migration succeeds without errors, and all 20 guests migrate to their respective destinations.

Additional info:
	I will attach the source and target libvirtd logs, plus one /var/log/libvirt/qemu/$guest.log (the logs of almost all failed guests are the same).
	
	I can only reproduce the bug in certain environments, and I am not sure what the root cause is.
	
	I get the same result when using qemu+ssh to migrate the guests.
Comment 1 EricLee 2012-06-28 07:21:22 EDT
Created attachment 594994
libvirtd-source.log
Comment 2 EricLee 2012-06-28 07:21:57 EDT
Created attachment 594995
libvirtd-target.log
Comment 4 Eric Blake 2012-06-28 07:29:40 EDT
(In reply to comment #0)

> libvirt-0.9.10-21.el6.x86_64

Known double close bugs (see bug 827050) have this symptom (among others):

> Actual results:
> 	There are some guests could not migrate to target with errors like:
> 	error: internal error Invalid file descriptor while waiting for monitor
> 
> 	error: internal error Unable to set monitor close-on-exec flag

Retry your test with 0.9.10-21.el6_3.1 or newer; for now, I'm closing this as duplicate, but we can reopen if you can reproduce with a version of libvirt without the double-close bugs.

*** This bug has been marked as a duplicate of bug 827050 ***
Comment 5 EricLee 2012-06-28 07:52:42 EDT
(In reply to comment #4)
> Retry your test with 0.9.10-21.el6_3.1 or newer; for now, I'm closing this
> as duplicate, but we can reopen if you can reproduce with a version of
> libvirt without the double-close bugs.
> 
> *** This bug has been marked as a duplicate of bug 827050 ***

I can reproduce the bug with 0.9.10-21.el6_3.1 and get errors like:
error: internal error Error while processing monitor IO

error: An error occurred, but the cause is unknown

error: Unable to read from monitor: Bad file descriptor

error: internal error Invalid file descriptor while waiting for monitor

error: internal error Error while processing monitor IO

error: internal error Invalid file descriptor while waiting for monitor
Comment 6 EricLee 2012-07-03 02:04:22 EDT
According to comment 5, this bug is not simply a duplicate of bug 827050, so I think we should reopen it for tracking.
What do you think, Blake?
Comment 7 Dave Allan 2012-07-03 09:49:28 EDT
EricLee, you need to set needinfo for people when you ask them questions.  I agree that it should be reopened, so I have done so.
Comment 8 Jiri Denemark 2012-07-03 15:40:27 EDT
Yeah, this really looks like the double close bugs we fixed. Could you attach libvirtd logs generated with libvirt-0.9.10-21.el6_3.1?
Comment 9 EricLee 2012-07-03 22:15:49 EDT
(In reply to comment #7)
> EricLee, you need to set needinfo for people when you ask them questions.  I
> agree that it should be reopened, so I have done so.

Thanks for the reminder, Dave.
Comment 10 EricLee 2012-07-03 22:28:05 EDT
(In reply to comment #8)
> Yeah, this really looks like the double close bugs we fixed. Could you
> attach libvirtd logs generated with libvirt-0.9.10-21.el6_3.1?

Hi Jiri,

I have to report that I cannot reproduce the bug with libvirt-0.9.10-21.el6_3.1 this time; it seems to work well, and I don't know why I could reproduce it last time. However, I can still reproduce it with libvirt-0.9.10-21.el6 again and again.

So I think you and Blake are right: it really is the double close bug fixed in libvirt-0.9.10-21.el6_3.1. We can close this as a duplicate of bug 827050.

Thanks for your patience.

EricLee
Comment 11 Jiri Denemark 2012-07-04 03:45:22 EDT
I suspect libvirtd wasn't properly restarted for some reason after installing the new version of the package. It's great you cannot reproduce it anymore :-)

*** This bug has been marked as a duplicate of bug 827050 ***
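The suspicion in comment 11, that libvirtd kept running the old binary after the package update, can be checked by comparing the package install time with the daemon start time. The following is only a sketch: `daemon_is_stale` and `check_libvirtd` are hypothetical helper names, and it assumes an RPM-based host where the daemon is shipped in the `libvirt` package, a single installed build of that package, and GNU `date` for parsing the `ps` start time.

```shell
# Sketch only: helper names are hypothetical.

# daemon_is_stale INSTALL_EPOCH START_EPOCH
# Succeeds when the daemon started before the package was installed, which
# suggests the running process predates the update.
daemon_is_stale() {
    [ "$2" -lt "$1" ]
}

# Gather both timestamps on the local host and report the result.
# Assumes the package name "libvirt" and GNU date.
check_libvirtd() {
    pid=$(pidof -s libvirtd) || { echo "libvirtd is not running"; return 2; }
    installed=$(rpm -q --qf '%{INSTALLTIME}\n' libvirt) || return 2
    started=$(date -d "$(ps -o lstart= -p "$pid")" +%s) || return 2

    if daemon_is_stale "$installed" "$started"; then
        echo "libvirtd (pid $pid) started before the package was installed; restart it"
        return 1
    fi
    echo "libvirtd (pid $pid) started after the package was installed"
}

# Usage on each host after updating the package:
#   check_libvirtd || service libvirtd restart
```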
