Bug 634069 - Concurrent migrate multiple guests got libvirtd errors
Summary: Concurrent migrate multiple guests got libvirtd errors
Keywords:
Status: CLOSED DUPLICATE of bug 692663
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Eric Blake
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-09-15 06:44 UTC by Wayne Sun
Modified: 2011-06-17 18:29 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-17 18:29:51 UTC
Target Upstream Version:
Embargoed:


Attachments
/var/log/messages of source host (2.50 MB, text/plain)
2010-09-15 06:44 UTC, Wayne Sun
/var/log/messages of target host (1.19 MB, text/plain)
2010-09-15 06:46 UTC, Wayne Sun

Description Wayne Sun 2010-09-15 06:44:40 UTC
Created attachment 447395 [details]
/var/log/messages of source host

Description of problem:
When concurrently migrating multiple guests, these errors often occur:

libvirtd: 12:58:01.714: error : qemuMonitorJSONCommandWithFd:242 : cannot send monitor command '{"execute":"qmp_capabilities"}': Broken pipe

libvirtd: 11:14:41.085: error : qemuMonitorOpenUnix:279 : monitor socket did not show up.: Connection refused
libvirtd: 11:14:41.089: error : qemudWaitForMonitor:2550 : internal error process exited while connecting to monitor: char device redirected to /dev/pts/30#012inet_listen_opts: bind(ipv4,127.0.0.1,5926): Address already in use#012inet_listen_opts: FAILED#012

The error causes the guest migration to fail; the affected guest is broken and needs to be restarted.

In my test, I migrated 40 guests at the same time: 36 succeeded and 4 failed. When migrating the 36 guests back, 29 succeeded and 7 failed, so the problem is more severe on the way back.
I also tried migrating 30 and 20 guests and hit the same problem.

I'm using two big boxes, each with 48 CPUs and 500 GB of memory. The guests are minimal RHEL 6 guests.

Version-Release number of selected component (if applicable):
RC1 build: 20100826.1
# rpm -q libvirt qemu-kvm kernel
libvirt-0.8.1-27.el6.x86_64
qemu-kvm-0.12.1.2-2.113.el6.x86_64
kernel-2.6.32-71.el6.x86_64

How reproducible:
Often

Steps to Reproduce:
1. Concurrently run "virsh migrate --live guestname qemu+ssh://address/system" for multiple guests (see the sketch below).
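
A minimal reproduction sketch, assuming guest names vm01..vm40 and a target host target.example.com (both names are assumptions, not from this report):

# start all live migrations in parallel, then wait for them all to finish
for i in $(seq -w 1 40); do
    virsh migrate --live vm$i qemu+ssh://target.example.com/system &
done
wait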
  
Actual results:
Concurrent migration of multiple guests fails with errors.

Expected results:
Concurrent migration of multiple guests completes without errors.

Additional info:

Comment 1 Wayne Sun 2010-09-15 06:46:50 UTC
Created attachment 447396 [details]
/var/log/messages of target host

Comment 3 Wayne Sun 2010-09-15 07:51:12 UTC
For bi-directional concurrent migration of multiple guests, I tried migrating 20 guests bi-directionally between the 2 boxes, and a few guests also failed to migrate: 2 failed on one box and 6 failed on the other. The errors are the same, and there is also:
error: cannot send monitor command '{"execute":"qmp_capabilities"}': Connection reset by peer

Comment 4 Daniel Veillard 2011-01-12 07:18:36 UTC
Pasting here the explanation I gave in the IRC channel:

[15:14] <DV> gsun: the reason is in libvirt source: daemon/libvirtd.c
[15:14] <DV> static int min_workers = 5;
[15:14] <DV> static int max_workers = 20;
[15:14] <DV> static int max_clients = 20;
[15:15] <DV> in practice we allow only 20 simultaneous connections to a given libvirt daemon
[15:15] <DV> when doing a migration I think we open connections both ways
[15:16] <DV> add 2 connections for virtmanager and you know why only 18 migrations succeeded
[15:16] <DV> and 2 failed with no connections.

So that's not fixable without increasing that value and rebuilding libvirt.
Maybe we should do this ...
Retargeting to 6.1; maybe we can increase the number of connections without
harm.

Daniel

Comment 5 Daniel Veillard 2011-01-12 07:30:11 UTC
Actually, we can raise the number of connections just from
  /etc/libvirt/libvirtd.conf

and that's sufficient for the test:

[15:18] <gsun> DV, oh, i see. But modifying libvirtd.conf can change the max clients, right?
[15:18] <DV> hum
[15:19] <gsun> DV, last time I modified it and pushed the migration to 40 guests, and 36 succeeded
[15:19] <DV> ah yes

Daniel
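
As a concrete sketch of that workaround: max_clients and max_workers are existing settings in libvirtd.conf, but the values below are assumptions, not ones given in this bug.

# /etc/libvirt/libvirtd.conf (values are assumptions)
max_clients = 50
max_workers = 50

After editing, restart the daemon (service libvirtd restart on RHEL 6) so the new limits take effect.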

