Bug 634069
Summary: Concurrent migrate multiple guests got libvirtd errors
Product: Red Hat Enterprise Linux 6
Component: libvirt
Version: 6.1
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Hardware: All
OS: Linux
Target Milestone: rc
Reporter: Wayne Sun <gsun>
Assignee: Eric Blake <eblake>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dallan, eblake, gren, jialiu, llim, veillard, xen-maint, yoyzhang
Doc Type: Bug Fix
Last Closed: 2011-06-17 18:29:51 UTC

Created attachment 447396
/var/log/messages of target host

For bi-directional concurrent migration of multiple guests, I tried migrating 20 guests bi-directionally between the 2 boxes. A few guests still failed to migrate: 2 failed on one box and 6 on the other. The error is the same, and there is also this one:

error: cannot send monitor command '{"execute":"qmp_capabilities"}': Connection reset by peer

Pasting here the explanation I gave in the IRC channel:

[15:14] <DV> gsun: the reason is in libvrt source: daemon/libvirtd.c
[15:14] <DV> static int min_workers = 5;
[15:14] <DV> static int max_workers = 20;
[15:14] <DV> static int max_clients = 20;
[15:15] <DV> in practice we allow only 20 simulaneous connections to a given libvirt daemon
[15:15] <DV> when doing a migration I think we open connections both ways
[15:16] <DV> add 2 connections for virtmanager and you know why only 18 migrations suceedded
[15:16] <DV> and 2 failed with no connections.

So that is not fixable without increasing that value and rebuilding libvirt. Maybe we should do this ... Retargeting for 6.1; maybe we can increase the number of connections without harm.

Daniel

Actually, we can raise the number of connections just from /etc/libvirt/libvirtd.conf, and that's sufficient for the test:

[15:18] <gsun> DV, oh, i see. But by modify libvirtd.conf can change the max clients, right?
[15:18] <DV> hum
[15:19] <gsun> DV, for last time i did modify it and push the migration to 40 guests and 36 success
[15:19] <DV> ah yes

Daniel

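The connection arithmetic from the IRC explanation above can be sketched in a few lines of Python. This is only a rough model, not libvirt code: the function names `read_max_clients` and `migration_budget` are made up for illustration, and it assumes each migration holds exactly one client connection on a given daemon, which is what the "20 minus 2 for virt-manager leaves 18" reasoning implies. The real connection count per migration may differ.

```python
def read_max_clients(conf_text, default=20):
    """Return max_clients from libvirtd.conf-style text, else the default.

    The default of 20 is the compiled-in value quoted from
    daemon/libvirtd.c in the IRC log.
    """
    value = default
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if "=" not in line:
            continue
        key, _, raw = line.partition("=")
        if key.strip() == "max_clients":
            value = int(raw.strip())
    return value


def migration_budget(requested, max_clients, other_clients=0):
    """Split requested migrations into (expected to connect, expected to fail).

    Assumption: one client connection per migration on this daemon.
    """
    free = max(max_clients - other_clients, 0)
    ok = min(requested, free)
    return ok, requested - ok
```

With the default limit and 2 connections taken by virt-manager, `migration_budget(20, read_max_clients(""), other_clients=2)` gives 18 expected successes and 2 failures, matching the counts discussed above. Raising `max_clients` in /etc/libvirt/libvirtd.conf raises the budget accordingly.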
Created attachment 447395
/var/log/messages of source host

Description of problem:
When migrating multiple guests concurrently, these errors often occur:

libvirtd: 12:58:01.714: error : qemuMonitorJSONCommandWithFd:242 : cannot send monitor command '{"execute":"qmp_capabilities"}': Broken pipe
libvirtd: 11:14:41.085: error : qemuMonitorOpenUnix:279 : monitor socket did not show up.: Connection refused
libvirtd: 11:14:41.089: error : qemudWaitForMonitor:2550 : internal error process exited while connecting to monitor: char device redirected to /dev/pts/30#012inet_listen_opts: bind(ipv4,127.0.0.1,5926): Address already in use#012inet_listen_opts: FAILED#012

These errors prevent the guest from being migrated; the guest is left broken and needs to be restarted.

In my test, I migrated 40 guests at the same time: 36 succeeded and 4 failed. When migrating the 36 guests back, 29 succeeded and 7 failed, so the problem is more severe on the way back. I also tried migrating 30 and 20 guests and hit the same problem.

I'm using two big boxes, each with 48 CPUs and 500G of memory. Each guest is a minimal RHEL 6 guest.

Version-Release number of selected component (if applicable):
RC1 build: 20100826.1
# rpm -q libvirt qemu-kvm kernel
libvirt-0.8.1-27.el6.x86_64
qemu-kvm-0.12.1.2-2.113.el6.x86_64
kernel-2.6.32-71.el6.x86_64

How reproducible:
Often

Steps to Reproduce:
1. Concurrently run "virsh migrate --live guestname qemu+ssh://address/system" for multiple guests.

Actual results:
Concurrent migration of multiple guests produces errors.

Expected results:
Concurrent migration of multiple guests completes without errors.

Additional info:
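For anyone trying to reproduce this, the single reproduction step could be scripted roughly as follows. This is a sketch, not the script used in the original test: the helper names are hypothetical, the guest names and destination URI are placeholders, and the `run` parameter exists only so the logic can be exercised without `virsh` installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_migrate_command(guest, dest_uri):
    """Return the argv for one live migration, as in the reproduction step."""
    return ["virsh", "migrate", "--live", guest, dest_uri]


def migrate_concurrently(guests, dest_uri, run=subprocess.run):
    """Start one migration per guest at once; return {guest: exit code}.

    `run` defaults to subprocess.run and can be replaced by a stub.
    """
    guests = list(guests)
    if not guests:
        return {}

    def migrate_one(guest):
        result = run(build_migrate_command(guest, dest_uri),
                     capture_output=True, text=True)
        return guest, result.returncode

    # One thread per guest so all migrations are in flight together.
    with ThreadPoolExecutor(max_workers=len(guests)) as pool:
        return dict(pool.map(migrate_one, guests))
```

Calling `migrate_concurrently(["guest1", "guest2"], "qemu+ssh://address/system")` would launch both migrations at once; guests with a nonzero exit code in the returned mapping are the ones that failed.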