| Summary: | concurrent migration with tcp connection will lose guests on target | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | weizhang <weizhan> |
| Component: | libvirt | Assignee: | Osier Yang <jyang> |
| Status: | CLOSED WORKSFORME | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.1 | CC: | dallan, dyuan, eblake, gren, juzhang, jyang, llim, mzhan, yoyzhang |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-12-12 15:27:03 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
Kyla, Could you check if there is error log on destination host?

Created attachment 493473 [details]
libvirtd.log and the log of missing guest

(In reply to comment #2)
> Created attachment 493473 [details]
> libvirtd.log and the log of missing guest

The log is on the target

there is no useful log in mig5.log, but in libvirtd log, we can see domain mig5 is crashed.

20:23:12.403: 28473: debug : qemuMonitorIO:601 : Triggering EOF callback error? 0
20:23:12.403: 28476: debug : virDomainObjRef:971 : obj=0x186d210 refs=3
20:23:12.403: 28473: debug : qemuHandleMonitorEOF:741 : Received EOF on 0x18618e0 'mig5'
20:23:12.403: 28476: debug : virGetDomain:381 : New hash entry 0x7f319815ffd0
20:23:12.403: 28476: debug : qemuMonitorStartCPUs:954 : mon=0x7f318c09dfe0
20:23:12.403: 28476: debug : virJSONValueToString:1042 : object=0x7f319809cf60
20:23:12.403: 28473: debug : qemuHandleMonitorEOF:756 : Monitor connection to 'mig5' closed without SHUTDOWN event; assuming the domain crashed
20:23:12.403: 28476: debug : virJSONValueToStringOne:976 : object=0x7f319809cf60 type=0 gen=0x7f319818ac70
20:23:12.403: 28473: debug : qemudShutdownVMDaemon:3460 : Shutting down VM 'mig5' pid=28891 migrated=0

(In reply to comment #4)
> there is no useful log in mig5.log, but in libvirtd log, we can see domain mig5
> is crashed.

Given that, isn't this not a bug?

(In reply to comment #5)
> (In reply to comment #4)
> > there is no useful log in mig5.log, but in libvirtd log, we can see domain mig5
> > is crashed.
>
> Given that, isn't this not a bug?

This shouldn't be a libvirt bug, but it may be a qemu bug: qemu fails to load the domain on the destination host and shuts the domain down silently. The test used "live" migration, which is why the guest disappeared from the destination host after migration.

Do we know why the guests crashed on the dst host? Is this a simple failure to allocate memory? 20 guests @ 256MB/guest == 5GB RAM. According to the description, the box only has 4GB RAM...am I missing something?

I'm a little suspicious of this BZ because the oVirt guys do this kind of test all the time and they're not reporting failures, and they're pretty vocal about that kind of thing.

I have already tested on
kernel-2.6.32-220.el6.x86_64
libvirt-0.9.8-1.el6.x86_64
qemu-kvm-0.12.1.2-2.209.el6.x86_64

seems can not reproduce

I test on 3 cores 4G mem machine, because I use the empty guests, so I can start them successfully.

(In reply to comment #12)
> seems can not reproduce
>
> I test on 3 cores 4G mem machine, because I use the empty guests, so I can
> start them successfully.

Ok, I'm going to close as WORKSFORME for now.
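
The two diagnostic questions raised in the comments (which domains crashed on the target, and whether 20 guests at 256MB fit in 4GB) could be checked with something like the sketch below. It is only an illustration: the function names `crashed_domains` and `total_guest_mem_mb` are hypothetical helpers, the log path in the usage comment is an assumption about where libvirtd logging is configured to go, and the match string is taken from the qemuHandleMonitorEOF line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every domain for which libvirtd
# logged "Monitor connection to '<name>' closed without SHUTDOWN event".
# $1 is the path to a libvirtd debug log.
crashed_domains() {
    sed -n "s/.*Monitor connection to '\([^']*\)' closed without SHUTDOWN event.*/\1/p" "$1" | sort -u
}

# Hypothetical helper: total configured guest memory in MB.
# $1 = number of guests, $2 = memory per guest in MB.
total_guest_mem_mb() {
    echo $(( $1 * $2 ))
}

# Example usage on the target host (log path is an assumption):
#   crashed_domains /var/log/libvirt/libvirtd.log
#   total_guest_mem_mb 20 256    # compare against MemTotal in /proc/meminfo
```

The memory figure is the configured total, not what the guests actually touch, so exceeding host RAM does not by itself prove an allocation failure; it only shows that the setup depends on overcommit.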

Created attachment 493417 [details]
result guest show

Description of problem:
I do migration with tcp connection on a machine with 4 cores and 4G mem. I start 20 guests, each with 1 vcpu and 256M mem. I do migration with the script mig.sh

#!/bin/sh
for i in {1..20}
do
    virsh migrate --live mig$i qemu+tcp://10.66.82.249/system &
done

After migration, the number of guests on target is less than 20. On source, there are no error reports, and all the guests are in shutoff status.

Version-Release number of selected component (if applicable):
libvirt-0.8.7-18.el6.x86_64
kernel-2.6.32-131.0.1.el6.x86_64
qemu-kvm-0.12.1.2-2.158.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Edit /etc/sysconfig/libvirtd
   LIBVIRTD_ARGS="--listen"
2. Edit /etc/libvirt/libvirtd.conf
   listen_tls = 0
   listen_tcp=1
   auth_tcp="none"
3. run
   # service libvirtd restart
4. mount nfs on both sides
5. do
   # setsebool -P virt_use_nfs 1
   on both sides
6. define and start 20 guests with name mig[n]
7. run
   # sh mig.sh

   cat mig.sh
   #!/bin/sh
   for i in {1..20}
   do
       virsh migrate --live mig$i qemu+tcp://10.66.82.249/system &
   done

Actual results:
The number of guests on target is less than 20. On source, there are no error reports, and all the guests are in shutoff status.

Expected results:
All the guests can migrate to the target host successfully.

Additional info:
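
Because mig.sh backgrounds every `virsh migrate` and exits without waiting, any non-zero exit status from an individual migration is never inspected. A variant that waits on each job and names the guests whose command failed might look like the sketch below. The function names `migrate_one` and `migrate_all` are hypothetical, and this is not a fix for the reported problem: if `virsh` itself returns success for a guest that later crashes on the target, nothing will be printed.

```shell
#!/bin/sh
# Hypothetical helper: migrate one guest ($1) to the destination URI ($2).
# Redefine this function for dry runs.
migrate_one() {
    virsh migrate --live "$1" "$2"
}

# Hypothetical helper: start all migrations concurrently, then wait on
# each PID and print "FAILED: <guest>" for every non-zero exit status.
# $1 = destination URI, remaining arguments = guest names (no ':' in names).
# Returns 0 only if every migration command succeeded.
migrate_all() {
    dest=$1
    shift
    pids=""
    for guest in "$@"; do
        migrate_one "$guest" "$dest" &
        pids="$pids $guest:$!"
    done
    failed=0
    for entry in $pids; do
        guest=${entry%%:*}
        pid=${entry##*:}
        if ! wait "$pid"; then
            echo "FAILED: $guest"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example usage with the URI from the report:
#   migrate_all qemu+tcp://10.66.82.249/system mig1 mig2 mig3
```

After it returns, running `virsh -c qemu+tcp://10.66.82.249/system list` and comparing the running guests against the expected 20 would show whether any guest was lost without its migration command reporting a failure.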