Bug 1088216
| Summary: | Fail restore/migrate after hot-unplug the vcpus from guest | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Xuesong Zhang <xuzhang> |
| Component: | qemu-kvm | Assignee: | Virtualization Maintenance <virt-maint> |
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.6 | CC: | acathrow, bsarathy, chayang, dallan, dgilbert, dyuan, ehabkost, imammedo, juzhang, lersek, michen, mkenneth, mzhan, quintela, qzhang, virt-maint, xfu, zpeng |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-06-12 13:37:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Xuesong Zhang
2014-04-16 09:10:49 UTC
Migration fails if vCPUs have been hot-unplugged from the guest; there is no need to migrate back to hit the failure. Error message:

Migration: [ 96 %]error: operation failed: domain is no longer running

Another scenario fails for the same reason:

1. Hot-unplug vCPUs from the running guest.
2. Save the guest to the file guest.save.
3. Restore the guest from guest.save; the restore fails with the following error:

# virsh restore rhel6.5.save3
error: Failed to restore domain from rhel6.5.save3
error: Unable to read from monitor: Connection reset by peer

The relevant log output follows (qemu guest log, libvirtd log):

# tailf /var/log/libvirt/qemu/rhel6.5.log
2014-04-22 08:44:51.503+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name rhel6.5 -S -M rhel6.5.0 -enable-kvm -m 1024 -realtime mlock=off -smp 1,maxcpus=4,sockets=4,cores=1,threads=1 -uuid fe380b68-11c6-b7d0-e6a8-b466823497f8 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel6.5.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/var/lib/libvirt/images/rhel6.5-20140306.img,if=none,id=drive-ide0-0-0,format=raw,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=27,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:7a:c3:47,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/rhel6.5.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -incoming fd:25 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
char device redirected to /dev/pts/6
Unknown savevm section or instance 'cpu_common' 1
load of migration failed
2014-04-22 08:44:52.977+0000: shutting down

# tailf /var/log/libvirt/libvirtd.log
2014-04-22 08:44:52.977+0000: 2325: error : qemuMonitorIORead:514 : Unable to read from monitor: Connection reset by peer
2014-04-22 08:44:53.353+0000: 2326: warning : qemuDomainSaveImageStartVM:5524 : failed to restore save state label on /root/rhel6.5.save3

Laszlo, I think this is one of the bugs that sits somewhere between libvirt and qemu. From my understanding, the problem is that libvirt does not create the same command line on the destination (prior to the vCPU hot-unplug: -smp 2,maxcpus=4,sockets=4,cores=1,threads=1; on the destination after the hot-unplug: -smp 1,maxcpus=4,sockets=4,cores=1,threads=1). However, the other solution might be that qemu does not transfer the stale vCPU thread state to the destination. In this particular case, when vCPU #2 is unplugged, the vCPU thread might be joined within qemu, resulting in no ABI breakage. What's your opinion on this?

This is the key message, from comment 2, on the target host:

> Unknown savevm section or instance 'cpu_common' 1

This corresponds to the VMStateDescription object called "vmstate_cpu_common", in file "exec.c", which is instantiated for each VCPU.
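For illustration only, here is a minimal, self-contained toy model of that mechanism (this is not qemu code; the registration table and helper names below are invented): each VCPU created at startup registers one "cpu_common" vmstate instance whose instance id is its cpu_index, and an incoming section can only be loaded if a matching (name, instance id) registration exists on the destination.

#include <stdio.h>
#include <string.h>

/* Toy stand-in for the per-VCPU "cpu_common" vmstate registrations.
 * Not qemu code: just enough to show why a destination started with
 * -smp 1 cannot accept the "cpu_common" section with instance id 1. */
#define MAX_VCPUS 4

struct registration {
    const char *name;   /* e.g. "cpu_common" */
    int instance_id;    /* cpu_index of the VCPU that registered it */
    int in_use;
};

static struct registration regs[MAX_VCPUS];

/* Called once per VCPU created at startup (cpu_index = 0 .. smp_cpus-1). */
static void register_vmstate(const char *name, int instance_id)
{
    regs[instance_id].name = name;
    regs[instance_id].instance_id = instance_id;
    regs[instance_id].in_use = 1;
}

/* Called for every section found in the incoming migration stream. */
static int load_section(const char *name, int instance_id)
{
    if (instance_id < 0 || instance_id >= MAX_VCPUS ||
        !regs[instance_id].in_use ||
        strcmp(regs[instance_id].name, name) != 0) {
        printf("Unknown savevm section or instance '%s' %d\n",
               name, instance_id);
        return -1;
    }
    printf("loaded '%s' %d\n", name, instance_id);
    return 0;
}

int main(void)
{
    int smp_cpus = 1;   /* destination was started with -smp 1 */

    for (int i = 0; i < smp_cpus; i++) {
        register_vmstate("cpu_common", i);  /* only instance 0 exists */
    }

    /* The source still emits state for both of its VCPU objects,
     * including the one that was merely disabled in ACPI. */
    load_section("cpu_common", 0);  /* ok */
    load_section("cpu_common", 1);  /* fails -> migration load aborts */
    return 0;
}

With smp_cpus = 1 on the "destination", the second lookup produces the same kind of "Unknown savevm section or instance" failure that appears in the qemu log above.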
On the source host, you start with -smp 2, hence two VCPUs are created (each with its VCPU thread etc.). When you unplug one, the VCPU object and the thread stay; the VCPU just gets disabled in ACPI. (See bug 1017858 comment 30.) However, the VCPU object (disabled in ACPI) will nonetheless be part of the migration stream. When you start qemu-kvm with -smp 1 on the target host, apparently no VCPU object exists with instance_id == 1 to load the vmstate into.

VCPUs are created on startup in:

pc_init1()                      [hw/pc.c]
  LOOP: 0 <= i < smp_cpus
    pc_new_cpu()
      cpu_init()                [target-i386/cpu.h]
        cpu_x86_init()          [target-i386/helper.c]
          cpu_exec_init()       [exec.c]
            vmstate_register(... &vmstate_cpu_common ...)   <--- "cpu_common"
          qemu_init_vcpu()      [vl.c]
            ...

Unfortunately, joining (i.e. killing) the VCPU thread and releasing the VCPU object on hot-unplug (in addition to disabling it in ACPI) is out of scope for RHEL-6. I'm not sure how far Igor (CC'd) got with this feature upstream, but I think it's too intrusive for RHEL-6.

One thing we might entertain is a vmstate_unregister() call on VCPU hot-unplug (i.e. at the time it is disabled in ACPI), in the disable_processor() branch of qemu_system_cpu_hot_add() [hw/acpi.c]. Mirroring that, the enable_processor() branch of the same function should re-register the vmstate, but *only* when we re-enable (re-plug) a preexistent VCPU (object & thread), not when we create a brand-new one. There's no telling, of course, what else this would regress :(

From a quick look, I don't think it would cause the cpu_index space to "collapse". That is, if you omit the "cpu_common" VMSD with instance_id == 1 from the stream, that still allows other such VMSDs to keep their instance IDs (which are the cpu_index values for VCPUs), hence probably keeping the mapping to topology information (APIC IDs) intact as well. But this reasoning is hardly a hard proof :( I guess I could hack up a proof-of-concept patch for the vmstate_unregister()-on-unplug idea, but it's really not my expertise.

--------o--------

Regarding the reverse direction (i.e. migrating from a lower SMP count to a higher one) -- I'm not overly familiar with migration, but I think in this case the VCPU states that are not present in the migration stream are simply not loaded into memory -- those VCPUs keep their original (halted) state from the time the target qemu instance was started.

Does this help?... :(
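To make the vmstate_unregister()-on-unplug idea above concrete, here is a rough toy sketch, not a patch against hw/acpi.c; the structures, helpers, and the acpi_enabled/vmstate_present flags are invented for illustration. The point it models: unplugging drops only that VCPU's "cpu_common" registration (so the source stops emitting that section), the remaining registrations keep their instance ids (the cpu_index values), and re-plugging a preexistent VCPU simply re-registers the same instance id.

#include <stdio.h>

/* Toy model of the vmstate_unregister()-on-unplug idea; not qemu code,
 * names and data structures are invented for illustration only. */
#define MAX_VCPUS 4

struct vcpu {
    int cpu_index;        /* also used as the vmstate instance id   */
    int acpi_enabled;     /* what hot-(un)plug actually toggles     */
    int vmstate_present;  /* would this VCPU's section be emitted?  */
};

static struct vcpu vcpus[MAX_VCPUS];

static void unplug_vcpu(struct vcpu *cpu)
{
    cpu->acpi_enabled = 0;
    /* proposed addition: drop the registration so the migration stream
     * no longer carries a "cpu_common" section for this VCPU */
    cpu->vmstate_present = 0;
}

static void replug_vcpu(struct vcpu *cpu)
{
    cpu->acpi_enabled = 1;
    /* re-register only for a preexistent VCPU object; its instance id
     * (cpu_index) is unchanged, so nothing else "collapses" */
    cpu->vmstate_present = 1;
}

static void dump_stream(int smp_cpus)
{
    for (int i = 0; i < smp_cpus; i++) {
        if (vcpus[i].vmstate_present) {
            printf("stream carries 'cpu_common' instance %d\n", i);
        }
    }
}

int main(void)
{
    int smp_cpus = 2;                       /* source started with -smp 2 */

    for (int i = 0; i < smp_cpus; i++) {
        vcpus[i] = (struct vcpu){ i, 1, 1 };
    }

    unplug_vcpu(&vcpus[1]);                 /* hot-unplug VCPU #1        */
    dump_stream(smp_cpus);                  /* only instance 0 emitted   */

    replug_vcpu(&vcpus[1]);                 /* re-plug the same VCPU     */
    dump_stream(smp_cpus);                  /* instances 0 and 1 again   */
    return 0;
}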
@Igor, so what's your thought on this? Would it be possible just to not send offline VCPU state to the other side?

@Laszlo, well, if the qemu fix turns out to be too invasive, is there something libvirt can do? For instance, start the domain with '-smp 2' but do the vCPU hot-unplug prior to issuing 'cont'?

What happens if you start the incoming domain with -smp 2 and don't do anything else? Basically, start the incoming domain with -smp N, where N is the maximum number of VCPUs that the source domain has ever seen during its lifetime. (That's the number of the source's VCPU threads / objects.) If the source domain provides a vmstate object in the stream for each of the N VCPUs, that's best; if not (because some have been unplugged on the source before migration), then those VCPUs will remain parked on the target.

(In reply to Michal Privoznik from comment #5)
> @Igor, so what's your thought on this? Would it be possible just to not send
> offline VCPU state to the other side?

That would be too ugly, and without actually trying to implement all that Laszlo said above it's hard to say whether anything might regress.

> @Laszlo, well if qemu fix turns out too invasive is there something libvirt
> can do? For instance, start domain with '-smp 2' but prior issuing 'cont' do
> vcpu hotunplug?

Starting the destination domain with the maximum number of VCPUs the source has ever had should work. But to cross out all of the above: why do we care about vCPU unplug on RHEL 6 if it's not supported? (And I'm not aware that we are going to support it there at all.) Perhaps it would be easy to just disable decreasing the VCPU count in libvirt and be done with it.