Bug 1338638
| Summary: | Migration fails after ejecting the cdrom in the guest | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Dan Zheng <dzheng> |
| Component: | qemu-kvm-rhev | Assignee: | John Snow <jsnow> |
| Status: | CLOSED ERRATA | QA Contact: | FuXiangChun <xfu> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.3 | CC: | chayang, dgilbert, dyuan, dzheng, fjin, gsun, huding, juzhang, knoel, mrezanin, mzhan, ngu, qizhu, virt-maint, xfu, yduan, zpeng |
| Target Milestone: | rc | Keywords: | Regression |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.6.0-26.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-11-07 21:11:36 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This looks like a fun one.
David: I'm trying a migrate like this:
jhuston@scv ((qemu-kvm-rhev-2.6.0-5.el7)) ~/s/q/b/git> ./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -cpu host -M pc -smp 4 -qmp tcp::4444,server,nowait -monitor stdio -hda /media/ext/img/f24b.qcow2 -cdrom /media/ext/iso/Fedora-Workstation-Live-x86_64-24_Beta-1.6.iso
and on the receiving end:
jhuston@scv ((qemu-kvm-rhev-2.6.0-5.el7)) ~/s/q/b/git> ./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -cpu host -M pc -smp 4 -monitor stdio -hda /media/ext/img/f24b.qcow2 -cdrom /media/ext/iso/Fedora-Workstation-Live-x86_64-24_Beta-1.6.iso -incoming tcp:localhost:1234
And back on the source VM, via HMP: "migrate tcp:localhost:1234"
Source VM:
(qemu) migrate tcp:127.0.0.1:1234
(qemu)
[no further output/errors. VM remains active and responsive.]
Destination VM:
(qemu) qemu-system-x86_64: load of migration failed: Input/output error
[VM closes with no further output.]
David: Any suggestions for getting better output out of this to see what's going on?
Oh that is fun.
short answer: I think blk_flush_all is returning ENOMEDIUM (123)
Longer version:
I turned on all of the tracing on the loading side and found that it was failing straight after loading the RAM - I'd expected it to have tried to load the CDROM device, but no it was failing sooner.
So then I turned on all the source side tracing, and it doesn't even get to trying to save the devices.
migration/migration.c migration_completion has:
if (!ret) {
******* ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
if (ret >= 0) {
ret = bdrv_inactivate_all();
}
I added some printfs and found that vm_stop_force_state is returning -123. It takes the first branch, which calls vm_stop, which in turn calls do_vm_stop, whose only non-zero return path is from
ret = blk_flush_all();
return ret;
You can debug this a bit easier with just the source VM; if you do a:
migrate "exec: cat > /dev/null"
and wait until it finishes and do an 'info migrate' it shows failed for me.
Dave
Thanks for the assist, David!
Looks like this (upstream) commit in the 2.5 timeframe introduced the regression:
commit fe1a9cbc339bb54d20f1ca4c1e8788d16944d5cf
Author: Max Reitz <mreitz>
Date: Wed Mar 16 19:54:40 2016 +0100
block: Move some bdrv_*_all() functions to BB
Move bdrv_commit_all() and bdrv_flush_all() to the BlockBackend level.
Signed-off-by: Max Reitz <mreitz>
Signed-off-by: Kevin Wolf <kwolf>
Hit this issue.
Version-Release number of selected component (if applicable):
kernel 3.10.0-505.el7.x86_64
qemu-kvm-rhev 2.6.0-24.el7.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Boot the guest with a cdrom on the src host and boot the guest with '-incoming' on the dst host.
2. Open the tray of the cdrom in the guest:
(qemu) info block
drive_syscd (#block110): /mnt/RHEL-7.3-20160901.1-Server-x86_64-dvd1.iso (raw, read-only)
Removable device: locked, tray closed
Cache mode: writeback, direct
drive_sysdisk (#block356): /mnt/sysdisk (qcow2)
Cache mode: writeback, direct
(qemu) eject drive_syscd
Device 'drive_syscd' is locked and force was not specified, wait for tray to open and try again
(qemu) info block
drive_syscd (#block110): /mnt/RHEL-7.3-20160901.1-Server-x86_64-dvd1.iso (raw, read-only)
Removable device: not locked, tray open
Cache mode: writeback, direct
drive_sysdisk (#block356): /mnt/sysdisk (qcow2)
Cache mode: writeback, direct
3. Start live migration.
(qemu) migrate -d tcp:$dst_host_ip:5800
{"execute": "migrate","arguments":{"uri": "tcp:$dst_host_ip:5800"}}
Actual results:
(qemu) qemu-kvm: load of migration failed: Input/output error
red_channel_client_disconnect_dummy: rcc=0x7fcc44e82000 (channel=0x7fcc46dd8aa0 type=5 id=0)
snd_channel_put: SndChannel=0x7fcc47a04000 freed
red_channel_client_disconnect_dummy: rcc=0x7fcc44df3000 (channel=0x7fcc46dd8940 type=6 id=0)
snd_channel_put: SndChannel=0x7fcc45934000 freed
red_channel_client_disconnect: rcc=0x7fcc45eb4000 (channel=0x7fcc45788600 type=2 id=0)
qemu-kvm: network script /etc/ifdown_script failed with status 256
red_channel_client_disconnect: rcc=0x7fcc44dee000 (channel=0x7fcc45777b80 type=4 id=0)
Fix under review upstream: https://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg03745.html
Fix included in qemu-kvm-rhev-2.6.0-26.el7
Reproduced with qemu-kvm-rhev-2.6.0-2.el7.x86_64. Steps are exactly the same as in comment 0.
In step 5, it prompts:
# virsh migrate bug --live --verbose --unsafe qemu+ssh://10.73.72.58:22/system
root.72.58's password:
Migration: [ 94 %]error: internal error: qemu unexpectedly closed the monitor: main_channel_link: add main channel client
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer:
red_channel_client_disconnect: rcc=0x7f631dd2c000 (channel=0x7f631c1e4600 type=2 id=0)
2016-09-23T08:21:09.881350Z qemu-kvm: load of migration failed: Input/output error
***************************************************************************
With qemu-kvm-rhev-2.6.0-26.el7.x86_64, migration succeeds without any error prompt. Steps are exactly the same as in comment 0.
It is also reproduced and verified as in comment 7. So this issue has been fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-2673.html
Description of problem:
Migration fails after ejecting the cdrom in the guest. This is a regression. There is no problem in qemu-kvm-rhev 2.5.0-4.el7.x86_64. And 2.6.0.1.el7.x86_64 also has this problem.
Version-Release number of selected component (if applicable):
kernel 3.10.0-327.el7.x86_64
qemu-kvm-rhev 2.6.0-2.el7.x86_64
libvirt 1.3.4-1.el7.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Configure the guest with a cdrom disk without cache='none'; the guest starts OK.
2. Dump the guest XML:
# virsh dumpxml avocado-vt-vm-ci
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images2/virt_iso.img'/>
  <backingStore/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
3. Log on to the guest and eject the cdrom:
# eject -v /dev/cdrom
eject: device name is `/dev/sr0'
eject: /dev/sr0: not mounted
eject: /dev/sr0: is whole-disk device
eject: /dev/sr0: is removable device
eject: /dev/sr0: trying to eject using CD-ROM eject command
eject: CD-ROM eject command succeeded
4. Check the guest XML; it is the same as before ejecting, apart from tray='open'.
# virsh dumpxml avocado-vt-vm-ci
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images2/virt_iso.img'/>
  <backingStore/>
  <target dev='hdc' bus='ide' tray='open'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
5. Try migration of the guest:
# virsh migrate avocado-vt-vm-ci --live --verbose --unsafe qemu+ssh://10.66.4.167:22/system
root.4.167's password:
Migration: [ 97 %]error: internal error: early end of file from monitor, possible problem:
warning: host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]
2016-05-20T10:06:14.489445Z qemu-kvm: load of migration failed: Input/output error
6. Check the XML again.
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw' cache='none'/>
  <backingStore/>
  <target dev='hdc' bus='ide' tray='open'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
7. Destroy the guest and configure it with cache='none'; the guest starts OK.
8. Dump the XML:
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images2/virt_iso.img'/>
  <backingStore/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
9. Repeat steps 3 to 5. Migration fails with the message:
error: internal error: process exited while connecting to monitor: 2016-05-23T07:40:03.494254Z qemu-kvm: -drive if=none,id=drive-ide0-1-0,readonly=on,cache=none: Must specify either driver or file
10. Dump the XML of the guest:
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw' cache='none'/>
  <backingStore/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
  <alias name='ide0-1-0'/>
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
Actual results:
The migration command fails and the guest is still running on the source host; no guest is on the target host.
Expected results:
The migration command should succeed. The guest is shut off on the source host and running on the target host.
Additional info:
In step 5 (without cache='none'), the qemu log on the target host shows:
...
***-drive if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0***
warning: host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]
2016-05-20T10:06:14.489445Z qemu-kvm: load of migration failed: Input/output error
In step 9 (with cache='none'):
-drive if=none,id=drive-ide0-1-0,readonly=on,cache=none -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
2016-05-23T07:40:03.494254Z qemu-kvm: -drive if=none,id=drive-ide0-1-0,readonly=on,cache=none: Must specify either driver or file