Bug 1086172
Summary: | Migrating Windows guest with virtio-scsi from RHEL-6.5 to 7.0 host broken for machine rhel6.3.0 or older | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | huiqingding <huding> |
Component: | virtio-win | Assignee: | Vadim Rozenfeld <vrozenfe> |
Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.0 | CC: | areis, bcao, hhuang, huding, juzhang, pbonzini, qzhang, rbalakri, rhod, sluo, virt-maint, vrozenfe, xfu |
Target Milestone: | rc | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Known Issue | |
Doc Text: |
Under certain circumstances, migrating from a Red Hat Enterprise Linux 6 host to a Red Hat Enterprise Linux 7 host fails with the machine type rhel6.3.0 or earlier if a virtio-scsi device is present. This problem has been observed only with a Windows 8 64-bit guest machine. To work around this problem, upgrade the machine type to rhel6.4.0 or later.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2014-12-03 22:34:18 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
huiqingding
2014-04-10 09:49:54 UTC
I also tested qemu-kvm-1.5.3-53.el7.x86_64 and hit this issue as well, so it is not a regression.
Booting a Win8-64 guest with only a virtio-scsi disk and "param_change=off" set reproduces this issue. The command line is as follows:
/usr/libexec/qemu-kvm -M rhel6.5.0 -cpu Westmere,hv_relaxed -enable-kvm -m 2048 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -drive file=/mnt/win8-64.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x13,indirect_desc=on,event_idx=off,hotplug=on,num_queues=1,max_sectors=512,cmd_per_lun=16,multifunction=on,rombar=64,param_change=off -device scsi-hd,drive=drive-system-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk2,logical_block_size=512,physical_block_size=512,min_io_size=512,opt_io_size=512,discard_granularity=512,ver=fuxc-scsi,serial=fuxc-scsi-serial,removable=off,wwn=0x16,channel=0,scsi-id=2,lun=0,bootindex=0 -vnc :10 -monitor stdio -nodefconfig -net none

Fails with a Win8-64 guest, but not a RHEL-6.5 guest. Intriguing.
param_change isn't meant to be used on the command line; it's for the machine type compatibility machinery only. Machine types rhel6.3.0 and older set param_change off. Newer types set param_change on.
Could you please re-run your reproducer without setting param_change on the command line? At least machine type rhel6.3.0 and rhel6.5.0, but preferably all of them. At least with your Win8-64 guest, but preferably also with your RHEL-6.5 guest.
Running all these tests may take time. Partial test results could help me; so if getting full results takes an extra day or more, please post partial results as they become available.
I guess you'll observe failure for rhel6.3.0 and older, exactly like you did with an explicit param_change=off, and success for newer machine types, exactly like you did with param_change=on.
Thanks in advance!
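The machine-type rule described in the comment above (rhel6.3.0 and older set param_change off, newer types set it on) can be sketched in a few lines of Python. This is only an illustrative model of that rule, not QEMU code; the function names `machine_version` and `default_param_change` are hypothetical.

```python
# Illustrative model of the rule stated in the comment above.
# The helper names are hypothetical; this is not how QEMU implements
# its compatibility properties.

def machine_version(machine_type):
    """Parse a machine type such as 'rhel6.3.0' into the tuple (6, 3, 0)."""
    prefix = "rhel"
    if not machine_type.startswith(prefix):
        raise ValueError(f"unexpected machine type: {machine_type!r}")
    return tuple(int(part) for part in machine_type[len(prefix):].split("."))


def default_param_change(machine_type):
    """Return True if the machine type leaves param_change on by default.

    Per the comment: rhel6.3.0 and older switch it off, newer types keep it on.
    """
    return machine_version(machine_type) > (6, 3, 0)


for mt in ["rhel6.1.0", "rhel6.2.0", "rhel6.3.0", "rhel6.4.0", "rhel6.5.0"]:
    state = "on" if default_param_change(mt) else "off"
    print(f"{mt}: param_change={state}")
```

Under this model, rhel6.1.0 through rhel6.3.0 come out as param_change=off and rhel6.4.0 and rhel6.5.0 as param_change=on.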
>
> Could you please re-run your reproducer without setting param_change on the
> command line? At least machine type rhel6.3.0 and rhel6.5.0, but preferably
> all of them. At least with your Win8-64 guest, but preferably also with
> your RHEL-6.5 guest.
>
I migrated a win8-64 guest and a rhel6.5-64 guest from a rhel6.5 host to a rhel7.0 host without setting param_change on the command line. The results are as follows:
guest: win8-64
host              machine type   result
rhel6.5->rhel7.0  -M rhel6.1.0   migration failed[1]
rhel6.5->rhel7.0  -M rhel6.2.0   migration failed[1]
rhel6.5->rhel7.0  -M rhel6.3.0   migration failed[1]
rhel6.5->rhel7.0  -M rhel6.4.0   migration success[2]
rhel6.5->rhel7.0  -M rhel6.5.0   migration success[2]
guest: rhel6.5-64
host              machine type   result
rhel6.5->rhel7.0  -M rhel6.1.0   migration success[2]
rhel6.5->rhel7.0  -M rhel6.2.0   migration success[2]
rhel6.5->rhel7.0  -M rhel6.3.0   migration success[2]
rhel6.5->rhel7.0  -M rhel6.4.0   migration success[2]
rhel6.5->rhel7.0  -M rhel6.5.0   migration success[2]
[1] The destination qemu-kvm quits with this error:
(qemu) qemu-kvm: Features 0x6 unsupported. Allowed features: 0x51000002
qemu: warning: error while loading state for instance 0x0 of device '0000:00:13.0/virtio-scsi'
load of migration failed
[2] The destination qemu-kvm reports no error, and its monitor shows:
(qemu) info status
VM status: running
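The two masks in error [1] can be decoded bit by bit. The sketch below is a minimal Python illustration, not output from QEMU. The bit names are assumptions taken from the virtio specification and QEMU headers, not from this report, and it assumes the first number in the message is the feature set carried in the incoming migration stream.

```python
# Bit positions and names are assumptions from the virtio specification
# and QEMU headers; they do not appear in this bug report.
FEATURE_BITS = {
    0: "VIRTIO_SCSI_F_INOUT",
    1: "VIRTIO_SCSI_F_HOTPLUG",
    2: "VIRTIO_SCSI_F_CHANGE",
    24: "VIRTIO_F_NOTIFY_ON_EMPTY",
    28: "VIRTIO_RING_F_INDIRECT_DESC",
    29: "VIRTIO_RING_F_EVENT_IDX",
    30: "VIRTIO_F_BAD_FEATURE",
}


def decode(mask):
    """Return the names of the bits set in a 32-bit feature mask, lowest first."""
    return [FEATURE_BITS.get(bit, f"bit {bit}")
            for bit in range(32) if (mask >> bit) & 1]


def rejected(guest, allowed):
    """Return the bits present in the guest mask but absent from the allowed mask."""
    return guest & ~allowed & 0xFFFFFFFF


guest_features = 0x6            # "Features 0x6 unsupported"
allowed_features = 0x51000002   # "Allowed features: 0x51000002"

print("guest:   ", decode(guest_features))
print("allowed: ", decode(allowed_features))
extra = rejected(guest_features, allowed_features)
print("rejected:", hex(extra), decode(extra))
```

Under these assumptions the only bit outside the allowed mask is 0x4, which is the feature that param_change is understood to control. That would be consistent with the failure appearing only on machine types that switch param_change off.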
The command line used is as follows:
/usr/libexec/qemu-kvm -M rhelxxx -cpu Westmere,hv_relaxed -enable-kvm -m 2048 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -drive file=/mnt/RHEL-Server-6.5-64.qcow2.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x13,indirect_desc=on,event_idx=off,hotplug=on,num_queues=1,max_sectors=512,cmd_per_lun=16,multifunction=on,rombar=64 -device scsi-hd,drive=drive-system-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk2,logical_block_size=512,physical_block_size=512,min_io_size=512,opt_io_size=512,discard_granularity=512,ver=fuxc-scsi,serial=fuxc-scsi-serial,removable=off,wwn=0x16,channel=0,scsi-id=2,lun=0,bootindex=0 -vnc :10 -monitor stdio -nodefconfig -net none
That was quick; thanks!
To assess the severity of this bug, we need to know whether it bites only with Windows 8. Could you please try one of the failing test cases, say rhel6.5->rhel7.0 -M rhel6.3.0, with a Windows 7 guest?

Thanks Markus.
Hi Huding,
Can you have a try and update the issue in the bz?
Best Regards,
Junyi

(In reply to Markus Armbruster from comment #8)
> To assess the severity of this bug, we need to know whether it bites only
> with Windows 8. Could you please try one of the failing test cases, say
> rhel6.5->rhel7.0 -M rhel6.3.0, with a Windows 7 guest?
I used the command line from comment 6 to test Windows 7 32-bit and 64-bit guests and hit this issue as well.
host              guest    machine type   result
rhel6.5->rhel7.0  win7-32  -M rhel6.3.0   migration failed[1]
rhel6.5->rhel7.0  win7-64  -M rhel6.3.0   migration failed[1]
[1] The destination qemu-kvm quits with this error:
(qemu) qemu-kvm: Features 0x6 unsupported. Allowed features: 0x51000002
qemu: warning: error while loading state for instance 0x0 of device '0000:00:13.0/virtio-scsi'
load of migration failed

virtio-scsi with -rhel6.3.0 isn't supported.
Closing BZ.

(In reply to Andrew Cathrow from comment #11)
> virtio-scsi with -rhel6.3.0 isn't supported.
That is OK on the QE side if we can make sure real customers cannot hit this problem.
Best Regards,
Junyi
>
> Closing BZ.

Additional information on "isn't supported": virtio-scsi came out of tech preview in RHEL-6.4. We don't support it with older machine types even on newer hosts.
Example: Charlie decides to check out the virtio-scsi tech preview in his brand new RHEL-6.3 host. He creates a RHEL-6.3 guest and a Windows 8 guest using it, and everything works fine.
Time passes, RHEL-6.4+ comes out. Charlie upgrades his host and guests to the latest software. Everything still works fine. Charlie puts the guests into production.
Time passes, RHEL-7.0 comes out. Charlie installs it on a new host, then attempts to migrate his guests. Migrating the RHEL-6 guest succeeds, but the Windows 8 guest fails.
This is not supported. Charlie should have upgraded his *machine type* in addition to his host & guest software before putting his guests into production.

Re comment#12: I have no idea whether customers are prone to add virtio-scsi devices to old machine types. I don't know how to best "make sure the real customer can not hit this problem".
The migration problem exists because we go out of our way to preserve ABI in an unsupported case: we switch param_change off for machine types rhel6.3.0 and older. Looks like this creates as much of a problem as it solves. And both the created and solved problem are with unsupported usage.
Makes me wonder whether compatibility properties for unsupported (machine type, device, property) triples make sense at all.

(In reply to Markus Armbruster from comment #13)
> Additional information on "isn't supported": virtio-scsi came out of
> tech preview in RHEL-6.4. We don't support it with older machine
> types even on newer hosts.
>
> Example: Charlie decides to check out the virtio-scsi tech preview in
> his brand new RHEL-6.3 host. He creates a RHEL-6.3 guest and a
> Windows 8 guest using it, and everything works fine.
>
> Time passes, RHEL-6.4+ comes out. Charlie upgrades his host and
> guests to the latest software. Everything still works fine. Charlie
> puts the guests into production.
>
> Time passes, RHEL-7.0 comes out. Charlie installs it on a new host,
> then attempts to migrate his guests. Migrating the RHEL-6 guest
> succeeds, but the Windows 8 guest fails.
>
> This is not supported. Charlie should have upgraded his *machine
> type* in addition to his host & guest software before putting his
> guests into production.
Makes sense, and thanks for the extra explanation.
Best Regards,
Junyi

(In reply to Markus Armbruster from comment #14)
> Re comment#12: I have no idea whether customers are prone to add
> virtio-scsi devices to old machine types. I don't know how to best
> "make sure the real customer can not hit this problem".
Can we document it in the release notes or somewhere else?
>
> The migration problem exists because we go out of our way to preserve
> ABI in an unsupported case: we switch param_change off for machine
> types rhel6.3.0 and older. Looks like this creates as much of a
> problem as it solves.
Every coin has 2 sides.
Best Regards,
Junyi
> And both the created and solved problem are
> with unsupported usage.
>
> Makes me wonder whether compatibility properties for unsupported
> (machine type, device, property) triples make sense at all.

The root cause of the bug is still unknown. We might want to figure it out, just to make sure there are no surprises. I'm leaving that decision to Paolo.

Doc text change: s/requires a reboot/requires restarting the VM (guest reboot)/

No, it should be the same virtio-win bug.