Bug 1250357
Summary: | KVM: entry failed, hardware error 0x80000021 after migration from SandyBridge to Penryn host. | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Qian Guo <qiguo> | |
Component: | qemu-kvm-rhev | Assignee: | Radim Krčmář <rkrcmar> | |
Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 7.2 | CC: | amit.shah, bdas, dgilbert, juzhang, knoel, michen, qiguo, quintela, rkrcmar, virt-maint, weliao | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | | Doc Type: | Bug Fix | |
Doc Text: | | Story Points: | --- | |
Clone Of: | ||||
: | 1254480 | Environment: | |
Last Closed: | 2015-09-10 14:59:59 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1254480 |
Comment 2
Qian Guo
2015-08-18 08:34:04 UTC
I forgot to mention: migrating from the SandyBridge host to the Penryn host works smoothly.

Since Radim is offline and far away in an unknown land, I took a quick look.

So, it seems that by specifying "-cpu Penryn,+fsgsbase", the guest (Windows 10) assumes cpuid[FEAT_7_0_EBX] is valid and tries to use unsupported instructions. Actually, my guess is it might be trying to use clflushopt, just like in bug 1223317.

Does this happen only when migrating? It seems that just running a Windows 10 guest on the Penryn host with "-cpu Penryn,+fsgsbase" should be enough to hit this. Qian, can you please confirm?

(In reply to Bandan Das from comment #9)

Hi, Bandan

Windows 10 works well on the Penryn host when booted with "-cpu Penryn,+fsgsbase", and migration from Penryn to SandyBridge works well.

Thanks,
Qian

(In reply to Qian Guo from comment #10)

OK, so it's basically what Dave said in comment 6. Since fsgsbase is valid on SandyBridge, Windows 10 assumes certain instructions are supported and blows up on the Penryn host. Can you try one more test: migrate from Penryn to SandyBridge and then back. On the return path, the migration from SandyBridge to the Penryn host should succeed.

(In reply to Bandan Das from comment #11)

Hi, Bandan

When I reported this bug, I was doing this ping-pong migration, so it failed to migrate from SandyBridge to Penryn even after first migrating from Penryn to SandyBridge.

Thanks,
Qian

(In reply to Qian Guo from comment #12)

I am not sure I understand. Are you saying it fails migrating from Penryn to SandyBridge too?

Please confirm whether the test I mentioned in comment 11 fails with a recent build that contains the fix for bug 1223317.

(In reply to Bandan Das from comment #13)
> I am not sure I understand. Are you saying it fails migrating from Penryn to
> SandyBridge too?

No, I mean that migration from Penryn to SandyBridge works well, and the issue is hit when migrating back.

> Please confirm whether the test I mentioned in comment 11 fails with a recent
> build that contains the fix for bug 1223317.

Will try again and update here.

Hi, Bandan

Sorry for the slow response; the hosts were busy with other tests.

I retested with the latest build, qemu-kvm-rhev-2.3.0-22.el7.x86_64, and yes, you are right: with the latest build, if the first migration is from Penryn to SandyBridge (works well) and the guest is then migrated back, the guest works well. I ran the ping-pong test 10 times and the issue is gone.

If the first migration is from SandyBridge to Penryn, the guest crashes.

The command line is the same as in the comments above, with +fsgsbase.

Thanks,
Qian

OK, thanks for confirming.

This isn't a bug, because:
a) Penryn doesn't support fsgsbase, so exposing that feature to a guest on a CPU that doesn't have it may break the guest if the OS tries to use it.
b) 'enforce' correctly stops you from doing (a).

*** Bug 1254480 has been marked as a duplicate of this bug. ***
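For anyone landing here later, points (a) and (b) can be illustrated with a minimal sketch. This is not QEMU's implementation; it only models the idea that an 'enforce'-style check refuses to start a guest when a requested CPU feature is missing on the host. The function names and the feature sets are hypothetical, chosen to mirror what the thread describes (fsgsbase available on the source host, absent on the Penryn host).

```python
# Hypothetical model of an 'enforce'-style CPU feature check.
# Feature sets are illustrative only, not complete CPU model definitions.

def missing_features(requested, host_supported):
    """Return the requested guest features the host cannot provide, sorted."""
    return sorted(set(requested) - set(host_supported))


def check_enforce(requested, host_supported):
    """Raise instead of silently starting a guest with unsupported features."""
    missing = missing_features(requested, host_supported)
    if missing:
        raise RuntimeError(
            "host lacks requested CPU features: " + ", ".join(missing)
        )


# Assumed baseline for the guest model, plus the extra feature from the thread.
BASE_MODEL = {"sse3", "ssse3", "sse4.1"}
requested = BASE_MODEL | {"fsgsbase"}

# Assumed host capabilities, following the thread's description.
source_host = BASE_MODEL | {"sse4.2", "avx", "fsgsbase"}
penryn_host = set(BASE_MODEL)

print(missing_features(requested, source_host))  # nothing missing on the source
print(missing_features(requested, penryn_host))  # fsgsbase missing on Penryn
```

In this model, `check_enforce(requested, penryn_host)` raises, which corresponds to refusing the configuration up front rather than letting the guest crash once it executes an instruction the destination CPU does not implement.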