Bug 1269828
Summary: | [windows 10] [3.6 engine/3.5 cluster] Windows 10 are restarting - SYSTEM_THREAD_EXCEPTION_NOT_HANDLED
---|---
Product: | [oVirt] ovirt-engine
Reporter: | Jiri Belka <jbelka>
Component: | Frontend.WebAdmin
Assignee: | jniederm
Status: | CLOSED CURRENTRELEASE
QA Contact: | Nisim Simsolo <nsimsolo>
Severity: | medium
Docs Contact: |
Priority: | unspecified
Version: | 3.6.0
CC: | bugs, ehabkost, gklein, jbelka, jdenemar, jniederm, knoel, lijin, mavital, mgoldboi, michal.skrivanek, michen, nsimsolo, sbonazzo, tjelinek
Target Milestone: | ovirt-3.5.7
Keywords: | ZStream
Target Release: | 3.5.7
Flags: | rule-engine: ovirt-3.5.z+, ylavi: planning_ack+, michal.skrivanek: devel_ack+, mavital: testing_ack+
Hardware: | x86_64
OS: | Linux
Whiteboard: | virt
Fixed In Version: |
Doc Type: | Bug Fix
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2015-12-17 09:06:34 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | Virt
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: | 1277214, 1288090
Bug Blocks: |
Attachments: |
Do we support Windows 10 only with certain machine types? RHEV 3.5 here means rhel_6.5.0 type, 3.6 would be the latest 7.2 machine type...

(In reply to Michal Skrivanek from comment #2)
> Do we support Windows 10 only with certain machine types?
> RHEV 3.5 here means rhel_6.5.0 type, 3.6 would be the latest 7.2 machine
> type...

Yes, I believe Win10 is only supported with the rhel-7.2.0 machine type. Eduardo?

There is also a -cpu flag workaround: add +fsgsbase. There is a way to do this with libvirt. Jirka? I'm not sure if it works in this case, though. I'm curious if RHEV allows using this libvirt workaround. Thanks.

Yes, the host has a Xeon 5507, Gainestown series, based on the Nehalem microarchitecture (thx wikipedia). I'm going to search for a Westmere host.

So Windows 10 starts OK on a host (RHEL 6.7 with 3.5 vdsm) in a 3.5 cluster with the Westmere CPU model. (Anyway, I think there should be some check whether a specific OS type can be put into a specific cluster.)

And what was repaired? What behaviour should we expect?

Currently you cannot run a Windows 10 VM on any of these CPUs: conroe, penryn, nehalem, opteron_g1, opteron_g2, opteron_g3, opteron_g4, opteron_g5

Reassigned. The cluster level cannot be changed to 3.5. The change is rejected with the following webadmin message:
Error while executing action: Cannot decrease data center compatibility version

Relevant engine logs:
2015-11-05 15:37:58,288 WARN [org.ovirt.engine.core.bll.UpdateVdsGroupCommand] (ajp-/127.0.0.1:8702-5) [1d402ae3] CanDoAction of action 'UpdateVdsGroup' failed for user admin@internal.
Reasons: VAR__TYPE__CLUSTER,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_CANNOT_DECREASE_COMPATIBILITY_VERSION
2015-11-05 15:37:58,571 ERROR [org.ovirt.engine.core.bll.host.provider.foreman.SystemProviderFinder] (ajp-/127.0.0.1:8702-3) [] Failed to find host on any provider by host name 'dhcp163-68.scl.lab.tlv.redhat.com'
2015-11-05 15:37:59,289 ERROR [org.ovirt.engine.core.bll.host.provider.foreman.SystemProviderFinder] (ajp-/127.0.0.1:8702-4) [] Failed to find host on any provider by host name 'dhcp163-68.scl.lab.tlv.redhat.com'

Scenario:
1. Change data center compatibility version to 3.5
2. Change cluster compatibility version to 3.5 and click OK.

Actual result: Action rejected by webadmin.

The issue is relevant for both AMD and Intel CPU types (verified on AMD Opteron G5 and Intel Nehalem).
If https://bugzilla.redhat.com/show_bug.cgi?id=1269828#c12 is the bug fix, then an appropriate error message should appear in webadmin instead of "cannot decrease data center compatibility version" without an explanation.

Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

BTW, I verified that a Win10 VM can be moved from a 3.6 cluster to a 3.5 cluster (with a 3.5.6 host) and it functioned as expected (the Win10 VM can be run on a 3.6 engine with a 3.5 cluster). Host CPU type was Opteron G3.

Created attachment 1090194 [details]
Reassign engine log
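The rejection logged above comes from a validation step on cluster updates. As a rough illustration only, the two conditions for decreasing a cluster version that are stated in the reply that follows could be sketched like this. The `Cluster` class, the function name, and the reason strings are hypothetical and are not taken from the ovirt-engine source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int]  # compatibility version, e.g. (3, 5)


@dataclass
class Cluster:
    """Hypothetical stand-in for a cluster record."""
    version: Version
    host_count: int


def check_version_change(cluster: Cluster,
                         new_version: Version,
                         dc_version: Version) -> Optional[str]:
    """Return None if the change is allowed, otherwise a reason string.

    Only decreases are checked; upgrades are outside this sketch.
    """
    if new_version >= cluster.version:
        return None  # not a decrease, nothing to validate here
    if cluster.host_count > 0:
        return "cluster still contains hosts"
    if new_version < dc_version:
        return "new version is lower than the data center version"
    return None


# A 3.6 cluster that still has one host cannot be lowered to 3.5.
print(check_version_change(Cluster((3, 6), 1), (3, 5), (3, 5)))
```

Returning a distinct reason per failed condition is one way to address the request in this thread for a more explanatory message than a generic "cannot decrease" error.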
Decreasing cluster version is allowed only if:
- there are no hosts in the cluster
- the new version is not smaller than the DC version
This is an old behavior which has not been touched by this patch. If this is the case, then it is correct. Also, what exact version of engine are you testing this on?

It is strange that it was running on Opteron... Can you please double check the cluster's CPU Type? Because on the 3.6 branch I see this:
os.windows_10.cpu.unsupported.value = conroe, penryn, nehalem, opteron_g1, opteron_g2, opteron_g3, opteron_g4, opteron_g5

Hi. I verified it again; this time I first removed all the hosts from the cluster as you mentioned.
Exact verification scenario:
1. Install Windows 10 on 3.6 engine, 3.6 host (DC and cluster levels are 3.6).
2. Verify VM is running.
3. Power off VMs, remove host (verify no other hosts are in this cluster).
4. Reduce DC and cluster to compatibility version 3.5
5. Add 3.5 host to cluster.
6. Run Windows 10 VM.

Actual result: VM is running properly.

Tested versions:
engine: rhevm-3.6.0.3-0.1.el6 (3.6.0-17)
- 3.6 host:
libvirt-client-1.2.17-5.el7.x86_64
vdsm-4.17.10.1-0.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
- 3.5.6 host:
vdsm-4.16.29-1.el7ev.x86_64
qemu-kvm-rhev-2.1.2-23.el7_1.10.x86_64
libvirt-client-1.2.8-16.el7_1.4.x86_64
sanlock-3.2.2-2.el7.x86_64

Cluster CPU type: AMD Opteron G3
Host CPU type: Quad-Core AMD Opteron(tm) Processor 2350
VM OS type was set to Windows 10 x64 in webadmin during new VM creation (see bug https://bugzilla.redhat.com/show_bug.cgi?id=1278442)

Please move bug to ON_QA and I'll verify it.

Hi Nisim, the current expected behavior is that Win10 (32bit, 64bit) does NOT run on (among others) Opteron G1-G5. I suggest the following test procedure.
1. Make sure DC and Cluster levels are set to 3.5
2. Make sure Opteron G3 host is in this cluster
4. Make sure Cluster CPU is set to Opteron G3
5. Create VM, set guest OS to 'Other OS' (no real installation required)
6. (ASSERT) VM runs
7. Stop VM
8. Set host to maintenance
9. Set Cluster CPU to Haswell
10. Set VM guest OS to Win10 (no real installation required)
11. Set Cluster CPU to Opteron G3
12. Activate Host
13. Run VM

Expected result: VM refuses to run with error message
The guest OS doesn't support the following CPUs: opteron_g5, opteron_g3, opteron_g4, opteron_g1, opteron_g2, conroe, nehalem, penryn. Its possible to change the cluster cpu or set a different one per VM

Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.

Working since ~3.5.5. Westmere+ CPUs work fine (except el6 hosts with SandyBridge during W10 installation).
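For readers who want to see how the block list quoted in this thread translates into a run-time check, here is a minimal sketch. It assumes the property `os.windows_10.cpu.unsupported.value` is a comma-separated list of CPU identifiers and that the cluster CPU is available as the same kind of identifier (for example `opteron_g3`). The function names and the way the cluster CPU is passed in are assumptions for illustration, not the engine's actual API.

```python
from typing import Dict, Optional, Set

# Property value as quoted from the 3.6 branch in this thread.
OS_PROPERTIES: Dict[str, str] = {
    "os.windows_10.cpu.unsupported.value":
        "conroe, penryn, nehalem, opteron_g1, opteron_g2, "
        "opteron_g3, opteron_g4, opteron_g5",
}


def unsupported_cpus(os_id: str,
                     props: Dict[str, str] = OS_PROPERTIES) -> Set[str]:
    """Parse the comma-separated block list for one guest OS."""
    raw = props.get(f"os.{os_id}.cpu.unsupported.value", "")
    return {name.strip().lower() for name in raw.split(",") if name.strip()}


def run_blocker(os_id: str, cluster_cpu: str) -> Optional[str]:
    """Return an error message if the guest OS must not run on this CPU."""
    blocked = unsupported_cpus(os_id)
    if cluster_cpu.lower() in blocked:
        return ("The guest OS doesn't support the following CPUs: "
                + ", ".join(sorted(blocked)))
    return None  # no entry for this OS, or the CPU is not on the list


# Hypothetical usage: a Windows 10 VM in an Opteron G3 cluster is refused.
print(run_blocker("windows_10", "opteron_g3"))
```

An OS with no property entry yields an empty block list, so the check passes by default; that matches the 'Other OS' step of the test procedure, where the VM is expected to run.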
Created attachment 1080958 [details]
windows 10 error screenshot

Description of problem:
I had Windows 10 running on my 3.6 engine env (including RHEL7 hosts with 3.6 vdsm). I needed to test something, so I modified the Windows 10 VM settings and changed the cluster to another one with 3.5 cluster level (no warning, no error, so it seemed to be OK). Then I started this (previously working) Windows 10 VM on RHEL6 hosts with 3.5 vdsm in this 3.5 cluster.

Windows keeps restarting with the error: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED

If Windows 10 is not supported on 3.5 cluster level (which allows RHEL6 hosts, and thus older qemu-kvm etc.), then a check is missing for the case where a user modifies the VM configuration and changes the cluster level.

I'm speculating here, but if Windows 10 is supported only on 3.6 cluster level then:
- there should always be a check of which OS the VM has defined and whether the requested cluster level supports it
- there should be a warning, or it should not be allowed to change the cluster level to a lower one for specific OS types

Version-Release number of selected component (if applicable):
- host:
libvirt-0.10.2-54.el6.x86_64
vdsm-4.16.27-1.el6ev.x86_64
kernel-2.6.32-573.7.1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.479.el6_7.2.x86_64
- engine:
rhevm-webadmin-portal-3.6.0-0.18.el6.noarch

How reproducible:
tried once

Steps to Reproduce:
1. 3.6 engine, 3.6 cluster level, 3.6 host (RHEL7)
2. install Windows 10
3. 3.5 cluster level, 3.5 host (RHEL6)
4. while Windows 10 VM is down, modify cluster to 3.5 level
5. start Windows 10 VM

Actual results:
- no warning, no error about issues with downgrading the cluster level for Windows 10
- Windows 10 keeps restarting

Expected results:
no idea
- either it should not be allowed to downgrade the cluster level for OS types which require new features from higher cluster levels, or there should be a warning
- running Windows 10 on 3.5 cluster level?

Additional info: