Red Hat Bugzilla – Bug 524719
Xen hypervisor doesn't mask xsave feature from the guest; Fedora 11 PV domU kernel crashes
Last modified: 2009-12-14 16:27:16 EST
Description of problem:
The F11 Xen PV domU kernel crashes early in boot, during CPU initialization, with an invalid opcode. The crash happens in xsave_cntxt_init() when the xsetbv instruction is executed.
The crash occurs because the Xen hypervisor does not mask the xsave CPU feature from the guest.
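To make the masking idea concrete, here is a minimal Python sketch, not the actual hypervisor or dom0 patch. It relies on the architectural fact that CPUID leaf 1 reports XSAVE in ECX bit 26 and OSXSAVE in ECX bit 27, and on Linux listing "xsave" in the flags line of /proc/cpuinfo when the kernel sees the feature. The function names mask_xsave and guest_sees_xsave are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only: models the CPUID bits involved in this bug.
# The helper names below are assumptions made for this example.

XSAVE_BIT = 1 << 26    # CPUID leaf 1, ECX bit 26: XSAVE supported by the CPU
OSXSAVE_BIT = 1 << 27  # CPUID leaf 1, ECX bit 27: OS has enabled XSAVE


def mask_xsave(ecx: int) -> int:
    """Return a 32-bit ECX value with the XSAVE and OSXSAVE bits cleared."""
    return ecx & ~(XSAVE_BIT | OSXSAVE_BIT) & 0xFFFFFFFF


def guest_sees_xsave(cpuinfo_text: str) -> bool:
    """Report whether the first 'flags' line of cpuinfo text lists xsave."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Split into whole tokens so that 'xsaveopt' does not match.
            return "xsave" in value.split()
    return False


# A guest that is shown the unmasked value believes XSAVE is usable.
raw_ecx = 0xFFFFFFFF
masked_ecx = mask_xsave(raw_ecx)
print(hex(masked_ecx))
print(guest_sees_xsave("flags\t\t: fpu vme sse2 xsave"))
```

Inside a running guest, passing the contents of /proc/cpuinfo to guest_sees_xsave is one way to check whether the feature is still being exposed.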
Version-Release number of selected component (if applicable):
Dom0 is RHEL 5.4, 2.6.18-164.el5xen.
How reproducible:
Always, if you have the "correct" hardware.
Steps to Reproduce:
1. Use virt-install (or virt-manager) to start an F11 Xen PV domU installation
2. The guest kernel crashes
Actual results:
The guest kernel crashes early with an invalid opcode.
Expected results:
The guest kernel starts and works OK.
Analysis and the bugfix changeset number from Jeremy Fitzhardinge:
F11 guest kernel boot/crash logs:
Cool, thanks for the workaround in the domU kernel. Note that I have a patch pending for RHEL-5.5 to actually properly do the masking in the RHEL-5 dom0 kernel (https://bugzilla.redhat.com/show_bug.cgi?id=502826), so either way we should be fixed.
Chris: do you need testing for that -164.el5virttest17 kernel?
Another solution from Jeremy here:
But I guess RHEL5 Xen doesn't support custom cpuid masking per domU. It would be nice to be able to do that as well :)
(In reply to comment #2)
> Chris: do you need testing for that -164.el5virttest17 kernel?
Additional testing is welcome, especially since I haven't tested it specifically to mask out fxsave (although I did test it to properly mask out GBpages).
> Another solution from Jeremy here:
> But I guess RHEL5 Xen doesn't support custom cpuid masking per domU. It would be
> nice to be able to do that as well :)
Right. I'm not sure how invasive that is, given that RHEL-5 is getting a bit long in the tooth. Nevertheless, if you feel it is a worthwhile feature, open up a bug against the RHEL-5 xen package and we'll see what we can do.
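For readers unfamiliar with per-domU cpuid masking, here is a hedged Python sketch of what such a mask could look like. It assumes the mask-string convention used by the cpuid option in upstream Xen domain configuration files: one character per bit, bit 31 first, with 'x' meaning "leave unchanged" and '0' meaning "force to zero". Whether a given Xen version honors this option for PV guests is not confirmed here, and this is not necessarily the solution referenced earlier in this bug. The helper name cpuid_mask is hypothetical.

```python
# Illustrative sketch only: builds a 32-character CPUID mask string under
# the assumed convention described above (bit 31 is the leftmost character).

def cpuid_mask(clear_bits):
    """Return a mask string with '0' at each bit in clear_bits, 'x' elsewhere."""
    chars = ["x"] * 32
    for bit in clear_bits:
        if not 0 <= bit <= 31:
            raise ValueError("bit index out of range: %r" % (bit,))
        chars[31 - bit] = "0"  # leftmost character corresponds to bit 31
    return "".join(chars)


# Clear XSAVE (ECX bit 26) and OSXSAVE (ECX bit 27) in CPUID leaf 1.
ecx_mask = cpuid_mask([26, 27])
config_line = "cpuid = [ '1:ecx=%s' ]" % ecx_mask
print(config_line)
```

The printed line is meant as a starting point to compare against the documentation of the Xen version in use, not as a tested configuration.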
It looks like -164.el5virttest17 doesn't help. The F11 GA kernel still crashes in the same way as before.
(early) Initializing CPU#0
(early) invalid opcode: 0000 [#1] (early) SMP (early)
Oh, yuck. I didn't port that part back. OK, I'll have to respin the patch for BZ 502826 with the NOXSAVE part backported. I'll keep you informed.
OK, I've now uploaded a new RHEL-5 dom0 kernel that should properly mask xsave. You can get it from:
Please let me know if that works for you.
Chris: -166.el5virttest18 fixed the problem! F11 GA PV domU boots/starts OK now.
(In reply to comment #7)
> Chris: -166.el5virttest18 fixed the problem! F11 GA PV domU boots/starts OK
> now.
Excellent, thanks for the testing.
Chris: I also opened a bug against rhel5 xen per your suggestion.
Is this fix already included in the -170 kernel?
Ah, I forgot all about this. I'm not 100% certain which kernel it went into, but it definitely went into the -172 kernel available here:
I'm actually going to close this as a dup since the fix for this went in along with the rest of the fixes for BZ 502826.
*** This bug has been marked as a duplicate of bug 502826 ***