Bug 437028
Summary: | KVM: Intel/VMX: host kernel should support DEBUGCTLMSR=0x1d9 | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Jan Kratochvil <jan.kratochvil> |
Component: | kernel | Assignee: | fedora-kernel-kvm |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 19 | CC: | avi, berrange, clalance, crobinso, gansalmon, gcosta, itamar, jan.kratochvil, jforbes, jonathan, kernel-maint, madhu.chinakonda, madko, mjw, roland, virt-maint |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | kernel-3.11.1-200.fc19.x86_64 | Doc Type: | Bug Fix |
Last Closed: | 2013-09-29 14:47:32 UTC | Type: | --- |
Description
Jan Kratochvil
2008-03-11 18:27:27 UTC
The main question here is whether KVM is reporting as a machine that should have that MSR. It's probably more or less kosher to freak all the way out when an unknown MSR is written. The (2.6.25) guest kernel code is supposed to detect the CPU models that don't have it. If KVM is reporting as hardware that does not have it but the guest kernel wrongly thinks it does have it, then it's the guest kernel's fault.

A bunch of fixes for kvm just went in. Can you try the latest rawhide as both guest and host and see if that works?

OK, on both host + guest running:
  kernel-2.6.25-0.121.rc5.git4.fc9.x86_64
and:
  kvm-63-2.fc9.x86_64
I got:
  kernel: kvm: 10897: cpu0 svm_set_msr: MSR_IA32_DEBUGCTLMSR 0x2, nop
and the testcase result code 1 (FAIL)

The problem still exists for host & guest:
  kernel-2.6.25-0.167.rc7.git2.fc9.x86_64
but this time on Intel Core2 T7200 (Lenovo T60)

Checked that kernel-2.6.25-0.170.rc7.git3.fc9 contains the code only for AMD:
  ./arch/x86/kvm/svm.c: pr_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
It was (most probably) checked before in Comment 3 on:
  Dual-Core AMD Opteron(tm) Processor 8220 SE
(I am unaware of where the detection Roland wrote about in Comment 1 should be.)

Should be fixed in 2.6.25-final.

Still crashing on kernel-2.6.25-1.fc9.x86_64:
  kvm: 7805: cpu0 unhandled wrmsr: 0x1d9 data 2
  kvm: 7805: cpu0 unhandled wrmsr: 0x1d9 data 0
I will reopen it after a reboot to a more recent Rawhide kernel but this one is already 2.6.25-final.

Still crashing in kernel-2.6.25-8.fc9.x86_64.
(and I do not see a relevant changelog entry in kernel-2.6.25-14.fc9.x86_64 or kernel-2.6.25.1-1.fc10)

I don't think the MSR is unsupported on the host CPU -- KVM is just failing to implement it in the guest. It should not crash the guest and it looks like the code that went into SVM should also go into vmx.c. Or even better, since it's the same code, maybe it should just be in arch/x86/kvm/x86.c:kvm_[gs]et_msr_common() ??
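The proposal above, handling the write once in common code instead of per vendor, can be sketched in user space as follows. This is only an illustrative model, not the kernel's code: `set_msr_common`, `svm_set_msr_sketch`, `vmx_set_msr_sketch`, `msr_demo`, and the `msr_result` codes are hypothetical names, and the "log and ignore" behaviour is inferred from the `nop` message quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MSR_IA32_DEBUGCTLMSR 0x1d9u /* MSR index from the bug summary */

/* Hypothetical result codes for this sketch; not kernel definitions. */
enum msr_result {
    MSR_HANDLED = 0,   /* write accepted (here: logged and dropped) */
    MSR_INJECT_GP = 1  /* caller would raise #GP(0) in the guest */
};

/* Vendor-neutral write handler: one place for both back ends. */
enum msr_result set_msr_common(uint32_t msr, uint64_t data)
{
    switch (msr) {
    case MSR_IA32_DEBUGCTLMSR:
        /* Log and ignore, mirroring the "nop" message quoted above. */
        fprintf(stderr, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
                __func__, (unsigned long long)data);
        return MSR_HANDLED;
    default:
        /* Unknown MSR: report it and let the caller inject #GP(0). */
        fprintf(stderr, "unhandled wrmsr: 0x%x data %llx\n",
                (unsigned)msr, (unsigned long long)data);
        return MSR_INJECT_GP;
    }
}

/* Hypothetical vendor entry points that defer to the common helper. */
enum msr_result svm_set_msr_sketch(uint32_t msr, uint64_t data)
{
    return set_msr_common(msr, data);
}

enum msr_result vmx_set_msr_sketch(uint32_t msr, uint64_t data)
{
    return set_msr_common(msr, data);
}

/* Usage: both back ends now treat the DEBUGCTL write identically. */
void msr_demo(void)
{
    assert(svm_set_msr_sketch(MSR_IA32_DEBUGCTLMSR, 0x2) == MSR_HANDLED);
    assert(vmx_set_msr_sketch(MSR_IA32_DEBUGCTLMSR, 0x2) == MSR_HANDLED);
    /* 0xdead is an arbitrary index chosen to hit the default branch. */
    assert(vmx_set_msr_sketch(0xdeadu, 0) == MSR_INJECT_GP);
}
```

With a shared helper like this, an Intel host would log the same `nop` line as an AMD host instead of reporting `unhandled wrmsr: 0x1d9`.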
This commit fixed the problem in SVM:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a2938c807024ba30191e3bd593430c0659d75717
It looks pretty straightforward to move the code from commit a2938c80 into the generic KVM code so both Intel and AMD processors handle this case...

It isn't straightforward. We need to see how Intel cpus handle last-branch-record virtualization. What can potentially be done is to allow writes to the MSR that don't turn on LBR, and only fail those that do.

Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Just a confirmation that the generated exception for the unsupported wrmsr types is right:
RDMSR http://www.cs.inf.ethz.ch/stricker/lab/doc/intel-part1.pdf
  #GP(0) If the value in ECX specifies a reserved or unimplemented MSR address.
WRMSR http://www.cs.inf.ethz.ch/stricker/lab/doc/intel-part2.pdf
  #GP(0) If the value in ECX specifies a reserved or unimplemented MSR address.
It is just that the guest kernel should not crash on an unsupported MSR register - it may happen for DEBUGCTLMSR=0x1d9 on real silicon i586 (->a different kernel Bug).

On qemu-system-x86_64 of qemu-0.9.1-6.fc9.x86_64 it just ignores the wrmsr instructions for unknown registers - it does not crash. The testcase
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/block-step.c?cvsroot=systemtap
will return exit code 2 as "unsupported". Still it is a qemu bug - as it does not support some essential MSR registers, a generated exception would stop the Linux kernel from booting:
  console [earlyser0] enabled
  end_pfn_map = 1048576
  PANIC: early exception 0d rip 10:ffffffff81468302 error 0 cr2 0
  Pid: 0, comm: swapper Not tainted 2.6.25.10-86.fc9.x86_64 #1
  Call Trace:
   [<ffffffff81475500>] ? add_active_range+0x39/0xef
   [<ffffffff81468302>] ? mtrr_bp_init+0xda/0x137
   [<ffffffff814675ff>] ? e820_end_of_ram+0x5c/0x6b
   [<ffffffff81465da9>] ? setup_arch+0x22d/0x4ee
   [<ffffffff8104dbcf>] ? clockevents_register_notifier+0x27/0x34
   [<ffffffff8145f935>] ? start_kernel+0x76/0x2f4
   [<ffffffff8145f1dc>] ? _sinittext+0x1dc/0x1e3
  RIP 0x10

Considering this Bug as the KVM RFE for the DEBUGCTLMSR=0x1d9 support. Going to open another Bug for the ptrace detection of whether DEBUGCTLMSR=0x1d9 is supported by the underlying hardware.

Re-assigning kvm.ko bugs to the kvm package for easier tracking

Just FYI; in current F-11 kvm, this block-step program no longer causes a guest crash. It now causes:
  kvm: 7962: cpu1 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x2, nop
to be printed on the host dmesg. Additionally, the block-step program inside the guest now has a return code of 2. Is that sufficient to address this BZ, or are you asking for full LBR virtualization?
Chris Lalancette

No, block-step should return code 0 as everything working or 1 due to EIO. Return code 2 is not acceptable for a bug-free kernel.
Bug 456175 Comment 1 by Roland McGrath:
> There is no x86-64 hardware without debugctlmsr, so that is just a kvm issue.
[...]
> The existing code (now upstream) checks >= 6 against the same number that's
> shown in "cpu family". So that check would not let the K6 try it, and
> PTRACE_SINGLEBLOCK would get EIO.
>
> The model check is compiled away by CONFIG_X86_DEBUGCTLMSR.
[...]
Therefore I assume Roland McGrath (Cc'ed) would not accept an x86_64 runtime model check, which would be there only for KVM guests, as any real x86_64 hardware supports debugctlmsr.

The upstream x86 kernel maintainers can decide if the CONFIG_X86_DEBUGCTLMSR and/or arch_has_block_step() criteria should change. AFAIK the existing definitions are the right criteria for real hardware. If KVM folks want the kernel to use new criteria specially tailored for how KVM differs from real hardware, they should take that issue upstream.
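The criteria quoted above can be modelled in a few lines. This is a hedged sketch of the decision as described in the quoted comment, not the kernel's actual `arch_has_block_step()` definition: the function names and the boolean parameter standing in for `CONFIG_X86_DEBUGCTLMSR` are hypothetical, and the family values in the usage function are illustrative inputs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Models the quoted criteria. config_debugctlmsr stands in for a build
 * with CONFIG_X86_DEBUGCTLMSR set, in which case the model check is
 * compiled away; otherwise the "cpu family" number is compared with 6.
 */
bool has_block_step_sketch(bool config_debugctlmsr, unsigned cpu_family)
{
    if (config_debugctlmsr)
        return true;          /* no runtime check at all */
    return cpu_family >= 6;   /* runtime check on "cpu family" */
}

/* errno a PTRACE_SINGLEBLOCK request would see under these criteria. */
int singleblock_errno_sketch(bool config_debugctlmsr, unsigned cpu_family)
{
    return has_block_step_sketch(config_debugctlmsr, cpu_family) ? 0 : EIO;
}

/* Usage: a family-5 CPU gets EIO unless the check is compiled away. */
void block_step_demo(void)
{
    assert(singleblock_errno_sketch(false, 5) == EIO);
    assert(singleblock_errno_sketch(false, 6) == 0);
    assert(singleblock_errno_sketch(true, 5) == 0);
}
```

The point of the sketch is that neither branch consults the hypervisor: a KVM guest that reports an x86_64 CPU passes the check even when the host drops the MSR write.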
Sounds like the issue applies to F11 too; setting version to rawhide

This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

This message is a reminder that Fedora 11 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 11. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 11 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Verified block-step return code 2 on:
  host+guest kernel-2.6.32.11-99.fc12.x86_64
  host qemu-system-x86-0.11.0-13.fc12.x86_64

This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

kernel-debug-2.6.35.6-50.fc14.x86_64
qemu-system-x86-0.12.5-1.fc13.x86_64
  kvm: 6881: cpu0 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x2, nop

Doesn't look like relevant kernel code has changed for a while, so moving to rawhide.

This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Is this still a problem with 3.9 based F19 kernels?

Yes; BTW the reproducer in Comment 0 is really simple to run.
host: kernel-3.8.4-202.fc18.x86_64
guest: kernel-3.8.4-202.fc18.x86_64

*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs. Fedora 19 has now been rebased to 3.11.1-200.fc19. Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.

kernel-3.11.1-200.fc19.x86_64
Testcase returns rc 2 in guest and host kernel reports:
  kvm [PID]: vcpu2 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x2, nop
It no longer crashes; PTRACE_SINGLEBLOCK just silently does the same thing as PTRACE_SINGLESTEP, that is, PTRACE_SINGLEBLOCK cannot be used in KVM. A better fix would be to support PTRACE_SINGLEBLOCK even in KVM, but the crash is fixed, therefore closing this bug.
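To see why an ignored write of 0x2 makes block-stepping behave like single-stepping, it helps to decode the value. In IA32_DEBUGCTL, bit 0 is LBR (last-branch recording) and bit 1 is BTF (single-step on branches), so 0x2 requests BTF only; if the host drops that write, the trap flag still fires after every instruction. The sketch below decodes those two bits and also models the policy suggested earlier in this thread (accept writes that leave LBR off, fail those that turn it on). All function names are hypothetical, and the policy function illustrates that suggestion only, not what any kernel release implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEBUGCTL_LBR (UINT64_C(1) << 0) /* last-branch recording */
#define DEBUGCTL_BTF (UINT64_C(1) << 1) /* trap on branches, not every insn */

bool debugctl_wants_lbr(uint64_t data)
{
    return (data & DEBUGCTL_LBR) != 0;
}

bool debugctl_wants_btf(uint64_t data)
{
    return (data & DEBUGCTL_BTF) != 0;
}

/*
 * Sketch of the suggested policy: return 0 to accept the write,
 * 1 to fail it (the caller would then inject #GP(0) into the guest).
 */
int debugctl_write_policy(uint64_t data)
{
    return debugctl_wants_lbr(data) ? 1 : 0;
}

/* Usage: the 0x2 value from the host log asks for BTF without LBR. */
void debugctl_demo(void)
{
    assert(debugctl_wants_btf(0x2) && !debugctl_wants_lbr(0x2));
    assert(debugctl_write_policy(0x2) == 0); /* accepted under this policy */
    assert(debugctl_write_policy(0x1) == 1); /* LBR request would fail */
}
```

Accepting the write is not the same as honouring it: for PTRACE_SINGLEBLOCK to work in a guest, the BTF bit would also have to take effect while the guest runs.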