Bug 788562
| Field | Value |
|---|---|
| Summary | kvm guest hangs when hot-plugged vcpu is onlined due to uninitialized hv_clock |
| Product | Red Hat Enterprise Linux 6 |
| Component | kernel |
| Version | 6.3 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Reporter | Igor Mammedov <imammedo> |
| Assignee | Igor Mammedov <imammedo> |
| QA Contact | Virtualization Bugs <virt-bugs> |
| CC | chayang, drjones, dyuan, juzhang, kzhang, lersek, mzhan, shuang, sluo, uobergfe, xfu, ydu, yunzheng, yupzhang |
| Target Milestone | rc |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | kernel-2.6.32-235.el6 |
| Doc Type | Bug Fix |
| Last Closed | 2012-06-20 08:23:35 UTC |

Description
Igor Mammedov
2012-02-08 13:38:35 UTC
Correct link to upstream fix: http://www.spinics.net/lists/kvm/msg68054.html

Created attachment 560267
introduce x86_cpuinit.early_percpu_clock_init hook

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

can not test it, because it's blocked by bug 562886, need cmd set_cpus to run cpu hotplug. ack it first.

I realized this is a public BZ, so I'm inlining (and condensing) target of the previously added link:

Before the patch:

  start_secondary  [arch/x86/kernel/smpboot.c]
    smp_callin
      smp_store_cpu_info
        identify_secondary_cpu  [arch/x86/kernel/cpu/common.c]
          mtrr_ap_init  [arch/x86/kernel/cpu/mtrr/main.c]
            set_mtrr_from_inactive_cpu
              stop_machine_from_inactive_cpu  [kernel/stop_machine.c]
                queue_stop_cpus_work
                  cpu_stop_queue_work
                    wake_up_process  [kernel/sched.c]
                      try_to_wake_up
                        activate_task
                          enqueue_task
                            update_rq_clock
                              sched_clock_cpu  [kernel/sched_clock.c]
                                sched_clock_local
                                  sched_clock  [arch/x86/kernel/tsc.c]
                                    paravirt_sched_clock  [arch/x86/include/asm/paravirt.h]
                                      kvm_clock_read  [arch/x86/kernel/kvmclock.c] (1)
                                        pvclock_clocksource_read  [arch/x86/kernel/pvclock.c]
                                          pvclock_get_nsec_offset()  <-- access to uninited clock sets "last_value" to huge value
    kvm_setup_secondary_clock()  [arch/x86/kernel/kvmclock.c] (2)
      kvm_register_clock()
      setup_secondary_APIC_clock  [arch/x86/kernel/apic/apic.c]
        setup_APIC_timer()

(1) via "pv_time_ops.sched_clock", set by kvmclock_init() in [arch/x86/kernel/kvmclock.c]
(2) via "x86_cpuinit.setup_percpu_clockev", set by kvmclock_init() in [arch/x86/kernel/kvmclock.c]

Patch:
- Adds new "early_percpu_clock_init" hook member to "x86_cpuinit_ops" struct type.
- New "x86_cpuinit.early_percpu_clock_init" defaults to x86_init_noop().
- kvmclock_init() overrides the new hook to kvm_setup_secondary_clock(), *leaves* old hook ("setup_percpu_clockev") at the default setup_secondary_APIC_clock().
- The patch removes the setup_secondary_APIC_clock() invocation from kvm_setup_secondary_clock().
- start_secondary() calls the new hook (x86_cpuinit.early_percpu_clock_init) before smp_callin().

New call tree on the bare metal:
- the new hook defaults to no-op.
- the patch doesn't change how the pre-existent hook is set up on the bare-metal.

New call tree in KVM guest:

  start_secondary  [arch/x86/kernel/smpboot.c]
    kvm_setup_secondary_clock  [arch/x86/kernel/kvmclock.c] (1)
      kvm_register_clock
    smp_callin  [arch/x86/kernel/smpboot.c]
      smp_store_cpu_info
        identify_secondary_cpu  [arch/x86/kernel/cpu/common.c]
          mtrr_ap_init  [arch/x86/kernel/cpu/mtrr/main.c]
            set_mtrr_from_inactive_cpu
              stop_machine_from_inactive_cpu  [kernel/stop_machine.c]
                queue_stop_cpus_work
                  cpu_stop_queue_work
                    wake_up_process  [kernel/sched.c]
                      try_to_wake_up
                        activate_task
                          enqueue_task
                            update_rq_clock
                              sched_clock_cpu  [kernel/sched_clock.c]
                                sched_clock_local
                                  sched_clock  [arch/x86/kernel/tsc.c]
                                    paravirt_sched_clock  [arch/x86/include/asm/paravirt.h]
                                      kvm_clock_read  [arch/x86/kernel/kvmclock.c]
                                        pvclock_clocksource_read  [arch/x86/kernel/pvclock.c]
                                          pvclock_get_nsec_offset()  <-- clock already inited
    setup_secondary_APIC_clock  [arch/x86/kernel/apic/apic.c] (2)
      setup_APIC_timer()

(1) via new early hook
(2) via preexistent hook which now has a different (= default) value.

Patch(es) available on kernel-2.6.32-235.el6

Reproduced this issue with kernel 2.6.32-220.el6.x86_64.

Steps to reproduce:

1. Boot a guest:

    /usr/libexec/qemu-kvm -M rhel6.2.0 -m 2048 \
      -smp 1,sockets=1,cores=1,threads=1,maxcpus=6 -enable-kvm \
      -uuid 4541c99e-efbe-4624-beb0-13ca5193fc79 -k en-us \
      -drive file=/home/RHEL-Server-6.2-64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=koTUXQrb,cache=none,werror=stop,rerror=stop,aio=native \
      -device ide-drive,bus=ide.0,unit=1,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 \
      -net none -monitor stdio -vnc :1 \
      -serial unix:/home/unix.socket,server,nowait

2. Hot-plug a vcpu via the monitor:

    (qemu) cpu_set 1 online

3. Check the guest.

Actual result: the guest hangs.

    (qemu) info status
    VM status: paused

Verified this issue with kernel 2.6.32-270.el6.x86_64 by repeating steps 1, 2 and 3.

Actual result: the guest works well and the vcpu hotplug succeeds, so this bug is fixed.

Moving to VERIFIED as per Comment #10

*** Bug 799180 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html

*** Bug 832946 has been marked as a duplicate of this bug. ***

*** Bug 831899 has been marked as a duplicate of this bug. ***

*** Bug 970968 has been marked as a duplicate of this bug. ***