| Summary: | Call trace found in guest dmesg after cpu hotplug in rt kernel | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | xiywang |
| Component: | kernel-rt | Assignee: | pagupta |
| kernel-rt sub component: | KVM | QA Contact: | Virtualization Bugs <virt-bugs> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | medium | ||
| Priority: | unspecified | CC: | bhu, hhuang, juzhang, knoel, virt-maint, xfu, xiywang |
| Version: | 7.3 | ||
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-29 07:25:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
xiywang
2016-03-23 08:55:43 UTC
Hello,
I tried to look at stacktrace and the code.
kernel cmdline: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-366.rt56.243.el7.x86_64 root=/dev/mapper/rhel_dhcp--66--106--180-root ro crashkernel=auto rd.lvm.lv=rhel_dhcp-66-106-180/root rd.lvm.lv=rhel_dhcp-66-106-180/swap rhgb quiet LANG=en_US.UTF-8 isolcpus=1 nohz_full=1 <-----here
I found this error message, which is printed by can_stop_full_tick():
"NO_HZ FULL will not work with unstable sched clock"
static bool can_stop_full_tick(void)
{
        ...
        if (!sched_clock_stable()) {
                trace_tick_stop(0, "unstable sched clock\n");
                /*
                 * Don't allow the user to think they can get
                 * full NO_HZ with this machine.
                 */
                WARN_ONCE(tick_nohz_full_running,
                          "NO_HZ FULL will not work with unstable sched clock");
                /* Looks like it is returning here because of the unstable clock source. */
                return false;
        }
        ...
}
Call chain:
tick_nohz_irq_exit
  tick_nohz_full_stop_tick
    can_stop_full_tick
It looks like the system might not behave properly because the tsc is unstable and 'nohz_full' is enabled. We have two options here:
1) Disable nohz_full and test the cpu hotplug.
2) Test this on another machine which has a stable tsc.
Best regards,
Pankaj
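
For option 1 above, one quick way to confirm whether a guest still boots with nohz_full is to inspect its kernel command line. The following is a minimal Python sketch and not part of the original report: the helper name cmdline_params is hypothetical, and the sample string is shortened from the guest command line quoted in this comment.

```python
def cmdline_params(cmdline):
    """Split a kernel command line into a dict of parameters.

    Flags without '=' (for example 'ro' or 'quiet') map to None.
    If a key repeats (for example 'console='), the last value wins.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Shortened from the guest command line quoted in this report.
sample = ("BOOT_IMAGE=/vmlinuz-3.10.0-366.rt56.243.el7.x86_64 ro "
          "crashkernel=auto rhgb quiet isolcpus=1 nohz_full=1")

params = cmdline_params(sample)
if "nohz_full" in params:
    print("nohz_full is set for CPUs:", params["nohz_full"])
else:
    print("nohz_full is not set")
```

On a live guest the same helper could be fed the contents of /proc/cmdline instead of the sample string.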
(In reply to pagupta from comment #2)

I tried without the nohz_full param, and cpu hotplug also has some issue. Please check the details below:

host:
kernel-rt: 3.10.0-513.rt56.419.el7.x86_64
cmdline:
# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-513.rt56.419.el7.x86_64 root=/dev/mapper/rhel_hp--dl385pg8--01-root ro crashkernel=auto rd.lvm.lv=rhel_hp-dl385pg8-01/root rd.lvm.lv=rhel_hp-dl385pg8-01/swap rhgb quiet LANG=en_US.UTF-8 isolcpus=0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 default_hugepagesz=1G hugepagesz=1G hugepages=16

guest:
kernel-rt: 3.10.0-513.rt56.419.el7.x86_64
cmdline:
# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-513.rt56.419.el7.x86_64 root=/dev/mapper/rhel_dhcp--66--106--180-root ro crashkernel=auto rd.lvm.lv=rhel_dhcp-66-106-180/root rd.lvm.lv=rhel_dhcp-66-106-180/swap rhgb quiet LANG=en_US.UTF-8 console=tty0 console=ttyS0,115200 isolcpus=1 intel_pstate=disable nosoftlockup

steps:
1. boot a guest with the same command line listed in comment 1
2. cat /proc/cpuinfo in guest
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 1
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 2
siblings : 1
core id : 0
cpu cores : 1
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 3
siblings : 1
core id : 0
cpu cores : 1
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

3. cpu hotplug in host
(qemu) cpu-add 4
4. cat /proc/cpuinfo in guest, only one cpu added and no more error message displayed
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 1
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 2
siblings : 1
core id : 0
cpu cores : 1
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 3
siblings : 1
core id : 0
cpu cores : 1
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 4
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 4
siblings : 1
core id : 0
cpu cores : 1
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

#dmesg
...
[ 281.821259] smpboot: APIC(4) Converting physical 4 to logical package 4
[ 281.821372] CPU4 has been hot-added
[ 281.840403] smpboot: Booting Node 0 Processor 4 APIC 0x4
[ 0.002000] kvm-clock: cpu 4, msr 1:3ff86101, secondary cpu clock
[ 282.104112] KVM setup async PF for cpu 4
[ 282.104120] kvm-stealtime: cpu 4, msr 13b210200
[ 282.104173] Will online and init hotplugged CPU: 4

Hello Xiyue,

Thanks for checking without the 'nohz_full' parameter for the unstable tsc. It looks like the earlier issue is gone and vCPU hotplug succeeded. The new vCPU got hotplugged, but there is some issue in onlining it. I am trying to understand what's happening here.

There are some known issues with vCPU hotplug on RT. The CPU hotplug infrastructure is being restructured upstream based on these known issues. It is a big chunk of work which will take some time to stabilize.

Meanwhile, can you please check if this issue happens with the non-rt kernel as well?

Thanks,
Pankaj

Hi Pankaj,

Here's the result of cpu hotplug on the non-rt kernel. It looks like it worked well.

guest: 3.10.0-514.el7.x86_64

1. boot a guest with the command listed in comment 1
2. cpu hotplug in host hmp
(qemu) cpu-add 4
(qemu) cpu-add 5
(qemu) cpu-add 6
(qemu) cpu-add 7
(qemu) cpu-add 8
(qemu) cpu-add 9
3. cat /proc/cpuinfo in guest
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 1
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 2
siblings : 1
core id : 0
cpu cores : 1
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 3
siblings : 1
core id : 0
cpu cores : 1
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 4
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 4
siblings : 1
core id : 0
cpu cores : 1
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 5
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 5
siblings : 1
core id : 0
cpu cores : 1
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 6
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 6
siblings : 1
core id : 0
cpu cores : 1
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 7
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 7
siblings : 1
core id : 0
cpu cores : 1
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 8
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 8
siblings : 1
core id : 0
cpu cores : 1
apicid : 8
initial apicid : 8
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

processor : 9
vendor_id : AuthenticAMD
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge)
stepping : 9
microcode : 0x1000065
cpu MHz : 2294.248
cache size : 512 KB
physical id : 9
siblings : 1
core id : 0
cpu cores : 1
apicid : 9
initial apicid : 9
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm art nopl pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm
bogomips : 4588.49
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:

4. dmesg in guest
...
[ 56.600929] smpboot: APIC(4) Converting physical 4 to logical package 4
[ 56.601058] CPU4 has been hot-added
[ 57.212088] smpboot: Booting Node 0 Processor 4 APIC 0x4
[ 0.002000] kvm-clock: cpu 4, msr 1:3ff86101, secondary cpu clock
[ 57.470961] KVM setup async PF for cpu 4
[ 57.470970] kvm-stealtime: cpu 4, msr 13b20f3c0
[ 57.471045] Will online and init hotplugged CPU: 4
[ 124.731848] smpboot: APIC(5) Converting physical 5 to logical package 5
[ 124.731936] CPU5 has been hot-added
[ 125.117716] smpboot: Booting Node 0 Processor 5 APIC 0x5
[ 0.002000] kvm-clock: cpu 5, msr 1:3ff86141, secondary cpu clock
[ 125.375523] KVM setup async PF for cpu 5
[ 125.375533] kvm-stealtime: cpu 5, msr 13b28f3c0
[ 125.375575] Will online and init hotplugged CPU: 5
[ 143.525151] smpboot: APIC(6) Converting physical 6 to logical package 6
[ 143.525233] CPU6 has been hot-added
[ 143.546873] smpboot: Booting Node 0 Processor 6 APIC 0x6
[ 0.002000] kvm-clock: cpu 6, msr 1:3ff86181, secondary cpu clock
[ 143.804749] KVM setup async PF for cpu 6
[ 143.804757] kvm-stealtime: cpu 6, msr 13b30f3c0
[ 143.804795] Will online and init hotplugged CPU: 6
[ 147.908971] smpboot: APIC(7) Converting physical 7 to logical package 7
[ 147.909080] CPU7 has been hot-added
[ 148.331485] smpboot: Booting Node 0 Processor 7 APIC 0x7
[ 0.002000] kvm-clock: cpu 7, msr 1:3ff861c1, secondary cpu clock
[ 148.587843] KVM setup async PF for cpu 7
[ 148.587853] kvm-stealtime: cpu 7, msr 13b38f3c0
[ 148.587904] Will online and init hotplugged CPU: 7
[ 149.788467] smpboot: APIC(8) Converting physical 8 to logical package 8
[ 149.788545] CPU8 has been hot-added
[ 149.806692] smpboot: Booting Node 0 Processor 8 APIC 0x8
[ 0.002000] kvm-clock: cpu 8, msr 1:3ff86201, secondary cpu clock
[ 150.059104] KVM setup async PF for cpu 8
[ 150.059113] kvm-stealtime: cpu 8, msr 13b40f3c0
[ 150.059178] Will online and init hotplugged CPU: 8
[ 169.158180] smpboot: APIC(9) Converting physical 9 to logical package 9
[ 169.158253] CPU9 has been hot-added
[ 169.617825] smpboot: Booting Node 0 Processor 9 APIC 0x9
[ 0.002000] kvm-clock: cpu 9, msr 1:3ff86241, secondary cpu clock
[ 169.873596] KVM setup async PF for cpu 9
[ 169.873605] kvm-stealtime: cpu 9, msr 13b48f3c0
[ 169.873679] Will online and init hotplugged CPU: 9

The result with the RT kernel after 'cpu-add 4' also looks fine without the nohz_full param in the cmdline, since 'cpu-add id' is used to add the vCPU with id x, not to add x vCPUs.

But is it OK to remove the nohz_full param? I mean, it's added to the cmdline by default after activating tuned-profiles-realtime. Should we always manually remove this param when testing RT kernel related functions?

Thanks
-Xiyue

Good, it's working for RT as well.

I think the configuration which we use for the KVM RT guest profile doesn't use 'nohz_full' for the guest. As per my understanding, the reason is that we run 'cyclictest' (sched_fifo) and 'stress' (sched_other) on the same isolated CPU (which is the realtime CPU). 'nohz_full' would be effective only if there is a single task running.

I also found the commit below.
https://github.com/jeremyeder/tuned-profiles-realtime/commit/bb0404695d608c884a9235a38215aa9a337af66e

Could you please check which version of the tuned guest profile you are using, or maybe try the latest version?

The reason for not using nohz_full in this BZ is **unstable tsc**, which will fail or give undesired results. On a normal system with a stable tsc source, 'nohz_full' should let the system work, but latency will be on the higher side.

Best regards,
Pankaj

Hi Pankaj,

Checked the tuned guest profile I'm using and it turns out it's quite old:

[root@guest ~]# rpm -qa | grep tuned-profiles-realtime
tuned-profiles-realtime-2.5.1-4.el7.noarch

I updated to the newest profile tuned-profiles-realtime-2.7.1-3.el7.noarch. After activating the new profile, nohz_full no longer appears in the guest. So the issue mentioned above is resolved.

I have another question: In what situation will 'unstable tsc' be detected?
'Cause the testing host I'm using is quite normal and I don't understand how to check whether the tsc time source is stable or not.

Thanks,
Xiyue

(In reply to xiywang from comment #7)
> Hi Pankaj,
>
> Checked the tuned guest profile I'm using and it turns out it's quite old:
>
> [root@guest ~]# rpm -qa | grep tuned-profiles-realtime
> tuned-profiles-realtime-2.5.1-4.el7.noarch
>
> I updated to the newest profile tuned-profiles-realtime-2.7.1-3.el7.noarch.
> After activating the new profile, nohz_full no longer appears in the guest.
> So the issue mentioned above is resolved.

ok, good.

>
> I have another question: In what situation will 'unstable tsc' be detected?

It gets detected when the tsc clock source is not stable. The kernel depends on the clock source for a lot of its activities. At initial boot and later as well, the kernel tries to calibrate the clock source (a complex algorithm) to account for some delta. If the kernel is not able to calibrate the source, it assumes there might be something wrong on the hardware side and marks the clocksource 'unstable'.

> 'Cause the testing host I'm using is quite normal and I don't understand how
> to check whether the tsc time source is stable or not.

You can easily check in the 'dmesg' logs.

>
> Thanks,
> Xiyue
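
As an illustration of the dmesg check suggested above, here is a minimal Python sketch. It is not from the original thread: the helper name find_unstable_clock_lines is hypothetical, the sample log lines are illustrative rather than taken from the reported system, and the matching rule (the word "unstable" together with "tsc" or "clock") is an assumption, since the exact kernel wording varies between versions.

```python
def find_unstable_clock_lines(dmesg_text):
    """Return dmesg lines that mention an unstable TSC or clock.

    Assumption: relevant kernel messages contain "unstable" together
    with "tsc" or "clock"; the exact wording depends on the kernel.
    """
    hits = []
    for line in dmesg_text.splitlines():
        lower = line.lower()
        if "unstable" in lower and ("tsc" in lower or "clock" in lower):
            hits.append(line.strip())
    return hits


# Illustrative input only; these lines are not from the reported system.
sample_dmesg = """\
[    0.000000] tsc: Detected 2294.248 MHz processor
[   12.345678] Clocksource tsc unstable (delta = 123456789 ns)
[   12.345700] NO_HZ FULL will not work with unstable sched clock
[   13.000000] CPU4 has been hot-added
"""

for hit in find_unstable_clock_lines(sample_dmesg):
    print(hit)
```

On a live system the text could come from the output of the dmesg command, and the clocksource currently in use can be read from /sys/devices/system/clocksource/clocksource0/current_clocksource.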