Bug 1184155 - Dynamic tickless feature not working in RHEL7 KVM guest
Summary: Dynamic tickless feature not working in RHEL7 KVM guest
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.2
Hardware: All
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Marcelo Tosatti
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Keywords:
Depends On: 1193629
Blocks: 1133060
TreeView+ depends on / blocked
 
Reported: 2015-01-20 17:05 UTC by Sunil Choudhary
Modified: 2015-11-19 21:15 UTC (History)
13 users (show)

(edit)
Clone Of:
(edit)
Last Closed: 2015-11-19 21:15:47 UTC


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:2152 normal SHIPPED_LIVE Important: kernel security, bug fix, and enhancement update 2015-11-20 00:56:02 UTC

Description Sunil Choudhary 2015-01-20 17:05:17 UTC
Description of problem:

Dynamic tickless feature is not working in RHEL 7 KVM guest.

Implement dynamic tickless option in RHEL7 host and KVM guest as per our below document.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Configuring_kernel_tick_time

This configuration is successfully applied on the host OS and local timer interrupt is reduced to about 1 Hz on adaptive tickless CPUs.
However this configuration takes no effect on guest OS and local timer interrupts remains at about 1000 Hz.

Version-Release number of selected component (if applicable):

kernel-3.10.0-123.el7.x86_64

How reproducible:

Always

Steps to Reproduce:

[root@rhel7-vm01 ~]# uname -a
Linux rhel7-vm01.example.com 3.10.0-123.el7.x86_64 #1 SMP Mon May 5 11:16:57 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

[root@rhel7-vm01 ~]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-123.el7.x86_64 root=UUID=7052cbac-b8bf-4792-b277-cad80005e177 ro rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap vconsole.font=latarcyrheb-sun16 vconsole.keymap=us nohz_full=1-3 isolcpus=1-3    

System Information
        Manufacturer: Red Hat
        Product Name: KVM
        Version: RHEL 7.0.0 PC (i440FX + PIIX, 1996)
        Serial Number: Not Specified
        UUID: 76582AC7-A8F5-4B75-943F-F64626D3FF5F
        Wake-up Type: Power Switch
        SKU Number: Not Specified
        Family: Red Hat Enterprise Linux

[root@rhel7-vm01 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [2268] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [2268] successful run completed in 1s

 Performance counter stats for 'taskset -c 1 stress -t 1 -c 1':

             1,000 irq_vectors:local_timer_entry                                   

       1.001219589 seconds time elapsed

Actual results:

Local timer interrupts remain at around 1000

Expected results:

Local timer interrupts should reduce to around 1

Comment 6 Luiz Capitulino 2015-04-24 13:37:55 UTC
Patch posted upstream:

http://marc.info/?l=kvm&m=142982357227117&w=2

Comment 8 Rafael Aquini 2015-06-09 15:13:07 UTC
Patch(es) available on kernel-3.10.0-262.el7

Comment 10 ShupingCui 2015-06-15 06:40:33 UTC
Steps to Reproduce:

# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-123.el7.x86_64 root=/dev/mapper/rhel_hp--z220--01-root ro rd.lvm.lv=rhel_hp-z220-01/root biosdevname=0 console=tty0 reboot=pci console=ttyS0,115200 crashkernel=auto rd.lvm.lv=rhel_hp-z220-01/swap rhgb quiet LANG=en_US.UTF-8 systemd.debug

# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [11093] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [11093] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

             1,000      irq_vectors:local_timer_entry                                   

       1.000820275 seconds time elapsed


Verify on 3.10.0-262.el7.x86_64
# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-262.el7.x86_64 root=/dev/mapper/rhel_hp--z220--01-root ro rd.lvm.lv=rhel_hp-z220-01/root biosdevname=0 console=tty0 reboot=pci console=ttyS0,115200 crashkernel=auto rd.lvm.lv=rhel_hp-z220-01/swap rhgb quiet LANG=en_US.UTF-8 systemd.debug nohz_full=1-7 isolcpus=1-7

# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [11153] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [11153] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                 5      irq_vectors:local_timer_entry                                   

       1.000817904 seconds time elapsed


But cannot see 1 tick, please correct me if I'm wrong.


Host cpu infos:
processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping	: 9
microcode	: 0x1b
cpu MHz		: 1817.804
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips	: 6784.67
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:


Thanks,
Shuping

Comment 11 Luiz Capitulino 2015-06-15 15:36:02 UTC
I'll let Marcelo give the final answer on this, but although I can't tell  why local_timer_entry() is being called 5 times per second, I think that the requested feature _is_ working as we're down from 1,000 periodic calls to a few sporadic calls. Maybe we could open a new BZ to debug those calls, and move this one forward. But that's small issue, IMHO.

Also, I have a few more comments:

1. Keep in mind that even when full tickless is taking place, the kernel will setup a minimum of 1 timer interrupt per second. This is so because the scheduler must maintain some stats

2. When testing Marcelo fixes a few months ago, I'd see that KVM's timer function in the host (which emulates the tick) would be called twice a second (instead of once, as explained above). I haven't debduged this yet, but I don't think it's a huge issue either

3. Maybe local_timer_entry() is being called due to timers being setup in that CPU?

Comment 14 Marcelo Tosatti 2015-07-16 20:21:34 UTC
(In reply to Luiz Capitulino from comment #11)
> I'll let Marcelo give the final answer on this, but although I can't tell 
> why local_timer_entry() is being called 5 times per second, I think that the
> requested feature _is_ working as we're down from 1,000 periodic calls to a
> few sporadic calls. Maybe we could open a new BZ to debug those calls, and
> move this one forward. But that's small issue, IMHO.
> 
> Also, I have a few more comments:
> 
> 1. Keep in mind that even when full tickless is taking place, the kernel
> will setup a minimum of 1 timer interrupt per second. This is so because the
> scheduler must maintain some stats
> 
> 2. When testing Marcelo fixes a few months ago, I'd see that KVM's timer
> function in the host (which emulates the tick) would be called twice a
> second (instead of once, as explained above). I haven't debduged this yet,
> but I don't think it's a huge issue either
> 
> 3. Maybe local_timer_entry() is being called due to timers being setup in
> that CPU?

stress uses "alarm" and "nanosleep" syscalls, both of them make use of timers.

Comment 15 mazhang 2015-07-27 07:59:45 UTC
Test this bug on 3.10.0-261.el7.x86_64 and 3.10.0-300.el7.x86_64

Result:
3.10.0-261.el7.x86_64

[root@dhcp-11-16 ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-261.el7.x86_64 root=/dev/mapper/rhel_dhcp--11--16-root ro crashkernel=auto rd.lvm.lv=rhel_dhcp-11-16/root rd.lvm.lv=rhel_dhcp-11-16/swap nohz_full=1-7 isolcpus=1-7 LANG=en_US.UTF-8
[root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [10808] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [10808] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                 6      irq_vectors:local_timer_entry

       1.023972406 seconds time elapsed

[root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [14390] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [14390] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                12      irq_vectors:local_timer_entry

       1.000963412 seconds time elapsed

[root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [14426] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [14426] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                 8      irq_vectors:local_timer_entry

       1.000885183 seconds time elapsed


3.10.0-300.el7.x86_64

[root@dhcp-11-16 ~]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-300.el7.x86_64 root=/dev/mapper/rhel_dhcp--11--16-root ro crashkernel=auto rd.lvm.lv=rhel_dhcp-11-16/root rd.lvm.lv=rhel_dhcp-11-16/swap nohz_full=1-7 isolcpus=1-7 LANG=en_US.UTF-8
[root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [14689] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [14689] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                10      irq_vectors:local_timer_entry                                   

       1.001913108 seconds time elapsed

[root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [14692] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [14692] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                 7      irq_vectors:local_timer_entry                                   

       1.000826672 seconds time elapsed

[root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [14695] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [14695] successful run completed in 1s

 Performance counter stats for 'CPU(s) 1':

                 9      irq_vectors:local_timer_entry                                   

       1.000824091 seconds time elapsed


Seems there are no difference between 3.10.0-261.el7.x86_64 and 3.10.0-300.el7.x86_64.

Marcelo, please correct me if the test steps is wrong.

Thanks,
Mazhang.

Comment 16 mazhang 2015-07-27 08:00:37 UTC
Host info:
processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping	: 9
microcode	: 0x1b
cpu MHz		: 2442.421
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips	: 6784.48
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

[root@dhcp-11-16 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           7766         300        6903           8         562        7234
Swap:          7935           0        7935

Comment 17 Marcelo Tosatti 2015-07-27 16:37:35 UTC
(In reply to mazhang from comment #15)
> Test this bug on 3.10.0-261.el7.x86_64 and 3.10.0-300.el7.x86_64
> 
> Result:
> 3.10.0-261.el7.x86_64
> 
> [root@dhcp-11-16 ~]# cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-3.10.0-261.el7.x86_64
> root=/dev/mapper/rhel_dhcp--11--16-root ro crashkernel=auto
> rd.lvm.lv=rhel_dhcp-11-16/root rd.lvm.lv=rhel_dhcp-11-16/swap nohz_full=1-7
> isolcpus=1-7 LANG=en_US.UTF-8
> [root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset
> -c 1 stress -t 1 -c 1
> stress: info: [10808] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
> stress: info: [10808] successful run completed in 1s
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>                  6      irq_vectors:local_timer_entry
> 
>        1.023972406 seconds time elapsed
> 
> [root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset
> -c 1 stress -t 1 -c 1
> stress: info: [14390] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
> stress: info: [14390] successful run completed in 1s
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>                 12      irq_vectors:local_timer_entry
> 
>        1.000963412 seconds time elapsed
> 
> [root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset
> -c 1 stress -t 1 -c 1
> stress: info: [14426] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
> stress: info: [14426] successful run completed in 1s
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>                  8      irq_vectors:local_timer_entry
> 
>        1.000885183 seconds time elapsed
> 
> 
> 3.10.0-300.el7.x86_64
> 
> [root@dhcp-11-16 ~]# cat /proc/cmdline 
> BOOT_IMAGE=/vmlinuz-3.10.0-300.el7.x86_64
> root=/dev/mapper/rhel_dhcp--11--16-root ro crashkernel=auto
> rd.lvm.lv=rhel_dhcp-11-16/root rd.lvm.lv=rhel_dhcp-11-16/swap nohz_full=1-7
> isolcpus=1-7 LANG=en_US.UTF-8
> [root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset
> -c 1 stress -t 1 -c 1
> stress: info: [14689] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
> stress: info: [14689] successful run completed in 1s
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>                 10      irq_vectors:local_timer_entry                       
> 
> 
>        1.001913108 seconds time elapsed
> 
> [root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset
> -c 1 stress -t 1 -c 1
> stress: info: [14692] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
> stress: info: [14692] successful run completed in 1s
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>                  7      irq_vectors:local_timer_entry                       
> 
> 
>        1.000826672 seconds time elapsed
> 
> [root@dhcp-11-16 ~]# perf stat -C 1 -e irq_vectors:local_timer_entry taskset
> -c 1 stress -t 1 -c 1
> stress: info: [14695] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
> stress: info: [14695] successful run completed in 1s
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>                  9      irq_vectors:local_timer_entry                       
> 
> 
>        1.000824091 seconds time elapsed
> 
> 
> Seems there are no difference between 3.10.0-261.el7.x86_64 and
> 3.10.0-300.el7.x86_64.
> 
> Marcelo, please correct me if the test steps is wrong.
> 
> Thanks,
> Mazhang.

Yes it is correct. You might want to check the others vCPUs as well.

Comment 18 mazhang 2015-07-28 06:24:33 UTC
Reproduce this bug on kernel-3.10.0-123.el7.x86_64.

Host:
3.10.0-300.el7.x86_64
qemu-kvm-rhev-2.1.2-16.el7.x86_64

Guest:
3.10.0-123.el7.x86_64

Result:
# uname -r
3.10.0-123.el7.x86_64
# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-123.el7.x86_64 root=UUID=ded051b2-66ad-40f8-b601-509a061dd98a ro rd.lvm.lv=rhel_virtlab-66-84-114/root console=tty0 vconsole.keymap=us rd.lvm.lv=rhel_virtlab-66-84-114/swap console=ttyS0,115200 crashkernel=auto vconsole.font=latarcyrheb-sun16 biosdevname=0 nohz_full=1-3 isolcpus=1-3
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [3119] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [3119] successful run completed in 1s

 Performance counter stats for 'taskset -c 1 stress -t 1 -c 1':

               999 irq_vectors:local_timer_entry

       1.001063742 seconds time elapsed


Update guest kernel to 3.10.0-300.el7.x86_64 and re-test.

Result:
# uname -r
3.10.0-300.el7.x86_64
# cat /proc/cmdline                          
BOOT_IMAGE=/vmlinuz-3.10.0-300.el7.x86_64 root=/dev/mapper/rhel_virtlab--66--84--114-root ro rd.lvm.lv=rhel_virtlab-66-84-114/root console=tty0 vconsole.keymap=us rd.lvm.lv=rhel_virtlab-66-84-114/swap console=ttyS0,115200 crashkernel=auto vconsole.font=latarcyrheb-sun16 biosdevname=0 nohz_full=1-3 isolcpus=1-3 LANG=en_US.UTF-8
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
stress: info: [3010] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [3010] successful run completed in 1s

 Performance counter stats for 'taskset -c 1 stress -t 1 -c 1':

                 7 irq_vectors:local_timer_entry

       1.025816153 seconds time elapsed

This bug has been fixed.

Comment 19 errata-xmlrpc 2015-11-19 21:15:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-2152.html


Note You need to log in before you can comment on or make changes to this bug.