=Comment: #0=================================================
Sripathi Kodi <sripathi.com> - 2008-02-19 06:17 EDT

We have seen that some config options in RH's 2.6.24-21.el5rt kernel are affecting latencies in our tests. We need these options changed to improve latencies.

Options CONFIG_CPU_FREQ and CONFIG_NO_HZ should be turned OFF for predictable latencies.

Also, with CONFIG_FAIR_GROUP_SCHED on, specJBB produced an oops. We have reported it to linux-rt-users with the subject line "Oops while running Specjbb on -rt kernel". We believe this code is not yet perfect. Hence we need CONFIG_FAIR_GROUP_SCHED turned OFF.

So we need:
CONFIG_CPU_FREQ to be turned OFF
CONFIG_NO_HZ to be turned OFF
CONFIG_FAIR_GROUP_SCHED to be turned OFF.

=Comment: #2=================================================
Sripathi Kodi <sripathi.com> - 2008-02-22 06:04 EDT

In addition to the above, we are analyzing the effect of CONFIG_NUMA and CONFIG_CPU_IDLE on the latencies. So in summary:

The following options should be turned OFF:
CONFIG_CPU_FREQ
CONFIG_NO_HZ
CONFIG_FAIR_GROUP_SCHED

CONFIG_RELOCATABLE should be turned ON. There is another bug, https://bugzilla.redhat.com/show_bug.cgi?id=432378 specifically for this option.
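For anyone wanting to verify a given kernel build against the requested settings, here is a minimal sketch in Python. It assumes the usual kernel config text format ("CONFIG_X=y" for enabled options, "# CONFIG_X is not set" for disabled ones). The names WANTED, parse_config and mismatches are made up for illustration and are not part of any existing tool.

```python
import re

# Requested states from the summary above: False = OFF, True = ON.
WANTED = {
    "CONFIG_CPU_FREQ": False,
    "CONFIG_NO_HZ": False,
    "CONFIG_FAIR_GROUP_SCHED": False,
    "CONFIG_RELOCATABLE": True,
}


def parse_config(text):
    """Return {option: value} for each 'CONFIG_X=value' line.

    Options that are commented out ('# CONFIG_X is not set') or missing
    are simply absent from the result.
    """
    opts = {}
    for line in text.splitlines():
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if m:
            opts[m.group(1)] = m.group(2)
    return opts


def mismatches(text, wanted=WANTED):
    """List the options whose state differs from the requested one."""
    opts = parse_config(text)
    bad = []
    for name, want_on in wanted.items():
        is_on = opts.get(name) in ("y", "m")  # built in or modular counts as ON
        if is_on != want_on:
            bad.append(name)
    return bad
```

To use it, read the config file of the kernel under test and pass its text to mismatches(). On many distributions that file is /boot/config-<kernel release>, but the location is an assumption and may differ on your system.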
------- Comment From jstultz.com 2008-02-26 22:02 EDT-------
Minor update on this: From the discussion on Monday, RH claimed CONFIG_FAIR_GROUP_SCHED would be disabled in a future release.
The good news is that we've turned off CONFIG_FAIR_GROUP_SCHED. The bad news (from your perspective) is that turning off CPU_FREQ and NO_HZ is not as easy a decision. If we were focusing strictly on latency then we could do this and default the system to idle=poll and we'd have the best latency we could. Unfortunately we cannot ignore the power savings that NO_HZ and the frequency governors buy us. So, I'd like to try and quantify the sorts of performance hits we're seeing with CPU_FREQ and NO_HZ, so that we can address those rather than tossing them out.

Clark
We'd like to get some more information concerning NO_HZ and CPU_FREQ. We've specifically seen workloads where latency was improved by booting with nohz=true (since it's disabled by default currently), so we'd like to see what kind of workload is being hurt by having NO_HZ code in place. Also, what settings did you have for the CPU_FREQ governors? Can you share any test code or can we try and duplicate the test conditions? Have you tried using a later 2.6.24 kernel than the one listed in the original report?
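To answer the governor question reproducibly, the active cpufreq governor of each CPU can be read from sysfs. Below is a rough sketch, assuming the standard cpufreq sysfs layout (cpuN/cpufreq/scaling_governor under /sys/devices/system/cpu). The function name read_governors is hypothetical.

```python
import glob
import os


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current governor.

    Returns an empty dict when no cpufreq directories exist, for example
    on a kernel built without CONFIG_CPU_FREQ.
    """
    result = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as f:
            result[cpu] = f.read().strip()
    return result
```

Calling read_governors() on the test machine before each run and attaching the result to the numbers would make runs easier to compare.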
------- Comment From sripathi.com 2008-03-13 07:19 EDT-------
We are running some tests to get comparison numbers for NO_HZ. For CPU_FREQ, we have some numbers.

CONFIG_CPU_FREQ does not seem to have much impact on latencies on LS21 and HS21 blades. However, when run on an Intellistation zPro, we see a significant impact. I suspect it depends on how much power control the hardware supports. There is a setting in the BIOS of the zPro to disable power management, which makes it perform much better. However, we saw the best numbers when CONFIG_CPU_FREQ was disabled in the kernel.

Some numbers below. Except for one, they are from the rt-test suite that is now part of LTP.

test name || MRG kernel 2.6.24.3-29.el5rt || MRG with CPU_FREQ disabled
async handler || Max: 32 us / Avg: 8.0527 us || Max: 25 us / Avg: 5.1222 us
gtod latency || Max: 7 us / Avg: 0.5148 us || Max: 2 us / Avg: 0.1292 us
pi_perf || Max: 85 us / Avg: 46.69 us || Max: 60 us / Avg: 40.63 us
pthread_kill_lat || Max: 22 us / Avg: 8.1215 us || Max: 19 us / Avg: 5.7992
sched_jitter || Max jitter: 305.623993 us || Max jitter: 141.701004 us
sched_latency || Max: 9 us / Avg: 5 us || Max: 6 us / Avg: 3 us
matrix mult, Sequential || Max: 108219 / Avg: 107882.4609 us || Max: 71929 us / Avg: 71808.7734 us
matrix mult, Concurrent (4x) || Max: 27600 / Avg: 27097.2109 || Max: 18421 us / Avg: 18037.0293 us
matrix mult, Seq/Conc ratios || Max: 2.9210 / Avg: 3.9813 || Max: 3.9047 / Avg: 3.9812
Proprietary benchmark (100 runs) || 77/100 PASS || 95/100 PASS
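To put the averages above on a common scale, the relative reduction with CPU_FREQ disabled can be computed per test. A small sketch follows. The average values are copied from the numbers above, and the helper name reduction_pct is made up for illustration.

```python
# (stock MRG kernel average, CPU_FREQ-disabled average), copied from the
# numbers above. Where the original omits a unit, none is implied here.
AVERAGES = {
    "async handler": (8.0527, 5.1222),
    "gtod latency": (0.5148, 0.1292),
    "pi_perf": (46.69, 40.63),
    "pthread_kill_lat": (8.1215, 5.7992),
    "sched_latency": (5.0, 3.0),
    "matrix mult sequential": (107882.4609, 71808.7734),
    "matrix mult concurrent": (27097.2109, 18037.0293),
}


def reduction_pct(before, after):
    """Percentage by which 'after' is lower than 'before'."""
    return 100.0 * (before - after) / before


for name, (before, after) in AVERAGES.items():
    print("%-24s %5.1f%%" % (name, reduction_pct(before, after)))
```

This only restates the posted averages as percentages. It says nothing about run-to-run variance, which would need the raw samples.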
------- Comment From jstultz.com 2008-03-17 16:25 EDT-------
Sripathi: So does idle=poll along with the BIOS disable affect this at all?
------- Comment From jstultz.com 2008-03-25 19:25 EDT-------
As discussed on the call, we're now in line w/ the current MRG config for everything except RELOCATE and KDUMP (covered by ltc bug #42253 and RH bug #432378).

I think CONFIG_NUMA is fine to be left alone, as we can boot w/ numa=off if necessary (and we can still hunt down the performance issue in the meantime).

So I think this can be marked resolved.
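When comparing runs, it helps to record which of the boot options mentioned in this thread (idle=poll, numa=off) were actually in effect. Here is a minimal sketch that parses a kernel command line string such as the contents of /proc/cmdline. The function names are hypothetical, and the parser is deliberately naive: it splits on whitespace and does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        params[key] = value if sep else None
    return params


def has_setting(cmdline, key, value):
    """True if 'key=value' appears on the command line."""
    return parse_cmdline(cmdline).get(key) == value
```

For example, has_setting(open("/proc/cmdline").read(), "numa", "off") would report whether the running kernel was booted with numa=off.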