=Comment: #0================================================= Sripathi Kodi <sripathi.com> - 2008-02-25 10:31 EDT
*** This is a "Task" bug. It has been opened to track a particular task, not necessarily a bug. ***

Please evaluate whether enabling CONFIG_NUMA affects real-time latencies. RH has enabled this option in the MRG kernel. John Stultz has worked on this in bug #40270 but has not reached a definitive answer yet.

=Comment: #1================================================= Vernon Mauery <mauery.com> - 2008-02-28 19:30 EDT
I am currently gathering baseline results for the RH kernel on an LS20. This should finish up this evening, at which point I will boot a kernel that has CONFIG_NUMA disabled and start the tests again. I should have results to post here tomorrow.

=Comment: #2================================================= Vernon Mauery <mauery.com> - 2008-02-29 10:31 EDT
I must have done something wrong with the config file. My new kernel does not boot. I get this when trying to boot:

Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Not tainted 2.6.24-21 #1

Call Trace:
 [<ffffffff8023d200>] panic+0xaf/0x169
 [<ffffffff8049ff45>] do_page_fault+0x3f6/0x769
 [<ffffffff80335a05>] lock_list_del_init+0x7c/0xaf
 [<ffffffff80255de2>] blocking_notifier_call_chain+0xf/0x11
 [<ffffffff80240981>] do_exit+0x8d/0x823
 [<ffffffff802411a6>] sys_exit_group+0x0/0x14
 [<ffffffff802411b8>] sys_exit_group+0x12/0x14
 [<ffffffff8020c21e>] system_call+0x7e/0x83

=Comment: #3================================================= Vernon Mauery <mauery.com> - 2008-02-29 13:26 EDT
I think I was hit by the same ABAT bug as Darren was yesterday: I had an empty /etc/modprobe.conf file, so the newly installed initrd was not configured correctly. I have booted the CONFIG_NUMA=n kernel and will run 100 calibrate runs, as I did on the original kernel.
=Comment: #4================================================= Vernon Mauery <mauery.com> - 2008-02-29 18:41 EDT
I have run 100 full calibrate runs on the MRG kernel and another 100 on the MRG kernel with CONFIG_NUMA disabled.

Basic inspection:

vhmauery@elm3b213 $ grep SUMMARY logs.numa/* | grep -v "0 FAIL" | wc -l
38
vhmauery@elm3b213 $ grep SUMMARY logs.nonuma/* | grep -v "0 FAIL" | wc -l
19

We have twice as many runs with one or more tests failing when CONFIG_NUMA is enabled as when it is not.

Slightly more detailed results: I ran the results through calibrate/sum_results.py and diffed them. This is the output:

--- nonuma.results      2008-02-29 18:25:42.000000000 -0500
+++ numa.results        2008-02-29 18:25:52.000000000 -0500
@@ -16,7 +16,7 @@
     Checks abs(Start Latency) < 100 µs
         PASS: 100  FAIL: 0
     NHRT: Checks abs(Maximum Start) < 100 µs
-        PASS: 100  FAIL: 0
+        PASS: 99  FAIL: 1
     NHRT: Checks abs(Start Latency) < 100 µs
         PASS: 100  FAIL: 0
@@ -32,9 +32,9 @@
 Multi-Processor Performance
 ------------------------------
     Concurrent Time * 2.0 < Sequential Time
-        PASS: 98  FAIL: 2
+        PASS: 95  FAIL: 5
     XML: Concurrent Time * 2.0 < Sequential Time
-        PASS: 99  FAIL: 1
+        PASS: 93  FAIL: 7
 ------------------------------
 Just-In-Time Compilation Jitter
@@ -80,7 +80,7 @@
     Impact on scheduling latency, GC Latency < NO-GC Latency + 100 µs
         PASS: 100  FAIL: 0
     Impact on execution time. GC Duration < 1.1 NO-GC Duration (10% penalty)
-        PASS: 99  FAIL: 1
+        PASS: 100  FAIL: 0
 ------------------------------
 NoHeapRealtimeThread Memory Allocation
@@ -94,9 +94,9 @@
 Dispatch Latency
 ------------------------------
     Bound Handler Latency < 70 µs
-        PASS: 96  FAIL: 4
+        PASS: 83  FAIL: 17
     Async Handler Latency < 100 µs
-        PASS: 89  FAIL: 11
+        PASS: 79  FAIL: 21
 ------------------------------
 Memory Check Penalty

If necessary, I can go through and find actual latency numbers to back up my argument, but I think that even on this cursory inspection, this myth is busted!
We should tell RedHat to disable CONFIG_NUMA.

=Comment: #5================================================= Vernon Mauery <mauery.com> - 2008-02-29 18:42 EDT
I note that this also needs to be tested on an HS21.

=Comment: #6================================================= Sripathi Kodi <sripathi.com> - 2008-03-04 05:43 EDT
From the minutes of the MRG call, it looks like RH would like to keep this on. Is it possible to disable NUMA through a kernel command line option? I can't find any such option in kernel-parameters.txt.

=Comment: #7================================================= Vernon Mauery <mauery.com> - 2008-03-04 09:27 EDT
The final word on this bug from me. It appears that the machines affected most by this are either slow or AMD; I am not sure which affects the test results more. I would have to test on an LS21 to confirm, but the HS21 is faster either way. The HS21 failed 3% more tests with NUMA enabled.

--- results.nonuma      2008-03-03 17:44:51.000000000 -0500
+++ results.numa        2008-03-04 09:20:32.000000000 -0500
@@ -24,7 +24,7 @@
 Concurrency Jitter
 ------------------------------
     Checks (maximum - minimum) < 200 µs
-        PASS: 99  FAIL: 1
+        PASS: 100  FAIL: 0
     Checks Start Jitter < 200 µs
         PASS: 100  FAIL: 0
@@ -94,9 +94,9 @@
 Dispatch Latency
 ------------------------------
     Bound Handler Latency < 70 µs
-        PASS: 100  FAIL: 0
+        PASS: 99  FAIL: 1
     Async Handler Latency < 100 µs
-        PASS: 97  FAIL: 3
+        PASS: 96  FAIL: 4
 ------------------------------
 Memory Check Penalty

This could be statistical noise. To be sure, we would have to run 1000 runs of calibrate rather than 100, which would take about 33 hours (times 2 for the config change). CONFIG_NUMA definitely affects latency on an LS20, but only a little bit on the HS21.
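The "statistical noise" question above can be sanity-checked without 1000 extra runs: a two-proportion z-test on the pass/fail counts gives a rough significance estimate. This is a sketch, not part of the original analysis; it assumes the 100 calibrate runs are independent trials, which back-to-back runs on the same machine may not strictly be.

```python
from math import erf, sqrt

def two_proportion_z(fail_a, n_a, fail_b, n_b):
    """Two-sided two-proportion z-test on failure rates.

    Returns (z, p_value); a small p_value means the difference in
    failure rates is unlikely to be random noise.
    """
    p_a, p_b = fail_a / n_a, fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)            # pooled failure rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # 2 * (1 - Phi(|z|))
    return z, p_value

# LS20 Bound Handler Latency: 17/100 failures (NUMA) vs 4/100 (no NUMA)
print(two_proportion_z(17, 100, 4, 100))   # |z| is about 3: clearly significant
# HS21 Async Handler Latency: 4/100 (NUMA) vs 3/100 (no NUMA)
print(two_proportion_z(4, 100, 3, 100))    # |z| < 1: consistent with noise
```

On these numbers, the LS20 dispatch-latency regression is far outside noise, while the HS21 difference is not, which matches the conclusion above.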
------- Comment From sripathi.com 2008-03-17 14:42 EDT-------
The BIOS setting for "Memory Node Interleave" on our machines is "Disabled". This seems to be the default.
------- Comment From sripathi.com 2008-03-18 09:11 EDT-------
Some more numbers, this time from the rt-test tests that are part of LTP. CONFIG_NUMA did not make a significant impact on the runs on the HS21, but its impact was measurable on the LS21. I will attach an HTML file to this bug that shows the comparison.
Created attachment 298389 [details] Effect of CONFIG_NUMA on latencies on an LS21 machine
------- Comment From dvhltc.com 2008-03-18 10:13 EDT-------
(In reply to comment #13)
> The BIOS setting for "Memory Node Interleave" on our machines "Disabled". This
> seems to be the default.

I believe this is what Clark mentioned to me as his expectation given our results. Can we also run with Memory Node Interleave enabled to see how this affects the LS21 results?
------- Comment From sudhanshusingh.com 2008-03-20 07:56 EDT-------
(From update of attachment 35546)
Results are for the LS21. The calibrate and C tests were run 100 times, and the average/max was taken across those runs.
Created attachment 298687 [details] comarison_with_mem_node_interleave_bios_setting

The attachment contains a comparison of results with the memory node interleave BIOS setting enabled and disabled, for the MRG base kernel and the MRG kernel with the NUMA option turned off (all four permutations).
------- Comment From sripathi.com 2008-03-26 01:57 EDT-------
We decided on the mailing lists and the RH call that it is okay to leave NUMA turned ON. In case we discover problems later, we can use the numa=off boot parameter.
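If numa=off is ever used, it is worth verifying it took effect. One rough way (a sketch, assuming a standard Linux sysfs layout) is to count the node directories the kernel exposes; with NUMA disabled, the kernel should report at most one node:

```python
import os

def numa_node_count(sysfs_root="/sys/devices/system/node"):
    """Count NUMA nodes exposed in sysfs.

    A result of 0 or 1 suggests NUMA is off (or the machine is
    single-node); 2 or more means the kernel is NUMA-aware and
    sees multiple memory nodes.
    """
    try:
        entries = os.listdir(sysfs_root)
    except FileNotFoundError:
        return 0  # no node directory at all: no NUMA topology exposed
    # Node directories are named node0, node1, ...
    return sum(1 for d in entries if d.startswith("node") and d[4:].isdigit())
```

The path and naming convention are the usual ones on modern Linux, but they may differ on the 2.6.24-era kernels discussed here; checking /proc/cmdline for numa=off is an alternative.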
------- Comment From dvhltc.com 2008-03-26 11:37 EDT-------
Closing it (rejecting as "not a bug", since no change was made).
------- Comment From sripathi.com 2008-04-01 06:41 EDT-------
Moving this bug to FIX_BY_IBM.
Closing on our side.