LTC Owner is: dvhltc.com
LTC Originator is: dvhltc.com

Reported by perf team, needs to be validated and possibly fixed. This does not seem to be a problem on an x460, as reported by the BULL team.
-Darren
----------------------------------------------------------------------------------
I was able to boot with maxcpus=1 on elm3b102 (LS41):

dvhart@elm3b102:~$ uname -a
Linux elm3b102.beaverton.ibm.com 2.6.16-rtj12.11.3smp #1 SMP PREEMPT Tue Apr 24 14:08:21 PDT 2007 i686 athlon i386 GNU/Linux
dvhart@elm3b102:~$ cat /proc/cmdline
ro root=LABEL=/ console=tty0 console=ttyS1,19200 crashkernel=64M@16M maxcpus=1
dvhart@elm3b102:~$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 8212
stepping        : 2
cpu MHz         : 2000.276
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8legacy ts fid vid ttp tm stc
bogomips        : 4002.97

I'll try to get Mark Peloquin to describe the approach he took that failed, and how it failed.
-Darren
----------------------------------------------------------------------------------
maxcpus=2 also works
-Darren
----------------------------------------------------------------------------------
Setting maxcpus=1 on elm3b210 causes the system to hang on boot with the last message being:

pci_hotplug: PCI Hot Plug PCI Core Version: 0.5

The entry as seen in Grub on boot is:

kernel /boot/vmlinuz-2.6.20-0119.rt8 ro root=LABEL=/ console=tty0, console=ttyS1,19200 maxcpus=1

This appears to be quite a different kernel; the uname -a output is:

Linux elm3b210.beaverton.ibm.com 2.6.20-0119.rt8 #1 SMP PREEMPT Thu Feb 15 15:53:15 CET 2007 x86_64 x86_64 x86_64 GNU/Linux

It looks like the tests done by Darren were on an x86 2.6.16 base while my system has an x86_64 2.6.20 base.
-KARL
----------------------------------------------------------------------------------
confirmed on rhel5-rt 2.6.20-0119.rt8, trying with 2.6.21-rt now.
-Darren
----------------------------------------------------------------------------------
2.6.21-2.el5rt fails in the same place, trying stock rhel5 kernel.
-Darren
----------------------------------------------------------------------------------
maxcpus=1 works on stock RHEL5 (2.6.18-8.el5). This limitation with the -rt kernels is blocking -rt scalability analysis.
-Darren
----------------------------------------------------------------------------------
Also reproduced on an LS21. Adding initcall_debug to the boot line got the following extra output:

pci_hotplug: PCI Hot Plug PCI Core version: 0.5
initcall 0xffffffff817a42a4: pci_hotplug_init+0x0/0x5a() returned 0.
initcall 0xffffffff817a42a4 ran for 24 msecs: pci_hotplug_init+0x0/0x5a()
Calling initcall 0xffffffff817a46f2: fb_console_init+0x0/0x12b()
initcall 0xffffffff817a46f2: fb_console_init+0x0/0x12b() returned 0.
initcall 0xffffffff817a46f2 ran for 0 msecs: fb_console_init+0x0/0x12b()
Calling initcall 0xffffffff817a4c37: acpi_reserve_resources+0x0/0xeb()
initcall 0xffffffff817a4c37: acpi_reserve_resources+0x0/0xeb() returned 0.
initcall 0xffffffff817a4c37 ran for 0 msecs: acpi_reserve_resources+0x0/0xeb()
Calling initcall 0xffffffff817a5abd: acpi_fan_init+0x0/0x5e()
initcall 0xffffffff817a5abd: acpi_fan_init+0x0/0x5e() returned 0.
initcall 0xffffffff817a5abd ran for 0 msecs: acpi_fan_init+0x0/0x5e()
Calling initcall 0xffffffff817a5bf8: irqrouter_init_sysfs+0x0/0x38()
initcall 0xffffffff817a5bf8: irqrouter_init_sysfs+0x0/0x38() returned 0.
initcall 0xffffffff817a5bf8 ran for 0 msecs: irqrouter_init_sysfs+0x0/0x38()
Calling initcall 0xffffffff817a5d8f: acpi_processor_init+0x0/0xdf()
-John
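Note: in the initcall_debug output above, the hang point is the last "Calling initcall" line that has no matching "returned" line (here acpi_processor_init). For longer console captures, this can be found mechanically. The following is only a minimal sketch, not a tool anyone on this bug used: the function name find_unfinished_initcalls is hypothetical, and the regular expressions assume the exact line format quoted in this report, which may differ in other kernel versions.

```python
import re

# Patterns assume the initcall_debug line format quoted in this bug report.
CALLING = re.compile(r"Calling initcall (0x[0-9a-f]+): (\S+?)\+0x")
RETURNED = re.compile(
    r"initcall (0x[0-9a-f]+): (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+\(\) returned"
)


def find_unfinished_initcalls(log_text):
    """Return names of initcalls that were called but never logged a return."""
    pending = {}  # initcall address -> function name, in call order
    for line in log_text.splitlines():
        match = CALLING.search(line)
        if match:
            pending[match.group(1)] = match.group(2)
            continue
        match = RETURNED.search(line)
        if match:
            # A return with no recorded call (log truncated at the top) is ignored.
            pending.pop(match.group(1), None)
    return list(pending.values())


# A few lines copied from the output above.
sample = """\
initcall 0xffffffff817a42a4: pci_hotplug_init+0x0/0x5a() returned 0.
Calling initcall 0xffffffff817a5bf8: irqrouter_init_sysfs+0x0/0x38()
initcall 0xffffffff817a5bf8: irqrouter_init_sysfs+0x0/0x38() returned 0.
Calling initcall 0xffffffff817a5d8f: acpi_processor_init+0x0/0xdf()
"""

print(find_unfinished_initcalls(sample))
```

On a serial console capture of a hung boot, the one name left over is the initcall to start tracing from.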
----- Additional Comments From dvhltc.com 2007-05-19 03:12 EDT -------
I tested mainline 2.6.21 and it does boot with maxcpus=1. 2.6.21-rt1, 2, and 4 all hang at:

Calling initcall 0xffffffff817a41df: acpi_processor_init+0x0/0xdf()

when booted with initcall_debug. I traced this only as far as the call to acpi_bus_register_driver. Since mainline boots and the -rt kernels do not, this was definitely introduced by the -rt patch. I'm not sure if I should try to find out when it was introduced (as 2.6.16-rt22 does not fail) or if I should head "down the acpi rabbit hole" as John S. put it...
Created attachment 155114
maxcpus-ignore-offline-cpus.patch
----- Additional Comments From dvhltc.com 2007-05-21 12:43 EDT -------
Ignore bogus acpi info

Thomas Gleixner provided the attached patch. When I first booted with this patch I received the following in a loop:

irq 9: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 [<ffffffff8106d5a4>] dump_trace+0xaa/0x32a
 [<ffffffff8106d865>] show_trace+0x41/0x5c
 [<ffffffff8106d895>] dump_stack+0x15/0x17
 [<ffffffff810c50b8>] __report_bad_irq+0x38/0x87
 [<ffffffff810c52cb>] note_interrupt+0x1c4/0x1fc
 [<ffffffff810c458d>] thread_simple_irq+0x6c/0x7e
 [<ffffffff810c4dc3>] do_irqd+0x14a/0x3e4
 [<ffffffff81033d3a>] kthread+0xf5/0x128
 [<ffffffff8105ff68>] child_rip+0xa/0x12
handlers:
[<ffffffff8117736e>] (acpi_irq+0x0/0x1b)

I then tried to boot with acpi=noirq and I got all the way to a login prompt. We have seen these "nobody cared" and child_rip dump issues before, so I think they are independent issues that should be tracked in their own bugs.
----- Additional Comments From dvhltc.com 2007-05-21 12:58 EDT -------
Ingo has included tglx's patch in 2.6.21-rt5
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |FIXEDAWAITINGTEST
         Resolution|                            |FIX_ALREADY_AVAIL

------- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-05-24 13:04 EDT -------
Verified fixed in 2.6.21-14.el5rt.