Bug 144668
Summary: | System doesn't reboot even if kernel.panic is > 0 on RHEL-4 Beta-2. | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Sumit Sharma <ssharma3> |
Component: | kernel | Assignee: | Dave Anderson <anderson> |
Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.2 | CC: | davej, linux26port, lockhart, ntachino, riel, rkenna, tburke |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHSA-2005-514 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-10-05 12:37:10 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 156322 |
Description
Sumit Sharma
2005-01-10 15:30:11 UTC
The kernel patch in question is this: --- linux-2.6.8/arch/x86_64/kernel/reboot.c.orig +++ linux-2.6.8/arch/x86_64/kernel/reboot.c @@ -120,7 +120,8 @@ void machine_shutdown(void) /* O.K Now that I'm on the appropriate processor, * stop all of the others. */ - smp_send_stop(); + if (!crashdump_mode()) + smp_send_stop(); #endif local_irq_disable(); @@ -130,15 +131,16 @@ void machine_shutdown(void) #endif disable_IO_APIC(); - - local_irq_enable(); + if (!crashdump_mode()) + local_irq_enable(); } void machine_restart(char * __unused) { int i; - machine_shutdown(); + if (!crashdump_mode()) /* temporary? */ + machine_shutdown(); /* Tell the BIOS if we want cold or warm reboot */ *((unsigned short *)__va(0x472)) = reboot_mode Note that before the patch was applied to machine_restart(), machine_shutdown() would be called unconditionally. With the patch, it is only called if we are *not* in diskdump/netdump mode. So if the netdump/diskdump patch were not applied, machine_shutdown() would be called, and as you have seen, would prevent the reboot operation from occurring. In other words, warm reboots would never work on x86_64 machines with that 2.6.9 code base, because, as I recall, the machine would hang in machine_shutdown(). By adding the netdump/diskdump patch, we avoid machine_restart() entirely (since the other cpus have been frozen and won't receive IP interrupts), and by your workaround, you essentially do the same thing. I will check whether the latest RHEL4 x86_64 kernel still is unable to do a warm reboot successfully. Can you confirm that when you call panic() from your kernel module that it is *not* in interrupt mode? Without configuring netdump or diskdump, can you confirm the what the results are if you: 1. Enter Alt-sysrq-b from the console device 2. Enter "echo b > /proc/sysrq-trigger" Never mind the tests requested above, I've found the problem. It's actually a bug in the upstream x86_64 kernel. When /proc/sys/kernel/panic is set, smp_send_stop() gets called twice when panic() is called -- once at the top of panic() itself, and if panic_timeout > 0, it gets called a second time via the machine_restart() to machine_shutdown() path. The x86_64 version of smp_send_stop() first sends IPI's to kill the other cpus with smp_really_stop_cpu(): void smp_send_stop(void) { int nolock = 0; /* Don't deadlock on the call lock in panic */ if (!spin_trylock(&call_lock)) { udelay(100); /* ignore locking because we have paniced anyways */ nolock = 1; } __smp_call_function(smp_really_stop_cpu, NULL, 1, 0); if (!nolock) spin_unlock(&call_lock); smp_stop_cpu(); } Note that smp_really_stop_cpu(), when it runs on the other cpus, calls smp_stop_cpu() before it actually shuts itself down: static void smp_really_stop_cpu(void *dummy) { smp_stop_cpu(); for (;;) asm("hlt"); } One of the functions of smp_stop_cpu() is to clear the processor's bit from the cpu_online_map: void smp_stop_cpu(void) { /* * Remove this CPU: */ cpu_clear(smp_processor_id(), cpu_online_map); local_irq_disable(); disable_local_APIC(); local_irq_enable(); } What's unique about x86_64 shutdown procedure, is that in the smp_send_stop() function above, after it shuts the other cpus down, it calls smp_stop_cpu() on *itself*, thereby removing its cpu bit from the cpu_online_map. That's the bug, because, upon calling smp_send_stop() the second time, it retries the __smp_call_function() to kill the other cpus with smp_really_stop_cpu(). That would normally be OK, but because the cpu_online_map is now equal to zero, the __smp_call_function() routine hangs because of this: static void __smp_call_function (void (*func) (void *info), void *info, int nonatomic, int wait) { static struct call_data_struct dumpdata; struct call_data_struct normaldata; struct call_data_struct *data; int cpus = num_online_cpus()-1; if (!cpus) return; Normally the calling cpu would have its bit set in the cpu_online_map, but because it's been cleared, num_online_cpus()-1 returns -1, as if there are 32 online cpus to send IPI's to. So it hangs waiting for them to respond. We'll fix this in an update. I verified it with latest update (RHEL4 U1), but the problem is still there. Seems that there is some other thing also involved in this matter. No reboot is taking place on calling panic explicitly. Exactly which kernel version are you testing this with? Kernel version is :- 2.6.9-6.28.ELsmp Ok -- the patch that was tested OK here is in 2.6.9-6.28. That patch prevented the cpu_online map being set to 0 prematurely, which would cause a hang in smp_call_function(). It's not obvious what has changed. Anyway, I'll re-open this case, and re-investigate... In this case, we have a different situation than the cpu_online map issue that was fixed before. Here, if the panic cpu is not the boot_cpu_id cpu, the hang occurs like this: 1. panic on a cpu other than boot_cpu_id. 2. panic() calls smp_send_stop(), which shuts down all other cpus, including the boot_cpu_id cpu. 3. panic() (if panic_timeout) then calls machine_restart(). 4. machine_restart() calls smp_halt() -- presumably to shut down the other cpus. (In a non-panic() situation, the other cpus would still be running.) 5. smp_halt() does this: #ifdef CONFIG_SMP static void smp_halt(void) { int cpuid = safe_smp_processor_id(); static int first_entry = 1; if (first_entry) { first_entry = 0; smp_call_function((void *)machine_restart, NULL, 1, 0); } smp_stop_cpu(); /* AP calling this. Just halt */ if (cpuid != boot_cpu_id) { for (;;) asm("hlt"); } /* Wait for all other CPUs to have run smp_stop_cpu */ while (!cpus_empty(cpu_online_map)) rep_nop(); } #endif Since the other cpus have already halted, the "first_entry" smp_call_function() is a no-op. Then, the subsequent (cpuid != boot_cpu_id) check causes all but the boot_cpu_id cpu to halt as well, so there is no return to machine_restart() to continue the reboot operation. So, if the panic cpu is not the boot_cpu_id cpu, then we hang... (actually all cpus have done a "hlt"). Presuming that the reboot operation can be done by other than the boot_cpu_id cpu, it looks like the best solution is to avoid calling smp_halt() if all the other cpus have already been halted, and let the non-boot cpu do the reboot operation. To be tested... The proposed fix is to have the "first_entry" smp_halt() return if the number of online cpus is 1, so that the machine_restart() can proceed: --- linux-2.6.9/arch/x86_64/kernel/reboot.c.orig 2005-04-07 15:02:02.297123520 -0400 +++ linux-2.6.9/arch/x86_64/kernel/reboot.c 2005-04-07 15:08:29.346283104 -0400 @@ -100,12 +100,15 @@ static void smp_halt(void) { int cpuid = safe_smp_processor_id(); - static int first_entry = 1; + static int first_entry = 1; - if (first_entry) { - first_entry = 0; - smp_call_function((void *)machine_restart, NULL, 1, 0); - } + if (first_entry) { + first_entry = 0; + /* If nobody's alive, just return to machine_restart */ + if (num_online_cpus() == 1) + return; + smp_call_function((void *)machine_restart, NULL, 1, 0); + } smp_stop_cpu(); It should also be noted that the kernel that the original patch was applied to (and which is still valid) did not contain the smp_halt() function above. smp_halt() replaced the former machine_shutdown() function, which did not have the (cpuid != boot_cpu_id) check. The patch in comment 14 isn't included in the latest U1-candidate kernel (2.6.9-6.39.EL) and although it has been posted to rhkernel-list, hasn't received any acks, so throwing this back into ASSIGNED. The above patch looks ok to me, but more appropriate patch may be this :- --- arch/x86_64/kernel/reboot.c.orig 2005-05-26 19:51:50.134178960 +0530 +++ arch/x86_64/kernel/reboot.c 2005-05-26 19:54:19.567461648 +0530 @@ -95,17 +95,18 @@ static void smp_halt(void) { int cpuid = safe_smp_processor_id(); - static int first_entry = 1; + static long first_entry; + static int restart_cpuid; + + if (!test_and_set_bit(0, &first_entry)) { + restart_cpuid = cpuid; + smp_call_function((void *)machine_restart, NULL, 1, 0); + } - if (first_entry) { - first_entry = 0; - smp_call_function((void *)machine_restart, NULL, 1, 0); - } - smp_stop_cpu(); /* AP calling this. Just halt */ - if (cpuid != boot_cpu_id) { + if (cpuid != restart_cpuid) { for (;;) asm("hlt"); } Because if we apply the patch at comment # 14, then comparison "if (cpuid != boot_cpu_id)" becomes unnecessary and this comparison forces us to assume that boot_cpu_id is online when machine_restart() is called. This suggested patch will remove that assumption also. I didn't want to make any assumptions re: the upstream mechanism which implies that all cpus except the boot_cpu_id should "hlt", and the boot_cpu_id should handle the machine_restart(). But in the panic case, we've got no choice but to allow the only remaining cpu running to attempt the machine_restart(). An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-514.html |