Red Hat Bugzilla – Bug 140112
RHEL4 U1: init 0/poweroff not working
Last modified: 2013-08-05 21:09:32 EDT
Description of problem:
init 0 and poweroff no longer work on Dell PowerEdge 4600, 1600SC,
1750. These functions did work properly on RHEL4 Beta 1.
The function acpi_power_off is in /drivers/acpi/sleep/poweroff.c. It
is supposed to place the system in the S5 power state. There were
several changes made between Beta1 and Beta2
in /drivers/acpi/hardware/hwsleep.c, specifically in
acpi_enter_sleep_state(). When these changes are reverted to the Beta1
code, the system powers down properly.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL4 Beta 2 on a Dell PowerEdge (4600, 1600SC, 1750)
2. Issue an 'init 0' or 'poweroff'
Actual Results: System shuts down services, displays:
then hangs. The system does not power down.
Per Dell today: Amit says that they will retest this with the 751
kernel and post status.
The 751 kernel does not work, the system still hangs at
Is it OK with Dell if we allow Len Brown of Intel (upstream ACPI maintainer)
to access this issue?
Yes, these are shipping systems.
Adding Len Brown and Geoff Gustafson of Intel to the cc: list. Len:
need you to jump in here, please!
init 0/poweroff work with the following kernels:
does not work with:
There's definitely a difference in the _PTS method on our systems:
The 1750 and 4600 (systems that are failing) have similar (complex)
_PTS methods. All other systems that work have simple _PTS methods.
000001aa: Method _PTS (\_PTS)
000001b1: ArgCount 1; NotSerialized
000001bb: WKSL (000001a0)
000001c0: WKEN (000001a5)
000001c6: WKEN (000001a5)
000001cf: WKSL (000001a0)
000001d4: WKEN (000001a5)
000001da: WKEN (000001a5)
00000118: Method _PTS (\_PTS)
0000011f: ArgCount 1; NotSerialized
0000008d: Method _PTS (\_PTS)
00000094: ArgCount 1; NotSerialized
Created attachment 108082 [details]
ACPI debug output during init 0 on PE1750
System is hanging within acpi_enter_sleep_state_prep; the call to
acpi_evaluate_object("_PTS") never returns.
Bumping this to a Sev 1 as we reproduced this on a 2800 (X26 BIOS),
but have not seen it on a 2850.
Reproduced on the following systems:
init 0 works on the base 2.6.9 and 2.6.10 kernels, as well as the Beta1
kernel. Reverting to the base 2.6.9 ACPI code did NOT fix the issue, so
it is being caused by something else.
Are there any kernel.org kernels that fail,
or is this regression specific to the RHEL4 kernels?
Is it always true that upon a failure, the system
never returns from acpi_evaluate_object("_PTS")?
I don't have an archive of RHEL kernel trees here;
and it isn't clear that there are analogous upstream
kernel changes associated with this failure.
Can you attach the before/after of the source changes
you made to make the regression go away?
re: comment #11
is the 2600 an example of a system with a simple _PTS
that does not fail?
I wonder if this is due to the changes in acpi_os_sleep()...
2850 works, 2800 does not. Both have simple _PTS methods.
Correct: in the testing that I've seen, the system does not return
from the _PTS method.
Looks like the problem might be in linux-2.6.9-kexec.patch. At
shutdown, the 8259 is masked off on all interrupts.
The i8259A_shutdown function is a new function.
So far I've been able to isolate it to masking off IRQ4 or IRQ5 (is
this disabling the SMI?)
Huh? i8259A_shutdown? -- in the (remote) event you mean "lapic_shutdown",
then note that if this 2.6.10-ism got into RHEL4, it may need to be
updated per these two 2.6.10 patches:
Where can I find a copy of linux-2.6.9-kexec.patch?
No, masking IRQ5 or IRQ6 should have no effect on SMI.
However, SMM is very tricky, and sometimes it is tricked
out when the OS makes changes to hardware state that
the BIOS didn't expect.
BTW, are all the machines that fail SMP?
If so, do they still fail if booted with "maxcpus=1",
or with "maxcpus=1" "nolapic"?
Yeah, it's bizarre. There are two lines in i8259A_shutdown that
mask off the interrupt bits:
If I comment out the first line (0x21) then the system shuts down
properly. I have also tried various combinations:
0xFF - fails
0xF0 - system shuts down
0xC0 - fails
which seems odd to me. Maybe a combination of interrupts working
The problem is seen in both SMP and UP kernels.
The linux-2.6.9-kexec.patch is in the source of the -648 kernel.
Install the source RPM; the patch is then in /usr/src/redhat/SOURCES.
0xFE works, so I'm wondering: is this a timer tick issue? Does the ACPI
code require the timer tick to be active?
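
For readers following the mask values tried in this thread, here is a minimal sketch that decodes them. It assumes the byte is written to the master 8259A interrupt mask register (port 0x21), where a set bit n masks IRQ n, and that IRQ0 is the timer tick as on a standard PC. The function names are hypothetical and this is a user-space illustration, not the kernel's code.

```python
# Illustrative helper, not kernel code: decode a master 8259A interrupt
# mask register (IMR) byte. A set bit n masks (disables) IRQ n.

def masked_irqs(imr):
    """Return the list of IRQ lines (0-7) masked by the given IMR byte."""
    if not 0 <= imr <= 0xFF:
        raise ValueError("IMR is a single byte")
    return [irq for irq in range(8) if imr & (1 << irq)]


def timer_tick_masked(imr):
    """Assumption: IRQ0 is the timer tick, so bit 0 of the IMR masks it."""
    return bool(imr & 0x01)


if __name__ == "__main__":
    # The four values reported in this thread.
    for value in (0xFF, 0xF0, 0xC0, 0xFE):
        print("0x%02X: masked IRQs %s, timer tick masked: %s"
              % (value, masked_irqs(value), timer_tick_masked(value)))
```

Of the four values, only 0xFF masks IRQ0. Bit 0 alone therefore fits the 0xFF, 0xF0 and 0xFE results, but it does not account for the 0xC0 failure reported above.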
Problem occurs during call to acpi_os_sleep. Unfortunately
i8259A_shutdown has already turned off the timer tick interrupt, so
the call to schedule_timeout() in osl.c:acpi_os_sleep never returns.
This causes a hang instead of the system power off.
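
The failure mode described in this comment can be sketched with a toy model: a sleeper waits for a number of timer ticks, but the tick counter only advances when IRQ0 is unmasked. Everything here (the function name, the parameters, the tick counts) is hypothetical and greatly simplified. It is a user-space illustration under those assumptions, not how schedule_timeout() or acpi_os_sleep are implemented.

```python
# Toy user-space model, not kernel code. All names are hypothetical.

def sleep_returns(imr, timeout_ticks, hardware_ticks):
    """Model a sleeper that waits for `timeout_ticks` timer interrupts.

    `hardware_ticks` is how many times the timer hardware fires during the
    simulation. Each firing advances the tick counter only if IRQ0 (bit 0
    of the IMR byte) is unmasked. Returns True if the deadline is reached,
    False if the sleeper would still be waiting when the simulation ends.
    """
    jiffies = 0
    deadline = jiffies + timeout_ticks
    for _ in range(hardware_ticks):
        if not imr & 0x01:
            # IRQ0 is delivered, so the tick handler runs.
            jiffies += 1
        if jiffies >= deadline:
            return True
    return False


if __name__ == "__main__":
    print("IMR 0xFE:", sleep_returns(0xFE, 10, 1000))
    print("IMR 0xFF:", sleep_returns(0xFF, 10, 1000))
```

With IRQ0 masked the counter never moves, so the modelled sleep never completes no matter how long the hardware keeps ticking. That matches the hang seen in place of power off.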
Any updates on this? Does anyone know why interrupts are masked off
Changing the title to reflect the Update in which a fix for this
issue has been committed or is being tracked.
same issue tracked upstream in mm tree:
I dropped the kexec patch in the latest builds. this should fix the issue.
(it did when the same thing affected Fedora).
Raghavendra from Dell has run regression tests and confirms that this issue is resolved
in U1 beta. Closing. Thanks!
I am still seeing this issue with a PowerEdge 1600SC (latest A12 BIOS) and
RHEL4, kernel 2.6.9-5.0.5.EL. Does the "U1 beta" mentioned above, which contains the
fix, imply a later patch/version of the kernel than this?
Yes, U1 beta was kernel-2.6.9-6.37.EL.