Bug 195878

Summary: "in_atomic():1, irqs_disabled():0" kernel oops on kernel 2.6.16-1.2115_FC4smp on x86_64
Product: Fedora
Component: kernel
Version: 4
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Aleksander Adamowski <bugs-redhat>
Assignee: Dave Jones <davej>
QA Contact: Brian Brock <bbrock>
CC: pfrields, wtogami
Doc Type: Bug Fix
Last Closed: 2006-06-26 14:56:39 UTC

Description Aleksander Adamowski 2006-06-19 06:36:39 UTC
Description of problem:

On an HP DL385 dual-core Opteron server running Fedora Core 4, I upgraded the
kernel to version 2.6.16-1.2115_FC4smp a couple of days ago.

Now it outputs the following error in the kernel log every couple of hours:

Jun 19 00:04:31 hostname kernel: in_atomic():1, irqs_disabled():0
Jun 19 00:04:31 hostname kernel:
Jun 19 00:04:31 hostname kernel: Call Trace: <IRQ>
<ffffffff8015c58a>{audit_log_exit+416}
Jun 19 00:04:31 hostname kernel:        <ffffffff8015d9a3>{audit_free+282}
<ffffffff801329eb>{__put_task_struct_cb+205}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8014838e>{__rcu_process_callbacks+303}
<ffffffff8014843e>{rcu_process_callbacks+35}
Jun 19 00:04:31 hostname kernel:        <ffffffff8013b1bd>{tasklet_action+102}
<ffffffff8013b30d>{__do_softirq+96}
Jun 19 00:04:31 hostname kernel:        <ffffffff8010bdde>{call_softirq+30}
<ffffffff8010cd6c>{do_softirq+44}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8010b738>{apic_timer_interrupt+132} <EOI>
<ffffffff80175290>{__anon_vma_link+42}
Jun 19 00:04:31 hostname kernel:        <ffffffff8017101d>{vma_adjust+1016}
<ffffffff8014ad20>{autoremove_wake_function+0}
Jun 19 00:04:31 hostname kernel:        <ffffffff801711df>{split_vma+303}
<ffffffff801718d7>{do_munmap+299}
Jun 19 00:04:31 hostname kernel:        <ffffffff8035ac0c>{__down_write+55}
<ffffffff8035b268>{_spin_lock_irqsave+9}
Jun 19 00:04:31 hostname kernel:        <ffffffff80171aa5>{sys_munmap+81}
<ffffffff8010ab71>{tracesys+209}
Jun 19 00:04:31 hostname kernel:
Jun 19 00:04:31 hostname kernel: Call Trace: <IRQ>
<ffffffff80358ec1>{schedule+125} <ffffffff801362fc>{printk+84}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8015346d>{module_text_address+51} <ffffffff8035abb5>{__down_read+180}
Jun 19 00:04:31 hostname kernel:        <ffffffff8015c592>{audit_log_exit+424}
<ffffffff8015d9a3>{audit_free+282}
Jun 19 00:04:31 hostname kernel:       
<ffffffff801329eb>{__put_task_struct_cb+205}
<ffffffff8014838e>{__rcu_process_callbacks+303}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8014843e>{rcu_process_callbacks+35} <ffffffff8013b1bd>{tasklet_action+102}
Jun 19 00:04:31 hostname kernel:        <ffffffff8013b30d>{__do_softirq+96}
<ffffffff8010bdde>{call_softirq+30}
Jun 19 00:04:31 hostname kernel:        <ffffffff8010cd6c>{do_softirq+44}
<ffffffff8010b738>{apic_timer_interrupt+132} <EOI>
Jun 19 00:04:31 hostname kernel:        <ffffffff80175290>{__anon_vma_link+42}
<ffffffff8017101d>{vma_adjust+1016}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8014ad20>{autoremove_wake_function+0} <ffffffff801711df>{split_vma+303}
Jun 19 00:04:31 hostname kernel:        <ffffffff801718d7>{do_munmap+299}
<ffffffff8035ac0c>{__down_write+55}
Jun 19 00:04:31 hostname kernel:        <ffffffff8035b268>{_spin_lock_irqsave+9}
<ffffffff80171aa5>{sys_munmap+81}
Jun 19 00:04:31 hostname kernel:        <ffffffff8010ab71>{tracesys+209}

Also, when the following error co-occurred in the logs, the kernel panicked:

Jun 19 00:04:31 hostname kernel: scheduling while atomic: spamd/0x00000100/23927

This is quite a serious bug.

Version-Release number of selected component (if applicable):

kernel-2.6.16-1.2115_FC4.x86_64

How reproducible:

100% every couple of hours
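
To check how often the warning recurs and which functions appear in the traces, the kernel log can be scanned for the messages quoted above. The following is only a minimal sketch, assuming syslog-style lines like the ones in this report; the helper names `count_markers` and `extract_frames` are hypothetical and not part of any existing tool.

```python
import re
from collections import Counter

# Marker strings copied from the log excerpts in this report.
MARKERS = ("in_atomic():", "scheduling while atomic:")

# A stack frame in these logs looks like <ffffffff8015c58a>{audit_log_exit+416}
FRAME_RE = re.compile(r"<([0-9a-f]+)>\{(\w+)\+(\d+)\}")


def count_markers(lines):
    """Count how many log lines contain each marker string."""
    counts = Counter()
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


def extract_frames(lines):
    """Return (function, offset) pairs in the order they appear in the log."""
    frames = []
    for line in lines:
        for _addr, func, offset in FRAME_RE.findall(line):
            frames.append((func, int(offset)))
    return frames


if __name__ == "__main__":
    # Sample lines taken from the trace above; a real run would read the
    # system's kernel log file instead (its path is system-specific).
    sample = [
        "Jun 19 00:04:31 hostname kernel: in_atomic():1, irqs_disabled():0",
        "Jun 19 00:04:31 hostname kernel: Call Trace: <IRQ>",
        "<ffffffff8015c58a>{audit_log_exit+416}",
        "Jun 19 00:04:31 hostname kernel:        <ffffffff8015d9a3>{audit_free+282}",
        "<ffffffff801329eb>{__put_task_struct_cb+205}",
    ]
    print(count_markers(sample))
    print(extract_frames(sample))
```

Because the frame pattern is matched per line, frames that the logger wrapped onto a continuation line without the syslog prefix are still picked up.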

Comment 1 Aleksander Adamowski 2006-06-19 06:38:20 UTC
Raising severity if you don't mind.

Comment 2 Aleksander Adamowski 2006-06-19 06:42:03 UTC
The stacktrace I've posted was from the panic, not from the usual oops.

The stacktrace from the oops looks like this:

Jun 19 08:46:48 hostname kernel: in_atomic():1, irqs_disabled():0
Jun 19 08:46:48 hostname kernel:
Jun 19 08:46:48 hostname kernel: Call Trace: <IRQ>
<ffffffff8015c58a>{audit_log_exit+416}
Jun 19 08:46:48 hostname kernel:        <ffffffff8015d9a3>{audit_free+282}
<ffffffff801329eb>{__put_task_struct_cb+205}
Jun 19 08:46:48 hostname kernel:       
<ffffffff8014838e>{__rcu_process_callbacks+303}
<ffffffff8014843e>{rcu_process_callbacks+35}
Jun 19 08:46:48 hostname kernel:        <ffffffff8013b1bd>{tasklet_action+102}
<ffffffff8013b30d>{__do_softirq+96}
Jun 19 08:46:48 hostname kernel:        <ffffffff8010bdde>{call_softirq+30}
<ffffffff8010cd6c>{do_softirq+44}
Jun 19 08:46:48 hostname kernel:       
<ffffffff8010b738>{apic_timer_interrupt+132} <EOI>


Comment 3 Aleksander Adamowski 2006-06-19 06:53:37 UTC
To see the bigger picture of recent kernel regressions, consider the following:

2.6.16-1.2096_FC4smp was stable on this server,
2.6.16-1.2111_FC4smp exhibited bug 195876 (not severe),
2.6.16-1.2115_FC4smp exhibits bug 195878 (severe).

The situation has been getting worse with each of the last 2 kernel releases for FC4.


Comment 4 Dave Jones 2006-06-26 14:56:39 UTC
Should be fixed in the 2.6.17 update.