Bug 195878

Summary:	"in_atomic():1, irqs_disabled():0" kernel oops on kernel 2.6.16-1.2115_FC4smp on x86_64
Product:	[Fedora] Fedora	Reporter:	Aleksander Adamowski <bugs-redhat>
Component:	kernel	Assignee:	Dave Jones <davej>
Status:	CLOSED ERRATA	QA Contact:	Brian Brock <bbrock>
Severity:	high	Docs Contact:
Priority:	medium
Version:	4	CC:	pfrields, wtogami
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2006-06-26 14:56:39 UTC	Type:	---

Description Aleksander Adamowski 2006-06-19 06:36:39 UTC

Description of problem:

On an HP DL385 Dual Core Opteron server running Fedora Core 4, I upgraded the
kernel to version 2.6.16-1.2115_FC4smp a couple of days ago.

Since then, it has written the following error to the kernel log every couple of hours:

Jun 19 00:04:31 hostname kernel: in_atomic():1, irqs_disabled():0
Jun 19 00:04:31 hostname kernel:
Jun 19 00:04:31 hostname kernel: Call Trace: <IRQ>
<ffffffff8015c58a>{audit_log_exit+416}
Jun 19 00:04:31 hostname kernel:        <ffffffff8015d9a3>{audit_free+282}
<ffffffff801329eb>{__put_task_struct_cb+205}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8014838e>{__rcu_process_callbacks+303}
<ffffffff8014843e>{rcu_process_callbacks+35}
Jun 19 00:04:31 hostname kernel:        <ffffffff8013b1bd>{tasklet_action+102}
<ffffffff8013b30d>{__do_softirq+96}
Jun 19 00:04:31 hostname kernel:        <ffffffff8010bdde>{call_softirq+30}
<ffffffff8010cd6c>{do_softirq+44}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8010b738>{apic_timer_interrupt+132} <EOI>
<ffffffff80175290>{__anon_vma_link+42}
Jun 19 00:04:31 hostname kernel:        <ffffffff8017101d>{vma_adjust+1016}
<ffffffff8014ad20>{autoremove_wake_function+0}
Jun 19 00:04:31 hostname kernel:        <ffffffff801711df>{split_vma+303}
<ffffffff801718d7>{do_munmap+299}
Jun 19 00:04:31 hostname kernel:        <ffffffff8035ac0c>{__down_write+55}
<ffffffff8035b268>{_spin_lock_irqsave+9}
Jun 19 00:04:31 hostname kernel:        <ffffffff80171aa5>{sys_munmap+81}
<ffffffff8010ab71>{tracesys+209}
Jun 19 00:04:31 hostname kernel:
Jun 19 00:04:31 hostname kernel: Call Trace: <IRQ>
<ffffffff80358ec1>{schedule+125} <ffffffff801362fc>{printk+84}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8015346d>{module_text_address+51} <ffffffff8035abb5>{__down_read+180}
Jun 19 00:04:31 hostname kernel:        <ffffffff8015c592>{audit_log_exit+424}
<ffffffff8015d9a3>{audit_free+282}
Jun 19 00:04:31 hostname kernel:       
<ffffffff801329eb>{__put_task_struct_cb+205}
<ffffffff8014838e>{__rcu_process_callbacks+303}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8014843e>{rcu_process_callbacks+35} <ffffffff8013b1bd>{tasklet_action+102}
Jun 19 00:04:31 hostname kernel:        <ffffffff8013b30d>{__do_softirq+96}
<ffffffff8010bdde>{call_softirq+30}
Jun 19 00:04:31 hostname kernel:        <ffffffff8010cd6c>{do_softirq+44}
<ffffffff8010b738>{apic_timer_interrupt+132} <EOI>
Jun 19 00:04:31 hostname kernel:        <ffffffff80175290>{__anon_vma_link+42}
<ffffffff8017101d>{vma_adjust+1016}
Jun 19 00:04:31 hostname kernel:       
<ffffffff8014ad20>{autoremove_wake_function+0} <ffffffff801711df>{split_vma+303}
Jun 19 00:04:31 hostname kernel:        <ffffffff801718d7>{do_munmap+299}
<ffffffff8035ac0c>{__down_write+55}
Jun 19 00:04:31 hostname kernel:        <ffffffff8035b268>{_spin_lock_irqsave+9}
<ffffffff80171aa5>{sys_munmap+81}
Jun 19 00:04:31 hostname kernel:        <ffffffff8010ab71>{tracesys+209}

Also, when the following error appeared in the logs at the same time, the kernel panicked:

Jun 19 00:04:31 hostname kernel: scheduling while atomic: spamd/0x00000100/23927

This is quite a serious bug.

Version-Release number of selected component (if applicable):

kernel-2.6.16-1.2115_FC4.x86_64

How reproducible:

Always; the error recurs every couple of hours.

Comment 1 Aleksander Adamowski 2006-06-19 06:38:20 UTC

Raising severity if you don't mind.

Comment 2 Aleksander Adamowski 2006-06-19 06:42:03 UTC

The stack trace I posted in the description was from the panic, not from the usual oops.

The stack trace from the oops looks like this:

Jun 19 08:46:48 hostname kernel: in_atomic():1, irqs_disabled():0
Jun 19 08:46:48 hostname kernel:
Jun 19 08:46:48 hostname kernel: Call Trace: <IRQ>
<ffffffff8015c58a>{audit_log_exit+416}
Jun 19 08:46:48 hostname kernel:        <ffffffff8015d9a3>{audit_free+282}
<ffffffff801329eb>{__put_task_struct_cb+205}
Jun 19 08:46:48 hostname kernel:       
<ffffffff8014838e>{__rcu_process_callbacks+303}
<ffffffff8014843e>{rcu_process_callbacks+35}
Jun 19 08:46:48 hostname kernel:        <ffffffff8013b1bd>{tasklet_action+102}
<ffffffff8013b30d>{__do_softirq+96}
Jun 19 08:46:48 hostname kernel:        <ffffffff8010bdde>{call_softirq+30}
<ffffffff8010cd6c>{do_softirq+44}
Jun 19 08:46:48 hostname kernel:       
<ffffffff8010b738>{apic_timer_interrupt+132} <EOI>
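
For anyone comparing the wrapped traces in this report, here is a minimal sketch that pulls the frames out of a pasted log. It assumes the frame format seen above, "<16 hex digits>{symbol+offset}", and the names FRAME_RE and extract_frames are made up for illustration; they are not part of any kernel tool.

```python
import re

# One frame looks like "<ffffffff8015c58a>{audit_log_exit+416}".
# Context markers such as <IRQ> and <EOI> do not match because they
# lack the 16-digit hex address.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{([A-Za-z0-9_.]+)\+(\d+)\}")


def extract_frames(log_text):
    """Return (address, symbol, offset) tuples in order of appearance.

    Line wrapping and syslog prefixes are ignored, since the pattern
    only looks at the frames themselves.
    """
    return [(addr, sym, int(off)) for addr, sym, off in FRAME_RE.findall(log_text)]


# Excerpt copied from the oops trace in this comment.
sample = """\
Jun 19 08:46:48 hostname kernel: Call Trace: <IRQ>
<ffffffff8015c58a>{audit_log_exit+416}
Jun 19 08:46:48 hostname kernel:        <ffffffff8015d9a3>{audit_free+282}
<ffffffff801329eb>{__put_task_struct_cb+205}
"""

for addr, sym, off in extract_frames(sample):
    print(f"{sym}+{off} at 0x{addr}")
```

Running the same function over both traces makes it easier to see that the panic trace contains the oops frames plus the schedule and __down_read frames.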

Comment 3 Aleksander Adamowski 2006-06-19 06:53:37 UTC

To see the bigger picture of recent kernel regressions, consider the following:

2.6.16-1.2096_FC4smp was stable on this server,
2.6.16-1.2111_FC4smp exhibited bug 195876 (not severe),
2.6.16-1.2115_FC4smp exhibits bug 195878 (severe).

The situation has gotten worse with each of the last two kernel releases for FC4.

Comment 4 Dave Jones 2006-06-26 14:56:39 UTC

This should be fixed in the 2.6.17 update.