123137 – Assertion failure in log_do_checkpoint


Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Stephen Tweedie
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	162814

Reported:	2004-05-12 20:58 UTC by Dan Christian
Modified:	2007-11-30 22:10 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-09-10 15:08:14 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
oops output (2.10 KB, text/plain) 2004-05-12 21:01 UTC, Dan Christian	no flags

Description Dan Christian 2004-05-12 20:58:26 UTC

Description of problem:
Assertion failure in log_do_checkpoint() at fs/jbd/checkpoint.c:361:
"drop_count != 0 || cleanup_ret != 0"
------------[ cut here ]------------
kernel BUG at fs/jbd/checkpoint.c:361!

Version-Release number of selected component (if applicable):
2.6.5-1.327

How reproducible:
Rare

The system was a dual Xeon with an AMI MegaRAID RAID controller.  The
file systems are ext3.

I'll attach the oops output in a second.

Comment 1 Dan Christian 2004-05-12 21:01:27 UTC

Created attachment 100200
oops output

Oops output when this happened.
The system load was probably around 3.
Uptime was less than a day (due to an unrelated reboot).

Comment 2 Dave Jones 2004-05-13 14:35:15 UTC

There were quite a few ext3-related changes in later kernels.
I'm not guaranteeing they fix this problem, but it makes more sense to
test -358 if you can.

Comment 3 Stephen Tweedie 2004-09-10 15:08:14 UTC

No information given about later kernels, so closing: please reopen if
you can still reproduce this problem.

Comment 4 Stephen Frost 2004-10-19 19:58:26 UTC

I've been bitten by this problem under both 2.6.8.1 and 2.6.9 now.  I
don't have an oops from 2.6.9 yet (I'll check once I get home to see
whether it was logged over the serial console), but here is one from
2.6.8.1:
Assertion failure in log_do_checkpoint() at fs/jbd/checkpoint.c:361:
"drop_count != 0 || cleanup_ret != 0"
kernel BUG at fs/jbd/checkpoint.c:361!
invalid operand: 0000 [#1]
Oops:
------------[ cut here ]------------
SMP
Modules linked in: ipt_REDIRECT ipt_REJECT iptable_nat iptable_mangle iptable_filter ipt_state ipt_pkttype ipt_physdev ipt_multiport ipt_conntrack ipt_MARK ipt_LOG ip_conntrack ip_tables 8250 serial_core snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ehci_hcd uhci_hcd usbcore intel_agp agpgart eeprom lm85 i2c_sensor i2c_i801 i2c_dev i2c_core pcspkr
CPU:    1
EIP:    0060:[log_do_checkpoint+364/459]    Not tainted
EFLAGS: 00010286   (2.6.8.1-vs1.9.2kenobi.3)
EIP is at log_do_checkpoint+0x16c/0x1cb
eax: 0000006e   ebx: 00000000   ecx: c036ad04   edx: c036ad04
esi: 00000000   edi: 00000001   ebp: c932d83c   esp: e3a9fd0c
ds: 007b   es: 007b   ss: 0068
Process sendmail (pid: 9628, threadinfo=e3a9e000 task=e10c3770)
Stack: c03323c0 c031be9d c03301f7 00000169 c0335200 00294867 c1a87180 00000000
       00000000 e498574c c0476120 00000000 00000003 c180c0a0 c180cd60 c015a341
       dedcbf5c dedcbf5c dedcbf5c f314ae3c dedcbf5c c01a60ac f700f4e0 f314ae3c
Call Trace:
 [wake_up_buffer+23/83] wake_up_buffer+0x17/0x53
 [do_get_write_access+645/1583] do_get_write_access+0x285/0x62f
 [wake_up_buffer+23/83] wake_up_buffer+0x17/0x53
 [find_busiest_group+234/806] find_busiest_group+0xea/0x326
 [ext3_do_update_inode+517/1094] ext3_do_update_inode+0x205/0x446
 [radix_tree_delete+325/398] radix_tree_delete+0x145/0x18e
 [__log_wait_for_space+199/218] __log_wait_for_space+0xc7/0xda
 [start_this_handle+290/954] start_this_handle+0x122/0x3ba
 [find_get_pages+55/90] find_get_pages+0x37/0x5a
 [pagevec_lookup+46/56] pagevec_lookup+0x2e/0x38
 [truncate_inode_pages+289/696] truncate_inode_pages+0x121/0x2b8
 [journal_start+171/210] journal_start+0xab/0xd2
 [locks_delete_lock+139/221] locks_delete_lock+0x8b/0xdd
 [start_transaction+35/88] start_transaction+0x23/0x58
 [locks_remove_posix+239/268] locks_remove_posix+0xef/0x10c
 [ext3_delete_inode+0/230] ext3_delete_inode+0x0/0xe6
 [ext3_delete_inode+39/230] ext3_delete_inode+0x27/0xe6
 [ext3_delete_inode+0/230] ext3_delete_inode+0x0/0xe6
 [generic_delete_inode+147/316] generic_delete_inode+0x93/0x13c
 [iput+98/124] iput+0x62/0x7c
 [dput+231/403] dput+0xe7/0x193
 [__fput+179/260] __fput+0xb3/0x104
 [filp_close+89/134] filp_close+0x59/0x86
 [sys_close+94/113] sys_close+0x5e/0x71
 [syscall_call+7/11] syscall_call+0x7/0xb
Code: 0f 0b 69 01 f7 01 33 c0 eb b8 8d 44 24 1c 8d 54 24 24 89 44
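
As a cross-check on the trace above, the first bytes of the Code: line
can be decoded by hand.  This is only a sketch, and it assumes the
2.6-era i386 BUG() layout: a ud2 opcode (0f 0b) followed by a 16-bit
line number and a 32-bit pointer to the file name string, both
little-endian.  The variable names are illustrative.

```python
import struct

# First 8 bytes of the "Code:" line from the 2.6.8.1 oops.
code = bytes.fromhex("0f 0b 69 01 f7 01 33 c0")

# Assumed layout (i386 BUG() in 2.6-era kernels, not confirmed in this report):
#   2 bytes  ud2 opcode
#   u16      source line number, little-endian
#   u32      pointer to the file name string, little-endian
opcode, line, file_ptr = struct.unpack("<2sHI", code)

print("opcode:", opcode.hex())
print("line:", line)
print("file pointer:", hex(file_ptr))
```

Under that assumption the line number decodes to 361, which agrees with
the "fs/jbd/checkpoint.c:361" in the BUG message, and the pointer value
c03301f7 also appears as the third word on the Stack: line.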

Comment 5 Stephen Frost 2004-10-20 00:11:59 UTC

Alright, just happened again, that's twice in one day...
