127615 – 2.6.7: cfq io scheduler paniced?

Summary: 2.6.7: cfq io scheduler paniced?

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Arjan van de Ven
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-07-10 23:54 UTC by Kaj J. Niemi
Modified:	2007-11-30 22:10 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-07-29 09:21:06 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
one of the two pdflushes vanished (16.90 KB, image/png) 2004-07-11 00:09 UTC, Kaj J. Niemi	no flags
spectacular load spike (16.09 KB, image/png) 2004-07-11 00:10 UTC, Kaj J. Niemi	no flags
meanwhile interface traffic dropped but did not stop completely (18.69 KB, image/png) 2004-07-11 00:10 UTC, Kaj J. Niemi	no flags
rapid rise of tcp sessions established and in close wait (15.50 KB, image/png) 2004-07-11 00:15 UTC, Kaj J. Niemi	no flags
just the time wait sessions (12.34 KB, image/png) 2004-07-11 00:18 UTC, Kaj J. Niemi	no flags

Description Kaj J. Niemi 2004-07-10 23:54:51 UTC

Description of problem:
A four-way (2x Xeon DP with HT) system panicked tonight in the
following fashion:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
0221fef1
*pde = 00003001
Oops: 0000 [#1]
SMP
Modules linked in: iptable_filter e1000 ipt_REDIRECT iptable_nat
ip_conntrack ip_tables floppy sg microcode dm_mod button battery
asus_acpi ac ext3 jbd sata_sil libata sd_mod scsi_mod
CPU:    2
EIP:    0060:[<0221fef1>]    Not tainted
EFLAGS: 00010293   (2.6.7-1.476smp)
EIP is at cfq_get_queue+0x28/0x98
eax: 00000000   ebx: 00000034   ecx: c1c834d8   edx: 00000000
esi: 00002008   edi: 00000220   ebp: 04d6dc64   esp: 19aa4cb0
ds: 007b   es: 007b   ss: 0068
Process pdflush (pid: 8200, threadinfo=19aa4000 task=39ec17b0)
Stack: 04d6dc64 00000220 61f1bd10 04dbe3ac 022201f7 022201d7 04d91bb4 00000001
       022173ef 61f1bd10 02218f7e 04d91c40 00000000 00000220 00000008 00000000
       00000000 0000007b c1069704 04d91bb4 00000008 00000000 02219b72 1579dd17
Call Trace:
 [<022201f7>] cfq_set_request+0x20/0x63
 [<022201d7>] cfq_set_request+0x0/0x63
 [<022173ef>] elv_set_request+0xa/0x17
 [<02218f7e>] get_request+0x18b/0x2b0
 [<02219b72>] __make_request+0x2de/0x4d6
 [<02219ef6>] generic_make_request+0x18c/0x19c
 [<02219fd0>] submit_bio+0xca/0xd2
 [<02160092>] submit_bh+0x60/0x103
 [<0215ecb8>] __block_write_full_page+0x1dd/0x2c4
 [<02162d62>] blkdev_get_block+0x0/0x46
 [<0215ffcd>] block_write_full_page+0xc5/0xce
 [<02162d62>] blkdev_get_block+0x0/0x46
 [<0217cef1>] mpage_writepages+0x157/0x272
 [<02162e45>] blkdev_writepage+0x0/0xc
 [<02141fe4>] do_writepages+0x19/0x27
 [<0217b48c>] __sync_single_inode+0x84/0x1f8
 [<02129cbf>] process_timeout+0x0/0x5
 [<0217b6db>] __writeback_single_inode+0xdb/0xe1
 [<0217b887>] sync_sb_inodes+0x1a6/0x2be
 [<02142ab0>] pdflush+0x0/0x1e
 [<0217baaa>] writeback_inodes+0x10b/0x1ce
 [<02161ba3>] sync_supers+0xf7/0x137
 [<02141e7f>] wb_kupdate+0x89/0xec
 [<021429c5>] __pdflush+0x1b9/0x2a4
 [<02142aca>] pdflush+0x1a/0x1e
 [<02141df6>] wb_kupdate+0x0/0xec
 [<02142ab0>] pdflush+0x0/0x1e
 [<0213458d>] kthread+0x73/0x9b
 [<0213451a>] kthread+0x0/0x9b
 [<021041f1>] kernel_thread_helper+0x5/0xb
Code: 8b 02 0f 18 00 90 39 ca 74 0d 39 72 14 75 04 89 d0 eb 06 8b
 <6>TCP: too many of orphaned sockets
TCP: too many of orphaned sockets
TCP: too many of orphaned sockets
TCP: too many of orphaned sockets
TCP: too many of orphaned sockets

The load spiked to an artificial value of about 250.

The interesting thing to note was that out-of-band (OOB) console
sessions kept working, the UDP side of the network stack kept working
(SNMP requests were serviced), and the system responded to ICMP
requests normally, but the TCP stack got completely hosed and all new
connections were refused. Connections already established were kept in
the state tables but were not serviced at all.
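
For triage, the EIP and call trace entries above follow the usual
"[<address>] symbol+0xoffset/0xsize" layout. A minimal Python sketch,
assuming that layout, pulls the symbols and offsets out of a pasted
trace. The name parse_trace is illustrative only and is not part of
any kernel tooling.

```python
import re

# Matches entries such as " [<022201f7>] cfq_set_request+0x20/0x63".
TRACE_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_trace(text):
    """Return (address, symbol, offset, size) tuples, one per trace entry."""
    entries = []
    for match in TRACE_RE.finditer(text):
        addr, symbol, offset, size = match.groups()
        entries.append((int(addr, 16), symbol, int(offset, 16), int(size, 16)))
    return entries


# Three entries copied from the call trace in this report.
sample = """\
 [<022201f7>] cfq_set_request+0x20/0x63
 [<022173ef>] elv_set_request+0xa/0x17
 [<02218f7e>] get_request+0x18b/0x2b0
"""

for addr, symbol, offset, size in parse_trace(sample):
    print(f"{symbol}: offset {offset:#x} of {size:#x} bytes at {addr:#010x}")
```

Comparing the symbol and offset pairs between two reports is a quick
way to tell whether both crashes went through the same code path.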

Version-Release number of selected component (if applicable):
kernel-smp-2.6.7-1.476

Additional info:

Comment 1 Kaj J. Niemi 2004-07-11 00:09:39 UTC

Created attachment 101783
one of the two pdflushes vanished

There are two pdflush pseudo-processes running, of which one died according to
the panic message.

Comment 2 Kaj J. Niemi 2004-07-11 00:10:05 UTC

Created attachment 101784
spectacular load spike

The load started going up at the same time.

Comment 3 Kaj J. Niemi 2004-07-11 00:10:46 UTC

Created attachment 101785
meanwhile interface traffic dropped but did not stop completely

Comment 4 Kaj J. Niemi 2004-07-11 00:15:05 UTC

Created attachment 101786
rapid rise of tcp sessions established and in close wait

TCP sessions in TIME_WAIT are not shown, as they would throw off the graph.

Comment 5 Kaj J. Niemi 2004-07-11 00:18:25 UTC

Created attachment 101787
just the time wait sessions

TCP sessions in TIME_WAIT dropped to zero at the same time the CLOSE_WAIT
sessions flatlined (about 00:20 local time in the graph).

Comment 6 Kaj J. Niemi 2004-07-13 13:30:27 UTC

Happened again on another, otherwise identical system. Unfortunately
the console was dead, so nothing could be read from it. The following
got logged to syslog, though.

kernel: Debug: sleeping function called from invalid context at mm/mempool.c:197
kernel: in_atomic():0[expected: 0], irqs_disabled():1
kernel:  [<0211f978>] __might_sleep+0x7d/0x87
kernel:  [<0213f8f3>] mempool_alloc+0x6a/0x198
kernel:  [<021441e6>] poison_obj+0x1d/0x3d
kernel:  [<0211ff27>] autoremove_wake_function+0x0/0x2d
kernel:  [<02145982>] cache_alloc_debugcheck_after+0xcf/0x103
kernel:  [<0211ff27>] autoremove_wake_function+0x0/0x2d
kernel:  [<0213f904>] mempool_alloc+0x7b/0x198
kernel:  [<02220c34>] __cfq_get_queue+0x53/0x98
kernel:  [<02220cc8>] cfq_get_queue+0x4f/0x86
kernel:  [<02220f95>] cfq_set_request+0x20/0x63
kernel:  [<02220f75>] cfq_set_request+0x0/0x63
kernel:  [<02218107>] elv_set_request+0xa/0x17
kernel:  [<02219c82>] get_request+0x18b/0x2b0
kernel:  [<02219e24>] get_request_wait+0x7d/0xb9
kernel:  [<0211ff27>] autoremove_wake_function+0x0/0x2d

Seems to be reproducible, although rather unreliably, just by
installing a new kernel with "rpm -ivh kernel-*.rpm" on the 476, 478
and 481 kernels.

Comment 7 Warren Togami 2004-07-13 13:36:41 UTC

To be more certain and to isolate the problem, can you try booting
with the anticipatory or deadline elevator instead and see if it
survives?

Comment 8 Kaj J. Niemi 2004-07-13 13:40:19 UTC

Would that be elevator=anticipatory and/or elevator=deadline?

Comment 9 Kaj J. Niemi 2004-07-13 13:45:06 UTC

Hmmm I think anticipatory is actually "elevator=as".
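
To confirm which elevator a box was actually booted with, the kernel
command line can be checked for the elevator= option. The sketch
below is a minimal Python example under that assumption; the helper
names elevator_from_cmdline and describe are made up for
illustration, and the table lists the values 2.6-era kernels are
generally documented to accept (as, deadline, cfq, noop).

```python
# elevator= values accepted by 2.6-era kernels, mapped to display labels.
# The labels are for readability only; the kernel sees just the short name.
ELEVATORS = {
    "as": "anticipatory",
    "deadline": "deadline",
    "cfq": "cfq",
    "noop": "noop",
}


def elevator_from_cmdline(cmdline):
    """Return the elevator= value from a kernel command line, or None."""
    chosen = None
    for token in cmdline.split():
        if token.startswith("elevator="):
            # In this sketch the last occurrence on the line wins.
            chosen = token.split("=", 1)[1]
    return chosen


def describe(cmdline):
    """Return a short human-readable description of the selected elevator."""
    value = elevator_from_cmdline(cmdline)
    if value is None:
        return "no elevator= option; the kernel default is used"
    return ELEVATORS.get(value, f"unrecognised value {value!r}")


# Hypothetical command lines, not taken from the affected systems.
print(describe("ro root=/dev/sda1 elevator=deadline"))
print(describe("ro root=/dev/sda1 elevator=anticipatory"))
```

On a running system the input would be the contents of /proc/cmdline,
for example describe(open("/proc/cmdline").read()).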

Comment 10 Kaj J. Niemi 2004-07-16 10:50:52 UTC

Booting with elevator=deadline has had the server up for 2+ days with
simulated load (about 600 java threads, net-snmp full table walks
against it). I installed 492, booted with elevator=deadline and will
see what happens.

Comment 11 Kaj J. Niemi 2004-07-16 10:54:27 UTC

Btw, all the panicking systems are Supermicro 6013P-T systems. A lot of
companies also OEM these and sell them as their own.

Comment 12 Kaj J. Niemi 2004-07-26 12:38:24 UTC

With elevator=deadline the uptimes are now around 12+ days.

Comment 13 Warren Togami 2004-07-26 12:41:00 UTC

If this issue is not resolved with the latest rawhide kernels, you can
help by bringing this report to the attention of upstream lkml and the
CFQ author Jens Axboe.
