Red Hat Bugzilla – Full Text Bug Listing
|Summary:||Kernel panic in shrink_cache / __remove_inode_page / refile_inode|
|Product:||[Fedora] Fedora||Reporter:||Aleksander Adamowski <bugs-redhat>|
|Component:||kernel||Assignee:||Arjan van de Ven <arjanv>|
|Status:||CLOSED WONTFIX||QA Contact:|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2004-09-29 16:26:29 EDT||Type:||---|
Description Aleksander Adamowski 2004-05-17 05:35:59 EDT
Description of problem:
Recently (seemingly since the Fedora kernel revision 2188 release) we've been experiencing frequent kernel panics on one of our production systems. The frequency is around 1-2 times a week, but today we've experienced two of them in a row, the second one occurring only half an hour after the first.

Using a digital camera I've managed to take a screenshot of the console where a stacktrace is visible. The stacktrace indicates that the panic occurred in the shrink_cache function, and shows the following function calls:

shrink_cache
shrink_caches
try_to_free_pages_zone
ip_local_deliver
balance_classzone
__alloc_pages
__alloc_pages
do_wp_page
do_swap_page
handle_mm_fault
do_page_fault
generic_file_new_read
file_read_actor
generic_file_read
sys_pread
smp_apic_timer_interrupt
do_page_fault
error_code

I'll attach the screenshot and some files related to system and hardware configuration shortly.

Version-Release number of selected component (if applicable):
2.4.22-1.2188.nptlsmp

How reproducible:
Random; occurs about 1-2 times a week.

Additional info:
Probably the most relevant additional info is that the system runs on hardware RAID5 storage on a 3Ware 8506-4LP controller. The driver for that controller isn't the one shipped with the Fedora kernel (3Ware driver v1.02.00.036) but the latest version manually compiled from sources (v1.02.00.037); the difference between these two versions of the driver is minimal (I'll attach a diff between them). No other modifications to the stock Fedora kernel and kernel modules have been made.
Comment 1 Aleksander Adamowski 2004-05-17 05:43:04 EDT
Created attachment 100261 [details] Differences between 3Ware driver v1.02.00.036 and v1.02.00.037

v1.02.00.036 is in the stock Fedora kernel, but we have to use the latest, v1.02.00.037. We had to update the controller's firmware because it hard-locked, and 3Ware strongly advises updating the OS driver to the latest version even before updating the controller's firmware.
Comment 2 Aleksander Adamowski 2004-05-17 05:46:11 EDT
Created attachment 100262 [details] Photo of console with kernel stacktrace after panic
Comment 3 Aleksander Adamowski 2004-05-17 05:53:05 EDT
Created attachment 100263 [details] dmesg file from the machine
Comment 4 Aleksander Adamowski 2004-05-17 05:53:58 EDT
Created attachment 100264 [details] output from lspci -vv
Comment 5 Aleksander Adamowski 2004-05-17 05:54:18 EDT
Created attachment 100265 [details] output from dmidecode
Comment 6 Aleksander Adamowski 2004-05-17 05:58:25 EDT
More detailed hardware specification:
CPU: dual Pentium 4 Xeon 2 GHz with Hyperthreading (4 virtual CPUs)
RAM: 1 GB (2 x 512 MB Kingston DDR with parity control)
Motherboard: Intel SE7501BR2
NIC: Intel PRO/100 Server adapter integrated on the motherboard
Storage: hardware RAID 5 array on a 3Ware 8506-4LP controller, built from 4 Seagate Serial ATA 120 GB drives
Comment 7 Aleksander Adamowski 2004-05-17 05:58:55 EDT
Created attachment 100266 [details] /etc/sysconfig/hwconf file
Comment 8 Aleksander Adamowski 2004-05-26 06:43:22 EDT
Created attachment 100572 [details] Another kernel panic

This one occurred today; this time the system was running in a higher-resolution text mode, so I was able to capture the full text of the kernel panic.
Comment 10 Aleksander Adamowski 2004-05-28 10:11:14 EDT
Created attachment 100666 [details] Another kernel panic that occurred today on kernel-2.4.22-1.2188.nptlsmp

After this panic I've installed the updated kernel 2.4.22-1.2190.nptlsmp, which apparently resolves the problem (according to bug 121732).
Comment 11 Aleksander Adamowski 2004-05-31 15:29:00 EDT
Another panic, in refile_inode, occurred just today on kernel-2.4.22-1.2190.nptlsmp. The problem has not been resolved. I'll attach a screenshot tomorrow morning.
Comment 12 Aleksander Adamowski 2004-06-01 05:05:42 EDT
Created attachment 100732 [details] Yesterday's panic on kernel-2.4.22-1.2190.nptlsmp
Comment 13 Aleksander Adamowski 2004-06-01 07:05:46 EDT
Here's the text of the latest panic with 2190, for better searchability and readability:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01691ae
*pde = 0e723067
*pte = 00000000
Oops: 0002
e100 iptable_mangle ipt_REJECT ipt_multiport ipt_state ip_conntrack iptable_filter ip_tables floppy sg microcode keybdev mousedev hid input usb-uhci usbcore e
CPU:    2
EIP:    0060:[<c01691ae>]    Not tainted
EFLAGS: 00010246
EIP is at refile_inode [kernel] 0x4e (2.4.22-1.2190.nptlsmp)
eax: 00000000   ebx: e28fb900   ecx: 00000000   edx: e28fb908
esi: c0376028   edi: c0374fd8   ebp: 0000772e   esp: c3193de0
ds: 0068   es: 0068   ss: 0068
Process spamd (pid: 31686, stackpage=c3193000)
Stack: c19187a0 e20fb9c4 c013c642 e28fb900 c19187a0 00000000 c19187a0 c01461ba
       c19187a0 000001d2 c3192000 000005c3 000001d2 00000012 0000001d 000001d2
       c0374fd8 c0374fd8 c01464aa c3193e4c 000001d2 0000003c 00000020 c0146522
Call Trace:   [<c013c642>] __remove_inode_page [kernel] 0x82 (0xc3193de8)
  [<c01461ba>] shrink_cache [kernel] 0x30a (0xc3193dfc)
  [<c01464aa>] shrink_caches [kernel] 0x4a (0xc3193e28)
  [<c0146522>] try_to_free_pages_zone [kernel] 0x62 (0xc3193e3c)
  [<c0147102>] balance_classzone [kernel] 0x52 (0xc3193e60)
  [<c0147438>] __alloc_pages [kernel] 0x188 (0xc3193e7c)
  [<c010e968>] call_do_IRQ [kernel] 0x5 (0xc3193e88)
  [<c0139b5f>] do_wp_page [kernel] 0x6f (0xc3193ebc)
  [<c013a666>] handle_mm_fault [kernel] 0x106 (0xc3193ee0)
  [<c011c94c>] do_page_fault [kernel] 0x14c (0xc3193f0c)
  [<c011e9c0>] scheduler_tick [kernel] 0x120 (0xc3193f28)
  [<c0107b3f>] __switch_to [kernel] 0x16f (0xc3193f44)
  [<c011ed8f>] schedule [kernel] 0x7f (0xc3193f68)
  [<c012e42e>] update_process_times [kernel] 0x3e (0xc3193f84)
  [<c011c800>] do_page_fault [kernel] 0x0 (0xc3193fb0)
  [<c0109c18>] error_code [kernel] 0x34 (0xc3193fb8)
Code: 89 01 c7 43 08 00 00 00 00 89 48 04 8b 06 89 50 04 89 43 08
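[Editor's note, not part of the original comment: the "Code:" bytes above can be decoded with standard binutils; at the time, ksymoops did this automatically, and the path /tmp/oops-code.bin below is only an example. The first instruction, 89 01, is mov %eax,(%ecx); with ecx = 00000000 that is exactly the NULL-pointer write the oops reports.]

```shell
# Dump the raw opcode bytes from the "Code:" line and disassemble them as i386.
printf '\x89\x01\xc7\x43\x08\x00\x00\x00\x00\x89\x48\x04\x8b\x06\x89\x50\x04\x89\x43\x08' \
    > /tmp/oops-code.bin
objdump -D -b binary -m i386 /tmp/oops-code.bin
# first instruction: 89 01 -> mov %eax,(%ecx); with ecx=0 this is the NULL write
```

The remaining instructions (movl $0x0,0x8(%ebx); mov %ecx,0x4(%eax); ...) are pointer-relinking stores consistent with list manipulation inside refile_inode.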
Comment 14 Aleksander Adamowski 2004-06-04 05:30:52 EDT
Created attachment 100862 [details] refile_inode kernel-2.4.22-1.2190 panic from today
Comment 15 Aleksander Adamowski 2004-06-17 03:49:27 EDT
Possible fix: a 3Ware support engineer has pointed out that this issue may have been fixed in kernel 2.4.26:

"In the changelog for 2.4.26, there was a bug in refile_inode() that was fixed. I would recommend you try this kernel. Below is the changelog:

Marcelo Tosatti:
  Trond: Avoid refile_inode() from putting locked inodes on the dirty list
  Changed EXTRAVERSION to -rc1"
Comment 16 Dave Jones 2004-06-17 07:02:55 EDT
That patch was merged in the 2190 kernel. It made no difference.
Comment 17 Aleksander Adamowski 2004-06-18 03:56:24 EDT
I've asked the author of that patch about the new issue; here is his response:

---SNIP---
On Thu, 17/06/2004 at 11:01, Aleksander Adamowski wrote:
> Hi!
>
> I've seen that you've fixed a bug in the Linux 2.4 kernel related to
> refile_inode() (fix applied to kernel-2.4.26).
>
> There's still a related nasty crasher bug in refile_inode(); see this
> Red Hat Bugzilla bug:
> http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=123332

I'm not really a VFS person. That said, it looks to me from the dump you sent that refile_inode is calling list_del(&inode->i_list) on an inode that has already been removed from all lists. Normally, such an inode is supposed to be marked as I_FREEING...

I couldn't find any code in the 2.4.27-pre series that appeared to be able to put the inode in this bogus state. Somebody else will have to audit the RedHat kernels to see if they have any such bugs. 8-)

Cheers,
Trond
---SNIP---
Comment 18 Andrew Ryan 2004-08-17 20:36:01 EDT
We just saw the same bug; note that it is with the 2179 kernel with the refile_inode patch. We can't duplicate this panic in QA yet, only in production :( I'll attach a decoded oops.
Comment 19 Andrew Ryan 2004-08-17 20:37:37 EDT
Created attachment 102818 [details] decoded oops from this panic

Decoded oops from the panic we had, which appears to be the same as the one reported by the submitter of this bug, and different from issue 121732.
Comment 20 Andrew Ryan 2004-08-19 17:50:03 EDT
Our system experiencing this problem is an HP DL380 G3 with:
* 2x 2.8 GHz Xeon processors with Hyperthreading on
* 4 GB RAM
* internal HP hardware RAID-1 (cciss driver)

So it would appear that the 3Ware driver is not the problem. Since Trond is rarely wrong, I'm assuming that the problem has been fixed in Linux 2.4.27, which means it was fixed sometime between the 2.4.22-23 series and 2.4.27, but that the fix was not merged into the FC1 kernel. Going through the kernel changelogs on kernel.org line by line, I found two changesets that appear to be significant. Since I am not a kernel hacker I cannot confirm that the errors we're experiencing are caused by the lack of the two patches referenced below, but I have a feeling that a kernel hacker with VFS knowledge could confirm this relatively quickly.

In particular: fixed in 2.4.25-pre7 (2.4.25 release) by Rik van Riel, with the comment "some more fixes for fs/inode.c inode reclaiming changes". This bug does exactly what Trond refers to, calling list_del(&inode->i_list); the question is whether that inode could have already been removed from all lists, and that I do not know.

Rik's original post: http://www.ussg.iu.edu/hypermail/linux/kernel/0401.2/0962.html
David Woodhouse's followup and approval: http://www.ussg.iu.edu/hypermail/linux/kernel/0401.2/0970.html
A diff of the fs/inode.c code resulting from the above mailing list postings: http://source.scl.ameslab.gov:email@example.com?nav=index.html|ChangeSet@-9Mfirstname.lastname@example.org|hist/fs/inode.c

In addition there is a second inode-cache-related bugfix that seems like it belongs in the FC1 kernel, also from 2.4.25, fixed by David Woodhouse: "Do not leave inodes with stale waitqueue on slab cache":
http://source.scl.ameslab.gov:email@example.com?nav=index.html|ChangeSet@-9Mfirstname.lastname@example.org|hist/fs/inode.c

Both of the above patches apply cleanly to the 2179-2199 kernels (fs/inode.c wasn't changed between those versions).
My biggest problem right now is that I can't duplicate the oops in a controlled environment. It happens once a week across all of our dozen or so servers running this kernel. I've got a test machine now running ltp, dbench, kernel compiles, and other processes to try to duplicate this oops, but I haven't seen it in 2 straight days of testing. It's not clear that the error was seen much (if at all) in the wild; it looks like Rik fixed it before many people noticed. From an email exchange with Aleksander, he can't duplicate this problem in a controlled setting either; it happens about twice per month for him. Ideally a VFS person could look at the above patches and just say "yes, this patch needs to be applied to the FC1 kernel, it could cause that oops".
Comment 21 Andrew Ryan 2004-08-19 21:04:05 EDT
After further looking, I found that the second fix, "do not leave inodes with stale waitqueue on slab cache", was already included in the FC1.2199 kernel. I will attach the patch to FC1.2199 that implements Rik's fix, which I'm testing now. Note that we do not use quotas, so the second part of his fix is not relevant to us, I don't think.
Comment 22 Andrew Ryan 2004-08-19 21:05:08 EDT
Created attachment 102912 [details] patch which implements Rik's inode reclaim patch from 2.4.25
Comment 23 Aleksander Adamowski 2004-08-20 08:59:57 EDT
Unfortunately I cannot test this fix, as I've switched to the RHEL kernel on that machine to remedy the panics.
Comment 24 Aleksander Adamowski 2004-08-20 10:45:28 EDT
For the record, we're running the 2.4.21-15.ELsmp RHEL kernel to avoid the panics.
Comment 25 Andrew Ryan 2004-09-27 13:00:08 EDT
Running with the FC1.2199 kernel that implements Rik's refile_inode fix, we've had 4 weeks (28 days) of uptime without a crash. The best we were doing before was 1 week, and often less than that. If anyone is still running/updating the FC1 kernel, this would be a good patch to apply...
Comment 26 David Lawrence 2004-09-29 16:26:29 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/