Bug 119457
Summary:          panics in generic_aio_complete_rw and unmap_kvec after __iodesc_free calls generic_aio_complete_read()
Product:          Red Hat Enterprise Linux 3
Component:        kernel
Version:          3.0
Hardware:         i686
OS:               Linux
Status:           CLOSED ERRATA
Severity:         high
Priority:         medium
Reporter:         Chuck Berg <cberg>
Assignee:         Jeff Moyer <jmoyer>
QA Contact:       Brian Brock <bbrock>
CC:               djuran, ggrignou, petrides, riel, tao, tcallawa, vanhoof
Fixed In Version: RHSA-2006-0437
Doc Type:         Bug Fix
Bug Blocks:       181405, 186960
Last Closed:      2006-07-20 13:14:10 UTC
Description (Chuck Berg, 2004-03-30 17:57:43 UTC)
Created attachment 98988 [details]
netdump log of kernel panic
A complete panic message this time.
Great. Do you have the netdump image as well? Could you compress it and attach it if you do?

Unfortunately, I didn't get the netdump image - the netdump server didn't have enough disk space. I assumed http://www.redhat.com/support/wpapers/redhat/netdump/setup.html was current documentation and that only a 4GB dump would be saved. I'm banging on the machine some more; I'll hopefully get the netdump image the next time it crashes.

OK, we need to update the documentation - that bug was fixed some time ago. Of course, I'm not sure how well the dump will compress, so it may be a moot point. ;-) On a side note, could you open up bugs on the shortcomings of the qla2300 driver that ships with RHEL 3? We are updating the driver, and we would like to ensure that we get the fixes that you need. Thanks!

I got a crash dump. It compresses to over 2GB, so I probably shouldn't create an attachment that large (and anyway, Mozilla won't upload a file larger than 2GB). You can get it from:

ftp://rhcrash:rhcrash@nitftp.knight-sec.com/goldinst-vmcore-20040401.gz

and the panic message is:

ftp://rhcrash:rhcrash@nitftp.knight-sec.com/goldinst-panic-20040401.txt

I only have this machine until Monday; it's an eval unit from HP (just so you know, in case you need me to test a patch or collect more information). I opened a case on that qla2300 driver.
> Storage is a dual HP cciss card, with many 1 and 2GB LVM volumes
> (striped between the RAID volume on each controller).
I read this as you had configured each cciss separately, then used the
Linux LVM to stripe your volumes across the two cciss devices. Is
that correct?
From the dump, it looks like the errant process is accessing
/dev/hpdisk/s1g09 (i.e. opened a block device, not an lvm partition).
Does this make sense given your current configuration?
Has the box gone back to HP yet? Any chance you can keep it longer?
Thanks!
Jeff
That's correct about the volume layout. hpdisk is an LVM volume group. It contains physical volumes /dev/cciss/c1d0 and /dev/cciss/c2d0. /dev/hpdisk/s1g09 is a 1GB LVM volume (striped between those two physical volumes). Sybase would have opened this particular volume through /dev/raw/raw9. I have the machine until tomorrow (Tuesday) morning now. I doubt I can keep it longer, but I'll see.

If you still have the machine, let me know, and we'll turn on the slab debugging. If not, I'll continue to try to reproduce this in-house. Thanks.

I no longer have this machine, nor access to any similarly large x86 machine. Thanks for your help.

My name is Gwendal Grignou; I am working for Silverback Systems, where I wrote the driver for our new iSCSI chip, the iSNAP 2110. Our test setup is two RHEL machines, one running as the SCSI initiator, the other as the target. I open 20 iSCSI sessions (which map to 20 TCP connections and 20 SCSI host driver interfaces). We use Iometer to pump traffic, through a patched version of dynamo, the Iometer back-end thread that generates the traffic. After a few seconds, the system crashes. With kgdb, it crashes in generic_aio_complete_rw at aio.c:1135. The problem is that iocb->ctx is NULL while the code tries to access iocb->ctx->dead:

(gdb) p _iocb
$3 = (void *) 0xd12f0700
(gdb) p *(struct kiocb*)$3
$4 = {list = {next = 0xd12f3480, prev = 0xd12f2c00}, filp = 0x0, ctx = 0x0,
     user_obj = 0x0, user_data = 3076314848, pos = 387072, buf = 0,
     nr_transferred = 2048, size = 2048, this_size = 2048, key = 1002,
     cancel = 0, data = 0x0, users = 0, u = {tq = {list = {next = 0x0,
     prev = 0x0}, sync = 0, routine = 0, data = 0x0}, list = {next = 0x0,
     prev = 0x0}}, rlim_fsize = 4294967295}

It is easy to reproduce in my configuration. Do you have any idea where I should look to help you debug? Thanks, Gwendal.
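To help interpret the gdb dump above, here is a minimal, self-contained C model of the failure. It is illustrative only: the structs are pared down to the two fields that matter, complete_rw merely mimics the unconditional dereference at aio.c:1135, and none of this is the actual RHEL3 source. It assumes, consistent with the dump (filp = 0x0, ctx = 0x0), that the kiocb was already completed once and recycled with its fields cleared, so a second, spurious completion dereferences a NULL ctx. Running it segfaults on the second call by construction:

#include <stdio.h>
#include <stddef.h>

/* Pared-down stand-ins for the kernel structures (hypothetical). */
struct kioctx { int dead; };
struct kiocb  { struct kioctx *ctx; size_t nr_transferred; };

/* Mimics the crash site: ctx is dereferenced unconditionally. */
static void complete_rw(struct kiocb *iocb)
{
    if (iocb->ctx->dead)        /* faults when ctx == NULL */
        return;
    printf("completed %zu bytes\n", iocb->nr_transferred);
}

int main(void)
{
    struct kioctx ctx  = { .dead = 0 };
    struct kiocb  iocb = { .ctx = &ctx, .nr_transferred = 2048 };

    complete_rw(&iocb);         /* first, legitimate completion */

    iocb.ctx = NULL;            /* kiocb recycled, fields cleared,
                                   matching the gdb dump above */

    complete_rw(&iocb);         /* spurious second completion: SIGSEGV */
    return 0;
}

The fault signature matches what kgdb shows: a NULL pointer dereference inside the completion path.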
Hi, Gwendal,

As a start, you could enable slab debugging. Let me know what (if anything) that uncovers.

-Jeff

Reposting the prior update for inclusion in IT entry 55870.

For anyone still running into this problem, please try the debug kernel located here:

http://people.redhat.com/jmoyer/.bz119457/

Be sure to capture the kernel logs via netlog/netdump or a serial console. When next the machine panics, I'm interested in seeing these logs.

Is there anyone who is still seeing this problem?

Created attachment 122301 [details]
Add a missing return statement; without this, an I/O can be completed twice.
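To sketch what "missing return statement" means here (schematic only; the function and variable names below are made up, not the kernel's): on finding the page locked for I/O, the read path should park itself on the page wait queue and stop. Without the return, it falls through and completes the iocb immediately, and the later page-unlock wakeup completes the same iocb a second time. A toy program counting completions under both variants:

#include <stdio.h>
#include <stdbool.h>

static int completions;         /* times the same iocb was completed */

static void complete_read(void) { completions++; }

/* Toy read path. "page_locked" models a page still under write I/O;
 * "patched" models the one-line fix from attachment 122301. */
static void read_path(bool page_locked, bool patched)
{
    if (page_locked) {
        /* queue on the page wait queue; the unlock wakeup will
         * re-drive this read and complete it with real data */
        if (patched)
            return;             /* the missing return statement */
        /* unpatched: fall through and complete prematurely */
    }
    complete_read();
}

int main(void)
{
    completions = 0;
    read_path(true,  false);    /* read hits a locked page */
    read_path(false, false);    /* unlock wakeup re-drives the read */
    printf("unpatched: %d completions\n", completions);   /* 2 */

    completions = 0;
    read_path(true,  true);     /* patched: just waits */
    read_path(false, true);     /* wakeup completes it once */
    printf("patched:   %d completions\n", completions);   /* 1 */
    return 0;
}

The unpatched variant reports two completions for a single request, which is exactly the double completion that leads to the NULL ctx dereference described above.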
I'm currently testing this patch, and it's holding up quite well. I am going
to coordinate some more stress testing internally, but I'd like to see if the
customer(s) could give this a try.
Thanks.
Here is a description of the problem that the posted patch fixes. If a program issues a read for a page which is undergoing write I/O, you can trigger this problem. For example, a program which issues a number of writes to a file, followed very closely by a number of reads of the same portions of the same file, will trigger this bug. What ends up happening is that the AIO reads catch up with the writes, since everything is being cached. When the read path reaches a page which is still locked by the writer, it follows a code path that is seldom run. That code path tests whether the page is locked for I/O. If it is, the reader puts itself on the page wait queue (entries on the page wait queue are woken up when the page is unlocked). The errant code path then completes the I/O, even though it hasn't actually completed. Because of this, when the page is unlocked, we get called again with real data in the page, and the same I/O is completed a second time. This second completion then dereferences a NULL pointer in the io context structure.

Because of the racy nature of this problem, it's easier to reproduce over a networked file system than on a local file system. That isn't to say one cannot trigger the bug on a local file system; it's simply more difficult to hit the race in the local case.

A fix for this problem has just been committed to the RHEL3 U8 patch pool this evening (in kernel version 2.4.21-40.2.EL).

This issue is on Red Hat Engineering's list of planned work items for the upcoming Red Hat Enterprise Linux 3.8 release. Engineering resources have been assigned and, barring unforeseen circumstances, Red Hat intends to include this item in the 3.8 release.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0437.html
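For anyone who wants to try to hit the race (or verify a fixed kernel), here is a sketch of a reproducer along the lines described above, using the libaio userspace API. This is a guess at a trigger, not a known-good test case: the file name, request counts, and sizes are arbitrary, and because the window is narrow it would likely need to run in a loop, ideally against a networked file system:

#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NR 64               /* requests per burst (arbitrary) */
#define SZ 4096             /* one page per request */

int main(void)
{
    static char wbuf[SZ], rbuf[NR][SZ];
    struct iocb wcb[NR], rcb[NR], *cbp[NR];
    struct io_event ev[2 * NR];
    io_context_t ctx = 0;
    int fd, i;

    fd = open("aio-race-test", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0 || io_setup(2 * NR, &ctx) < 0) {
        perror("setup");
        return 1;
    }
    memset(wbuf, 0xab, sizeof(wbuf));

    /* A burst of AIO writes... */
    for (i = 0; i < NR; i++) {
        io_prep_pwrite(&wcb[i], fd, wbuf, SZ, (long long)i * SZ);
        cbp[i] = &wcb[i];
    }
    if (io_submit(ctx, NR, cbp) != NR) { perror("submit writes"); return 1; }

    /* ...followed immediately by AIO reads of the same regions, so the
     * reads catch up with pages still locked by the writer. */
    for (i = 0; i < NR; i++) {
        io_prep_pread(&rcb[i], fd, rbuf[i], SZ, (long long)i * SZ);
        cbp[i] = &rcb[i];
    }
    if (io_submit(ctx, NR, cbp) != NR) { perror("submit reads"); return 1; }

    /* Reap everything; on an affected kernel the box may panic before
     * this returns. */
    io_getevents(ctx, 2 * NR, 2 * NR, ev, NULL);
    io_destroy(ctx);
    close(fd);
    unlink("aio-race-test");
    return 0;
}

Build with gcc -O2 -o aio-race aio-race.c -laio and run it repeatedly; per the description above, the race is easier to hit over a networked file system than a local one.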