Bug 244242

Summary:	Kernel oops resulting in segfault
Product:	[Fedora] Fedora	Reporter:	Christopher Beland <beland>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED INSUFFICIENT_DATA	QA Contact:	Brian Brock <bbrock>
Severity:	low	Docs Contact:
Priority:	low
Version:	6	CC:	esandeen, jonstanley
Target Milestone:	---
Target Release:	---
Hardware:	i686
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2008-02-08 04:25:05 UTC	Type:	---
Bug Blocks:	427887

Description Christopher Beland 2007-06-14 17:42:15 UTC

I was running a Perl script that hasn't given me any trouble before or since. It
segfaulted and I got the following kernel oops message.

kernel-2.6.20-1.2952.fc6

---

Oops: 0002 [#1]
SMP 
CPU:    0
EIP:    0060:[<ee86a321>]    Not tainted VLI
EFLAGS: 00210202   (2.6.20-1.2952.fc6 #1)
EIP is at journal_grab_journal_head+0x26/0x3e [jbd]
eax: 33206698   ebx: c0007be0   ecx: 00000000   edx: c0007be0
esi: 000280d2   edi: c1427db8   ebp: c0704c80   esp: c48bad28
ds: 007b   es: 007b   ss: 0068
Process perl (pid: 3695, ti=c48ba000 task=d1ecb330 task.ti=c48ba000)
Stack: ee865ef7 00000027 00200246 e8c25898 edd45614 c1215890 ed79c000 c0007be0 
       ee92b502 000280d2 00000001 c0704c80 c0458428 c1427db8 ca748ed4 c045d9ba 
       c48bae4c 00000144 c48baea0 00000009 00000000 00000000 00000020 00000005 
Call Trace:
 [<ee865ef7>] journal_try_to_free_buffers+0x5e/0x13e [jbd]
 [<ee92b502>] ext3_releasepage+0x0/0x7b [ext3]
 [<c0458428>] try_to_release_page+0x30/0x42
 [<c045d9ba>] shrink_inactive_list+0x44f/0x6c9
 [<c045d0b9>] isolate_lru_pages+0x64/0x7d
 [<c045d406>] shrink_active_list+0x334/0x33c
 [<c045dcf1>] shrink_zone+0xbd/0xe2
 [<c045e632>] try_to_free_pages+0x140/0x22e
 [<c045a93c>] __alloc_pages+0x1a8/0x2aa
 [<c04ed250>] copy_from_user+0x3a/0x66
 [<c04614f7>] __handle_mm_fault+0x3e2/0x8ba
 [<c05b7bab>] sys_setsockopt+0x6d/0xa7
 [<c0621eda>] do_page_fault+0x216/0x4da
 [<c0621cc4>] do_page_fault+0x0/0x4da
 [<c062092c>] error_code+0x7c/0x84
 =======================
Code: 1c 5b 5e 5f c3 89 c2 eb 0b f3 90 8b 02 a9 00 00 20 00 75 f5 90 0f ba 2a 15
19 c0 85 c0 75 ec 8b 02 31 c9 f6 c4 40 74 06 8b 4a 24 <ff> 41 04 8b 02 a9 00 00
20 00 75 04 0f 0b eb fe 90 0f ba 32 15 
EIP: [<ee86a321>] journal_grab_journal_head+0x26/0x3e [jbd] SS:ESP 0068:c48bad28

Comment 1 Chuck Ebbert 2007-06-14 18:10:55 UTC

The first few lines of the oops message are missing.

Comment 2 Christopher Beland 2007-06-14 18:28:51 UTC

The oops above was all that was printed on my terminals, but the following was
also in /var/log/messages:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
ee86a321
*pde = 24312067
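
A note for readers on the "Oops: 0002" header line: on x86 the number is the page-fault error code, and its low three bits say whether the page was present, whether the access was a write, and whether the CPU was in user mode. The sketch below decodes only those three bits; the struct and function names are made up for illustration and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdio.h>

/* Low three bits of the x86 page-fault error code (names are illustrative). */
struct pf_error {
    bool protection; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;      /* bit 1: 1 = write access, 0 = read access */
    bool user;       /* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e = {
        .protection = (code & 0x1) != 0,
        .write      = (code & 0x2) != 0,
        .user       = (code & 0x4) != 0,
    };
    return e;
}

/* Print a one-line, human-readable summary of the error code. */
void print_pf_error(unsigned long code)
{
    struct pf_error e = decode_pf_error(code);
    printf("%04lx: %s, %s, %s mode\n", code,
           e.protection ? "protection violation" : "page not present",
           e.write ? "write" : "read",
           e.user ? "user" : "kernel");
}
```

For the value in this report, decode_pf_error(0x0002) gives a write to a not-present page while in kernel mode, which is consistent with the "NULL pointer dereference at virtual address 00000004" message.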

Comment 3 Chuck Ebbert 2007-06-14 19:08:17 UTC

struct journal_head *journal_grab_journal_head(struct buffer_head *bh)
{
        struct journal_head *jh = NULL;

        jbd_lock_bh_journal_head(bh);
        if (buffer_jbd(bh)) {
                jh = bh2jh(bh);
 jh==0 =====>   jh->b_jcount++;
        }
        jbd_unlock_bh_journal_head(bh);
        return jh;
}


Please run fsck on the filesystem.
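
The jh==0 annotation can be cross-checked against the oops. The faulting bytes in the Code: line, <ff> 41 04, appear to decode to incl 0x4(%ecx), the register dump shows ecx: 00000000, and the reported fault address is 00000004. That matches a NULL jh if b_jcount sits 4 bytes into struct journal_head. The sketch below models that layout; it assumes the first two fields are a buffer_head pointer followed by the int counter, and it narrows the pointer to 32 bits to stand in for i686. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the first two fields of struct journal_head.
 * The pointer is narrowed to 32 bits to model i686; this is an assumption
 * about layout, not a copy of the kernel header.
 */
struct journal_head_i686 {
    uint32_t b_bh;     /* stands in for struct buffer_head * (4 bytes on i686) */
    int32_t  b_jcount; /* the field incremented at the faulting line */
};

/* Address the CPU would touch for jh->b_jcount, given the value of jh. */
uint32_t b_jcount_address(uint32_t jh)
{
    return jh + (uint32_t)offsetof(struct journal_head_i686, b_jcount);
}
```

Under those assumptions, b_jcount_address(0) is 0x00000004, the address in the BUG: line.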

Comment 4 Christopher Beland 2007-06-14 19:27:40 UTC

As it happens, I just finished doing that, and there were some problems which
have now been fixed.

Comment 5 Chuck Ebbert 2007-06-20 16:13:37 UTC

So, apparently an oops caused by a corrupt ext3 filesystem.
Adding esandeen to the CC list...

Comment 6 Eric Sandeen 2007-06-20 16:25:46 UTC

Do you still have the output from e2fsck?  Some indication of what was wrong
would be helpful.

Comment 7 Eric Sandeen 2007-06-20 16:31:52 UTC

Guess I should look at the locking around where we set/clear buffer_jbd and
where b_private is set/cleared...

        if (buffer_jbd(bh)) {
                jh = bh2jh(bh);
 jh==0 =====>   jh->b_jcount++;

If buffer_jbd() is true then BH_JBD is set, and b_private should be set as
well, since that is what bh2jh() reads... hmm.
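
The invariant described here (the JBD flag set implies the buffer head's private pointer, b_private in struct buffer_head, is non-NULL) can be written down as a small user-space mock. Everything below is hypothetical: the mock_ types, the flag bit position, and the extra NULL check are for illustration only, locking is omitted, and this is not a proposed kernel patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; the real BH_JBD bit number is not reproduced here. */
enum { MOCK_BH_JBD = 1u << 0 };

struct mock_journal_head {
    int b_jcount;            /* reference count, as in the snippet above */
};

struct mock_buffer_head {
    unsigned long b_state;   /* flag bits */
    void *b_private;         /* journal head when the JBD flag is set */
};

bool mock_buffer_jbd(const struct mock_buffer_head *bh)
{
    return (bh->b_state & MOCK_BH_JBD) != 0;
}

/* The invariant from the comment: flag set implies b_private is non-NULL. */
bool mock_jbd_state_consistent(const struct mock_buffer_head *bh)
{
    return !mock_buffer_jbd(bh) || bh->b_private != NULL;
}

/* Same shape as journal_grab_journal_head(), minus locking, plus a NULL check. */
struct mock_journal_head *mock_grab_journal_head(struct mock_buffer_head *bh)
{
    struct mock_journal_head *jh = NULL;

    if (mock_buffer_jbd(bh)) {
        jh = bh->b_private;
        if (jh != NULL)      /* the state seen in this oops would skip the increment */
            jh->b_jcount++;
    }
    return jh;
}
```

A buffer head with the flag set and b_private NULL fails mock_jbd_state_consistent(), which is the state the oops implies. Whether that state came from a locking race or from on-disk corruption is exactly the open question in this bug.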

Comment 8 Christopher Beland 2007-06-20 16:51:48 UTC

Sorry, I don't have e2fsck output because I ran it in single-user mode with the
filesystem (which is the root partition) unmounted.

Comment 9 Jon Stanley 2008-01-08 01:52:15 UTC

(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt
to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself on this bug; however, this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer exists, please close it; otherwise I'll do so in
a few days if no further information is lodged.

Thanks for using Fedora!

Comment 10 Christopher Beland 2008-01-08 02:20:49 UTC

Unfortunately, no, I don't have the fsck output.

Comment 11 Jon Stanley 2008-02-08 04:25:05 UTC

Per the previous comment in this bug, I am closing it as INSUFFICIENT_DATA,
since no information has been lodged for over 30 days.

Please re-open this bug or file a new one if you can provide the requested data,
and thanks for filing the original report!