Bug 228933

Summary: unable to handle kernel paging request in sunrpc::cache_clean
Product: [Fedora] Fedora
Reporter: Pawel Salek <pawsa>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 6
CC: jonstanley, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-02-08 04:27:24 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 427887    

Description Pawel Salek 2007-02-15 22:29:28 UTC
Description of problem:
The kernel oopses after the NFS services are shut down and the exported volume
is unmounted.

Version-Release number of selected component (if applicable):
2.6.19-1.2895.fc6

How reproducible:
I managed to capture the crash data only once, but the kernel is very unstable
under load when NFSv4 is in use (it then hangs completely). The crash reported
here, however, happened when only NFSv3 exports were active. Stability with
NFSv3 is otherwise fairly satisfactory.

Steps to Reproduce:
1. service nfs stop
2. umount /exports
3. Wait a few seconds and you may be (un)lucky.
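
The steps above can be scripted roughly as follows. This is only a sketch: the
two commands are the ones listed in the steps, while the function name, the
`dry_run` flag and the settle delay are illustrative assumptions, not part of
the original report. Doing anything real requires root, so the default is to
print the commands instead of running them.

```python
import subprocess
import time

# Commands taken from the steps above; everything else here is illustrative.
STEPS = [
    ["service", "nfs", "stop"],
    ["umount", "/exports"],
]


def run_repro(dry_run=True, settle_seconds=5):
    """Run (or just list) the reproduction steps and return the commands used."""
    executed = []
    for cmd in STEPS:
        line = " ".join(cmd)
        executed.append(line)
        if dry_run:
            print("would run:", line)
        else:
            subprocess.run(cmd, check=True)  # needs root
    if not dry_run:
        # Step 3: the oops, if it comes at all, shows up a few seconds later.
        time.sleep(settle_seconds)
    return executed


if __name__ == "__main__":
    run_repro(dry_run=True)
```

With `dry_run=False` the script would then be followed by a look at the kernel
log for the oops; whether it triggers on a given run is a matter of luck.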
  
Actual results:
kernel: BUG: unable to handle kernel paging request at virtual address 1400041c
kernel:  printing eip:
kernel: f0b9d38d
kernel: *pde = 00000000
kernel: Oops: 0000 [#1]
kernel: SMP 
kernel: last sysfs file: /block/dm-1/range
kernel: Modules linked in: nfsd exportfs lockd nfs_acl sunrpc ipv6 ipt_REJECT
xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables
video sbs i2c_ec button battery asus_acpi ac lp floppy sg 3c59x scb2_flash
mtdcore chipreg map_funcs tg3 mii i2c_piix4 i2c_core ide_cd cdrom pcspkr
parport_pc parport serio_raw dm_snapshot dm_zero dm_mirror dm_mod mptspi
mptscsih mptbase scsi_transport_spi sd_mod scsi_mod raid456 xor ext3 jbd
ehci_hcd ohci_hcd uhci_hcd
kernel: CPU:    1
kernel: EIP:    0060:[<f0b9d38d>]    Not tainted VLI
kernel: EFLAGS: 00010206   (2.6.19-1.2895.fc6 #1)
kernel: EIP is at cache_clean+0xb5/0x194 [sunrpc]
kernel: eax: 00390281   ebx: 14000418   ecx: f0c21c80   edx: f0c21c80
kernel: esi: c17a4800   edi: efd81240   ebp: 00000282   esp: c17d9f58
kernel: ds: 007b   es: 007b   ss: 0068
kernel: Process events/1 (pid: 9, ti=c17d9000 task=eff45630 task.ti=c17d9000)
kernel: Stack: f0bb4980 f0bb4984 f0b9dc88 c04368c7 00000282 efd81240 efd81260 f0b9dc83
kernel:        00000000 efd81260 efd81240 efd81258 00000000 c0437284 00000001 00000000
kernel:        00000001 00010000 00000000 00000000 eff45630 c04215c7 00100100 00200200
kernel: Call Trace:
kernel:  [<f0b9dc88>] do_cache_clean+0x5/0x2e [sunrpc]
kernel:  [<c04368c7>] run_workqueue+0x97/0xdd
kernel:  [<c0437284>] worker_thread+0xd9/0x10d
kernel:  [<c0439810>] kthread+0xc0/0xec
kernel:  [<c0404c03>] kernel_thread_helper+0x7/0x10
kernel:  =======================
kernel: Code: 8d f6 00 00 00 8d 41 0c e8 5b 87 a8 cf a1 e4 67 bb f0 8d 34 85 00 00 00 00 a1 e0 67 bb f0 03 70 08 8b 1e eb 47 8b 15 e0 67 bb f0 <8b> 43 04 39 42 50 7e 04 40 89 42 50 8b 43 04 3b 05 00 80 85 c0
kernel: EIP: [<f0b9d38d>] cache_clean+0xb5/0x194 [sunrpc] SS:ESP 0068:c17d9f58

This happens on an IBM eServer, 2.4 GHz Xeon with HT enabled.
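
For anyone reading the oops: the bytes at the faulting position in the Code:
line (`<8b> 43 04`) can be decoded by hand, and the result lines up with the
register dump. The sketch below handles only the one x86 encoding needed here
(opcode 0x8b with a register base and an 8-bit displacement); the helper name
is made up for illustration. It suggests the fault was a load through a bad
pointer held in ebx, but it does not say which structure or field was being
read.

```python
# Values copied from the oops above.
FAULT_ADDR = 0x1400041C                 # "virtual address 1400041c"
EBX = 0x14000418                        # "ebx: 14000418"
FAULT_BYTES = bytes.fromhex("8b4304")   # "<8b> 43 04" in the Code: line

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp8(insn):
    """Decode 'mov r32, [r32 + disp8]' (opcode 0x8b, ModRM mod=01, no SIB)."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    if opcode != 0x8B or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("not a simple mov r32, [reg+disp8]")
    dest = REGS[(modrm >> 3) & 7]
    base = REGS[modrm & 7]
    # The 8-bit displacement is sign-extended.
    offset = disp - 256 if disp >= 128 else disp
    return dest, base, offset


dest, base, offset = decode_mov_disp8(FAULT_BYTES)
print(f"mov {offset:#x}(%{base}),%{dest}")   # AT&T syntax, as objdump shows it
print(hex(EBX + offset), hex(FAULT_ADDR))
```

The decoded load is from ebx + 4, and 0x14000418 + 4 is exactly the reported
fault address 0x1400041c, so the value in ebx itself looks like the corrupt
pointer.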

Comment 1 Pawel Salek 2007-02-19 09:01:03 UTC
This is still present in 2.6.19-1.2911.fc6. This time, it came entirely
unprovoked in the middle of the night!
BUG: unable to handle kernel paging request at virtual address 51c503f6
 printing eip:
f0b973d9
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:01.0/irq
Modules linked in: nfsd exportfs lockd nfs_acl sunrpc ipv6 ipt_REJECT xt_state
ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables video sbs
i2c_ec button battery asus_acpi ac lp sg scb2_flash floppy mtdcore chipreg
map_funcs 3c59x i2c_piix4 mii pcspkr i2c_core tg3 parport_pc parport serio_raw
ide_cd cdrom dm_snapshot dm_zero dm_mirror dm_mod mptspi mptscsih mptbase
scsi_transport_spi sd_mod scsi_mod raid456 xor ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    1
EIP:    0060:[<f0b973d9>]    Not tainted VLI
EFLAGS: 00010202   (2.6.19-1.2911.fc6 #1)
EIP is at cache_clean+0xb5/0x194 [sunrpc]
eax: ffffffff   ebx: 51c503f2   ecx: f0bae840   edx: f0bae840
esi: cce9dc80   edi: efd811c0   ebp: 00000282   esp: c17d9f58
ds: 007b   es: 007b   ss: 0068
Process events/1 (pid: 9, ti=c17d9000 task=eff45630 task.ti=c17d9000)
Stack: f0baea00 f0baea04 f0b97cd4 c043692f 00000282 efd811c0 efd811e0 f0b97ccf
       00000000 efd811e0 efd811c0 efd811d8 00000000 c04372ec 00000001 00000000
       00000001 00010000 00000000 00000000 eff45630 c04215f1 00100100 00200200
Call Trace:
 [<f0b97cd4>] do_cache_clean+0x5/0x2e [sunrpc]
 [<c043692f>] run_workqueue+0x97/0xdd
 [<c04372ec>] worker_thread+0xd9/0x10d
 [<c0439878>] kthread+0xc0/0xec
 [<c0404c03>] kernel_thread_helper+0x7/0x10
 =======================
Code: 8d f6 00 00 00 8d 41 0c e8 37 e5 a8 cf a1 64 08 bb f0 8d 34 85 00 00 00 00
a1 60 08 bb f0 03 70 08 8b 1e eb 47 8b 15 60 08 bb f0 <8b> 43 04 39 42 50 7e 04
40 89 42 50 8b 43 04 3b 05 00 60 85 c0
EIP: [<f0b973d9>] cache_clean+0xb5/0x194 [sunrpc] SS:ESP 0068:c17d9f58
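
To check whether this is the same crash site as in the original description,
the two oopses can be compared mechanically. The sketch below uses only values
copied from the two dumps in this report; the regular expression and the
helper name are illustrative assumptions. It compares the symbol and offset of
the faulting EIP and the distance between ebx and the fault address.

```python
import re

# Values copied from the two oopses in this report.
OOPSES = [
    {"kernel": "2.6.19-1.2895.fc6", "eip": 0xF0B9D38D, "ebx": 0x14000418,
     "fault": 0x1400041C, "where": "cache_clean+0xb5/0x194 [sunrpc]"},
    {"kernel": "2.6.19-1.2911.fc6", "eip": 0xF0B973D9, "ebx": 0x51C503F2,
     "fault": 0x51C503F6, "where": "cache_clean+0xb5/0x194 [sunrpc]"},
]

# symbol+0xOFFSET/0xSIZE [module], as printed on the "EIP is at" line.
WHERE_RE = re.compile(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?")


def summarize(oops):
    """Split the 'EIP is at' text and derive the function start address."""
    match = WHERE_RE.fullmatch(oops["where"])
    if match is None:
        raise ValueError("unrecognized location: " + oops["where"])
    symbol, offset, size, module = match.groups()
    offset = int(offset, 16)
    return {
        "symbol": symbol,
        "module": module,
        "offset": offset,
        "size": int(size, 16),
        "func_start": oops["eip"] - offset,
        "fault_minus_ebx": oops["fault"] - oops["ebx"],
    }


summaries = [summarize(o) for o in OOPSES]
for oops, s in zip(OOPSES, summaries):
    print(oops["kernel"], s["symbol"], hex(s["offset"]),
          hex(s["func_start"]), s["fault_minus_ebx"])
```

Both dumps fault at cache_clean+0xb5 in sunrpc with the fault address 4 bytes
past ebx; only the module load address and the garbage pointer value differ,
which is consistent with the same bug hitting twice.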

What can I do to help debug it?

Comment 2 Pawel Salek 2007-02-19 18:24:46 UTC
An apparently related report:
http://lkml.org/lkml/2007/1/15/57

Comment 3 Pawel Salek 2007-02-19 18:26:31 UTC
An apparently related report:
http://lkml.org/lkml/2007/1/15/57
Now that I think about it, my exported ext3 file system was mounted with the
"acl" option. Does that matter?

Comment 4 Pawel Salek 2007-08-16 09:56:26 UTC
FWIW, I haven't seen that one recently with 2.6.20-1.2962.fc6.

Comment 5 Jon Stanley 2008-01-08 01:49:24 UTC
(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt
to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself on this bug; however, this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer exists, please close it; if no further
information is lodged, I'll do so in a few days.

Thanks for using Fedora!

Comment 6 Jon Stanley 2008-02-08 04:27:24 UTC
Per the previous comment in this bug, I am closing it as INSUFFICIENT_DATA,
since no information has been lodged for over 30 days.

Please re-open this bug or file a new one if you can provide the requested data,
and thanks for filing the original report!