Bug 113080

Summary: panic in kmem_cache_free_one
Product: Red Hat Enterprise Linux 2.1
Component: kernel
Version: 2.1
Hardware: i386
OS: Linux
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Reporter: anand suvernkar <suvernkar>
Assignee: Arjan van de Ven <arjanv>
QA Contact: Brian Brock <bbrock>
CC: petrides, riel, tao
Doc Type: Bug Fix
Last Closed: 2006-02-21 19:00:39 UTC

Description anand suvernkar 2004-01-08 09:16:52 UTC

Description of problem:
My RHEL 3.0 machine has panicked many times in the
kmem_cache_free_one function. The stack trace is:
Pid/TGid: 0/0, comm:              swapper
EIP: 0060:[<c014febd>] CPU: 1
EIP is at kmem_cache_free_one [kernel] 0x3d (2.4.21-4.ELsmp)
 ESP: 009e:00000086 EFLAGS: 00010086    Tainted: P
EAX: 00000000 EBX: c37c5450 ECX: f512b000 EDX: 00000000
ESI: 01ea257c EDI: c37c5450 EBP: c376f000 DS: 0068 ES: 0068 FS: 0000
GS: 0000
CR0: 8005003b CR2: fcbb4608 CR3: 00101000 CR4: 000006f0
Call Trace:   [<c014f1f2>] free_block [kernel] 0x32 (0xc37c7cec)
[<c014ff9f>] __kmem_cache_free [kernel] 0x6f (0xc37c7d04)
[<c014f2d6>] kfree [kernel] 0x36 (0xc37c7d24)


Version-Release number of selected component (if applicable):

2.4.21-4.ELsmp kernel

How reproducible:
Rarely

Steps to Reproduce:
1. Copy the kcore image from /proc to some other directory.
2. Keep repeating the step above.
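
The reproduction steps above could be scripted roughly as follows. This is only a sketch: the function name copy_repeatedly, the kcore.copy file name, and the max_bytes cap are assumptions made for illustration and are not part of the original report.

```python
from pathlib import Path


def copy_repeatedly(source, dest_dir, iterations, max_bytes=None):
    """Copy `source` into `dest_dir` `iterations` times.

    Returns a list with the number of bytes copied on each pass.
    `max_bytes` (hypothetical knob) caps each copy, which is useful
    because /proc/kcore can appear very large.
    """
    dest = Path(dest_dir) / "kcore.copy"  # hypothetical destination name
    chunk_size = 1 << 20  # read 1 MiB at a time
    sizes = []
    for _ in range(iterations):
        copied = 0
        with open(source, "rb") as src, open(dest, "wb") as dst:
            while max_bytes is None or copied < max_bytes:
                want = chunk_size
                if max_bytes is not None:
                    want = min(chunk_size, max_bytes - copied)
                chunk = src.read(want)
                if not chunk:
                    break  # end of file reached
                dst.write(chunk)
                copied += len(chunk)
        sizes.append(copied)
    return sizes
```

On the affected machine this might be invoked as copy_repeatedly("/proc/kcore", "/tmp", iterations=1000), run as root. Each pass overwrites the previous copy, so only one copy occupies disk space at a time.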
  
Actual results:

panic

Expected results:

Normal operation

Additional info:

Comment 1 Arjan van de Ven 2004-01-08 09:18:11 UTC

Which modules are you using?
(You cut that part off the backtrace.)

Comment 2 anand suvernkar 2004-01-08 10:43:00 UTC

parport_pc lp parport netconsole autofs e100 floppy microcode 
my_module  keybdev mousedev hid input usb-ohci usbcore ext3 jbd
aic7xxx sd_mod scsi_mod

These are the modules I was using. my_module is my own module.

The complete stack trace is:

Pid/TGid: 0/0, comm:              swapper
EIP: 0060:[<c014febd>] CPU: 1
EIP is at kmem_cache_free_one [kernel] 0x3d (2.4.21-4.ELsmp)
 ESP: 009e:00000086 EFLAGS: 00010086    Tainted: P
EAX: 00000000 EBX: c37c5450 ECX: f512b000 EDX: 00000000
ESI: 01ea257c EDI: c37c5450 EBP: c376f000 DS: 0068 ES: 0068 FS: 0000
GS: 0000
CR0: 8005003b CR2: fcbb4608 CR3: 00101000 CR4: 000006f0
Call Trace:   [<c014f1f2>] free_block [kernel] 0x32 (0xc37c7cec)
[<c014ff9f>] __kmem_cache_free [kernel] 0x6f (0xc37c7d04)
[<c014f2d6>] kfree [kernel] 0x36 (0xc37c7d24)
[<f89b601a>] my_func0 [my_module] 0x5a (0xc37c7d4c)
[<f8af13ef>] my_func1 [my_module] 0xcf (0xc37c7d60)
[<c010d879>] handle_IRQ_event [kernel] 0x69 (0xc37c7d8c)
[<f8b5c8f5>] .rodata.str1.1 [my_module] 0x3c1d (0xc37c7da4)
[<f8a04365>] my_func2  [my_module] 0x2f5 (0xc37c7dc8)
[<f8b5c8ff>] .rodata.str1.1 [my_module] 0x3c27 (0xc37c7dd8)
[<f8e202e0>] my_func3  [my_module] 0x0 (0xc37c7de0)
[<f8e202e0>] my_func4  [my_module] 0x0 (0xc37c7dec)
[<f8b51e7b>] my_func5  [my_module] 0x1b (0xc37c7df4)
[<f8b5ddd1>] .rodata.str1.1 [my_module] 0x50f9 (0xc37c7e04)
[<f8a06b5e>] my_func6 [my_module] 0x7e (0xc37c7e34)
[<f89bc6d2>] my_func7 [my_module] 0x742 (0xc37c7e54)
[<f8b5b532>] .rodata.str1.1 [my_module] 0x285a (0xc37c7e64)
[<f8a20d64>] my_func8  [my_module] 0x364 (0xc37c7e8c)
[<f8b5ddd1>] .rodata.str1.1 [my_module] 0x50f9 (0xc37c7e94)
[<f8b5de76>] .rodata.str1.1 [my_module] 0x519e (0xc37c7e9c)
[<f962caa0>] my_func9 [my_module] 0x0 (0xc37c7eac)
[<f8e202e0>] my_func10 [my_module] 0x0 (0xc37c7ef4)
[<f8a209ce>] my_func11 [my_module] 0x8e (0xc37c7f08)
[<f8e202c0>] my_func12 [my_module] 0x0 (0xc37c7f1c)
[<c012ed17>] tasklet_action [kernel] 0x67 (0xc37c7f20)
[<c012eb75>] do_softirq [kernel] 0xd5 (0xc37c7f34)
[<c010db48>] do_IRQ [kernel] 0x148 (0xc37c7f50)
[<c010da00>] do_IRQ [kernel] 0x0 (0xc37c7f74)
[<c0109100>] default_idle [kernel] 0x0 (0xc37c7f7c)
[<c0109100>] default_idle [kernel] 0x0 (0xc37c7f90)
[<c0109129>] default_idle [kernel] 0x29 (0xc37c7fa4)
[<c01091c2>] cpu_idle [kernel] 0x42 (0xc37c7fb0)
[<c0128573>] call_console_drivers [kernel] 0x63 (0xc37c7fc4)
[<c0128893>] printk [kernel] 0x143 (0xc37c7ffc)
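
As an aid to reading a trace like the one above, the frames can be grouped by the bracketed owner ([kernel] or a module name) to find the first caller outside the core kernel. The sketch below assumes the line layout shown in this report; the helper names parse_frames and first_non_kernel_frame are made up for illustration. Finding such a frame does not by itself prove where the fault lies.

```python
import re

# Matches lines such as:
#   [<c014f2d6>] kfree [kernel] 0x36 (0xc37c7d24)
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\]\s+"   # return address
    r"(?P<symbol>\S+)\s+"               # symbol name
    r"\[(?P<owner>[^\]]+)\]\s+"         # [kernel] or [module name]
    r"0x(?P<offset>[0-9a-f]+)"          # offset into the symbol
)


def parse_frames(trace_text):
    """Return (symbol, owner, offset) tuples in the order they appear."""
    return [
        (m["symbol"], m["owner"], int(m["offset"], 16))
        for m in FRAME_RE.finditer(trace_text)
    ]


def first_non_kernel_frame(frames):
    """Return the first frame not owned by the core kernel, or None."""
    for frame in frames:
        if frame[1] != "kernel":
            return frame
    return None
```

Applied to the trace in this comment, the first three frames (free_block, __kmem_cache_free, kfree) belong to the kernel, and the first non-kernel frame is my_func0 in my_module.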

I don't know how relevant this is, but I think I should mention that
this panic occurs particularly when my module has hung. I create two
kernel threads, and one of them prints messages with printk almost
continuously (at an interval of 100 ms). For some strange reason, the
other thread is not able to print anything with printk. It simply keeps
waiting in the serial_console_write function. Only when I terminate
the first thread does the second thread succeed in printing its
message. Until then it just hangs.

I don't know whether this problem can somehow cause the panic
in kmem_cache_free_one.

Thanks for the instant reply. I never expected to get one so fast.
I am really looking forward to a solution for this problem.

Thanks
Anand

Comment 3 Arjan van de Ven 2004-01-08 10:47:34 UTC

Sounds really like a bug in your module... and not our bug.
Is the source of the module available somewhere?

Comment 4 anand suvernkar 2004-01-08 11:44:39 UTC

Hi Arjan,

I am really sorry to say that I might not be able to show you the
source code, as that is against company policy.
Under these circumstances, I would be really glad if you could
simply point out the possible reason.

About the hang in my module, the stack trace for the hung thread is:

  [<c01c6920>] serial_console_write [kernel] 0x0 (0xedf0ddc4)
[<c01c0739>] serial_in [kernel] 0x19 (0xedf0dde8)
[<c01c69a0>] serial_console_write [kernel] 0x80 (0xedf0ddf8)
[<c0128473>] __call_console_drivers [kernel] 0x63 (0xedf0de20)
[<c0128596>] call_console_drivers [kernel] 0x86 (0xedf0de3c)
[<c0128975>] release_console_sem [kernel] 0x55 (0xedf0de60)
[<c0128893>] printk [kernel] 0x143 (0xedf0de74)

The thread simply keeps waiting in serial_console_write while the other
thread keeps printing messages.
The job of these two threads is to receive data from the network.

Logically, there is only a very remote possibility of any relation
between these two issues. But the panics happen most of the time
only when there is a hang in my module.

If you think they are not related, can you explain the possible
reason behind the second hang? I think this has nothing to do with
my module. It seems like a pure race between two printk calls that are
printing on the serial console.

Thanks again for the invaluable help
Anand

Comment 5 Arjan van de Ven 2004-01-08 11:48:34 UTC

I'm sorry if I gave the impression that I give free debugging help for
proprietary kernel modules; I do not.

*** This bug has been marked as a duplicate of 78616 ***

Comment 6 Red Hat Bugzilla 2006-02-21 19:00:39 UTC

Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.