129510 – cat /proc/cluster/services with 100 filesystems: kernel Oops

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	gfs
Sub Component:
Version:	4
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Christine Caulfield
QA Contact:	GFS Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-08-09 20:30 UTC by Derek Anderson
Modified:	2015-12-17 16:17 UTC
CC List:	1 user
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2004-08-30 19:23:52 UTC
Embargoed:

Description Derek Anderson 2004-08-09 20:30:03 UTC

Description of problem:
Set up a 2-node cluster, then create and mount 100 filesystems on each
node. Run 'cat /proc/cluster/services' on one of the nodes and these
messages appear in the log file (the command also segfaults at random):

proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!

The output of the command also displays only the first 50 of the 100
DLM Lock Spaces that were instantiated.
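
The "Apparent buffer overflow" message comes from the legacy procfs read path, which hands the handler a single page-sized buffer and complains when the handler reports more data than that buffer can hold. The following is a minimal user-space sketch of that failure mode, not the cman code: the 4096-byte page size, the function name format_services and the line layout are all assumptions made for illustration. It formats entries into a fixed buffer with a bounds check, so a long listing is cut short instead of running past the end.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PAGE_BYTES 4096 /* assumed size of the one buffer the handler gets */

/*
 * Format `count` made-up service lines into buf without writing past cap.
 * Returns how many whole lines fit; *needed receives the number of bytes
 * a complete listing would have required. The line layout is invented and
 * does not reproduce the real /proc/cluster/services format.
 */
size_t format_services(char *buf, size_t cap, int count, size_t *needed)
{
    size_t used = 0, total = 0, fitted = 0;
    int full = 0;

    for (int i = 0; i < count; i++) {
        char name[32], line[128];
        snprintf(name, sizeof name, "gfs%d", i);
        int n = snprintf(line, sizeof line, "%-20s %-32s %5d\n",
                         "DLM Lock Space:", name, i + 1);
        if (n < 0 || (size_t)n >= sizeof line)
            break;                      /* formatting error or oversized line */
        total += (size_t)n;
        if (!full && used + (size_t)n + 1 <= cap) {
            memcpy(buf + used, line, (size_t)n);
            used += (size_t)n;
            fitted++;
        } else {
            full = 1;                   /* stop copying, keep counting bytes */
        }
    }
    if (cap > 0)
        buf[used] = '\0';
    *needed = total;
    return fitted;
}
```

Calling format_services(page, PAGE_BYTES, 100, &needed) with a PAGE_BYTES-sized array returns fewer than 100 entries and sets needed above PAGE_BYTES, which is the truncated-listing symptom. A handler that skipped the bounds check would write past the page instead, and that may be what happened here.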

Ran the cat command a few more times just to be a bastard and the
kernel Oopsed:
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
proc_file_read: Apparent buffer overflow!
Unable to handle kernel paging request at virtual address 31347366
 printing eip:
e030b6b7
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: loop gnbd lock_gulm lock_nolock lock_dlm dlm cman
gfs lock_harness ipv6 parport_pc lp parport autofs4 sunrpc e1000
floppy sg microcode dm_mod uhci_hcd ehci_hcd button battery asus_acpi
ac ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
CPU:    0
EIP:    0060:[<e030b6b7>]    Not tainted
EFLAGS: 00010282   (2.6.7)
EIP is at search_bucket+0x17/0x70 [dlm]
eax: df2a6000   ebx: 31347366   ecx: 00000018   edx: 0000017e
esi: df765738   edi: 00000002   ebp: 00000018   esp: db25fdb0
ds: 007b   es: 007b   ss: 0068
Process dlm_recvd (pid: 4509, threadinfo=db25e000 task=dac90b30)
Stack: e031cbc3 00000000 d4070d7d 00000002 df765738 00000002 00000018 e030b742
       0000017e 00000246 35736667 206e7520 d4070d7d 00000002 d4070d08 df765738
       00000000 e031b08a 00000018 00000246 20202020 0a226131 ff000000 d4070d08
Call Trace:
 [<e030b742>] dlm_dir_remove+0x32/0xf0 [dlm]
 [<e031b08a>] _release_rsb+0x11a/0x2a0 [dlm]
 [<e030d2a1>] dlm_unlock_stage2+0xd1/0x1a0 [dlm]
 [<e030f8d3>] process_cluster_request+0x243/0xd30 [dlm]
 [<c02b0e98>] inet_recvmsg+0x48/0x70
 [<c026bd0c>] sock_recvmsg+0xbc/0xc0
 [<e0313a53>] midcomms_process_incoming_buffer+0x173/0x250 [dlm]
 [<c0136af3>] __alloc_pages+0x2f3/0x340
 [<c026bd0c>] sock_recvmsg+0xbc/0xc0
 [<e0311721>] receive_from_sock+0x141/0x300 [dlm]
 [<c0117e67>] recalc_task_prio+0x97/0x190
 [<e031261b>] process_sockets+0x7b/0xa0 [dlm]
 [<e031288e>] dlm_recvd+0x9e/0xf0 [dlm]
 [<e03127f0>] dlm_recvd+0x0/0xf0 [dlm]
 [<c010429d>] kernel_thread_helper+0x5/0x18

Code: 8b 0b 89 0c 24 0f 18 01 90 8d 04 d0 39 c3 74 23 89 44 24 04
<1>Unable to handle kernel paging request at virtual address 36312038
 printing eip:
c013b7c2
*pde = 00000000
Oops: 0000 [#2]
Modules linked in: loop gnbd lock_gulm lock_nolock lock_dlm dlm cman
gfs lock_harness ipv6 parport_pc lp parport autofs4 sunrpc e1000
floppy sg microcode dm_mod uhci_hcd ehci_hcd button battery asus_acpi
ac ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c013b7c2>]    Not tainted
EFLAGS: 00010203   (2.6.7)
EIP is at put_page+0x2/0x90
eax: 36312038   ebx: 00000001   ecx: df2a7880   edx: 36312038
esi: ddf32c80   edi: 00000000   ebp: df494680   esp: db23ddf8
ds: 007b   es: 007b   ss: 0068
Process cman_comms (pid: 4412, threadinfo=db23c000 task=dac905b0)
Stack: c026f39f ddf32c80 db23df6c c026f3c8 00000000 c026f483 ddf32c80 ddf32c80
       db23df6c ddf32c80 ddf32c80 c02aa1ca 00000018 db23de48 00000018 ddf32c90
       db23df4c df4947ac 00000018 00000040 c0334140 db23df6c db23df6c c02b0e98
Call Trace:
 [<c026f39f>] skb_release_data+0x6f/0x90
 [<c026f3c8>] kfree_skbmem+0x8/0x20
 [<c026f483>] __kfree_skb+0xa3/0x140
 [<c02aa1ca>] udp_recvmsg+0x20a/0x290
 [<c02b0e98>] inet_recvmsg+0x48/0x70
 [<c026bd0c>] sock_recvmsg+0xbc/0xc0
 [<c0118897>] __wake_up_common+0x37/0x70
 [<c026ed5c>] sock_def_readable+0x5c/0x60
 [<e02ea9dd>] send_to_userport+0x1d/0x520 [cman]
 [<e02ea155>] receive_message+0x85/0xf0 [cman]
 [<e02ea319>] cluster_kthread+0x159/0x2d0 [cman]
 [<c0105c12>] ret_from_fork+0x6/0x14
 [<c0118850>] default_wake_function+0x0/0x10
 [<e02ea1c0>] cluster_kthread+0x0/0x2d0 [cman]
 [<c010429d>] kernel_thread_helper+0x5/0x18

Code: 8b 00 a9 00 00 08 00 75 47 8b 02 f6 c4 08 75 2e 8b 02 89 d1
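
The faulting addresses in both oopses (31347366 and 36312038) and several stack words (35736667, 206e7520, 20202020, 0a226131) look like printable text rather than pointers. That would be consistent with listing output having been written over unrelated kernel memory, although the report itself does not confirm this reading. The sketch below is a small user-space helper for checking it; the names word_to_ascii and dump_words are made up for illustration. It renders a 32-bit word as the four bytes it occupies in memory on a little-endian machine such as the i386 box in the trace.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Render the four bytes of a 32-bit word, lowest byte first (the order
 * they sit in memory on little-endian x86). Non-printable bytes become
 * '.'. out must have room for 5 chars including the terminating NUL.
 */
void word_to_ascii(uint32_t word, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)((word >> (8 * i)) & 0xff);
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[4] = '\0';
}

/* Print each word next to its text rendering, one per line. */
void dump_words(const uint32_t *words, size_t n)
{
    char text[5];
    for (size_t i = 0; i < n; i++) {
        word_to_ascii(words[i], text);
        printf("%08x  \"%s\"\n", (unsigned)words[i], text);
    }
}
```

With this helper, 0x31347366 renders as "fs41", 0x35736667 as "gfs5" and 0x36312038 as "8 16". These read like fragments of filesystem names and column values.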

Comment 1 Christine Caulfield 2004-08-23 12:42:12 UTC

Now uses seq_file for /proc/cluster/services.
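
For context, seq_file replaces the fill-one-buffer handler with an iterator: records are produced one at a time, and the kernel takes care of buffering and of resuming across successive read() calls. The code below is a user-space model of that idea only. It is neither the kernel seq_file API nor the actual proc.c change, and struct listing, show_record and listing_read are hypothetical names. Each call emits as many whole records as fit and remembers where to resume.

```c
#include <stddef.h>
#include <stdio.h>

struct listing {
    int count; /* total number of records */
    int pos;   /* next record to emit; persists across reads */
};

/* Format record i into buf. Returns its length, or -1 if it does not fit. */
static int show_record(int i, char *buf, size_t cap)
{
    int n = snprintf(buf, cap, "service %d\n", i);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/*
 * Fill buf with whole records starting at it->pos. Returns the number of
 * bytes produced; 0 means the listing is finished. Simplification: a
 * record larger than cap would also return 0 here, whereas the real
 * seq_file retries with a bigger buffer.
 */
size_t listing_read(struct listing *it, char *buf, size_t cap)
{
    size_t used = 0;

    while (it->pos < it->count) {
        int n = show_record(it->pos, buf + used, cap - used);
        if (n < 0)
            break;          /* no room left: emit this record on the next read */
        used += (size_t)n;
        it->pos++;
    }
    return used;
}
```

Calling listing_read in a loop with a 64-byte buffer walks through all 100 records over several reads. No record is split or dropped, and no write goes past the buffer, whatever the length of the listing.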

Checking in proc.c;
/cvs/cluster/cluster/cman-kernel/src/proc.c,v  <--  proc.c
new revision: 1.2; previous revision: 1.1
done
Checking in sm_misc.c;
/cvs/cluster/cluster/cman-kernel/src/sm_misc.c,v  <--  sm_misc.c
new revision: 1.2; previous revision: 1.1
done

Comment 2 Corey Marthaler 2004-08-30 19:23:52 UTC

Fix verified.

Comment 3 Kiersten (Kerri) Anderson 2004-11-16 19:05:27 UTC

Updating version to the right level in the defects. Sorry for the storm.