Description of problem:

Set up a 2-node cluster and create and mount 100 filesystems on each. Run 'cat /proc/cluster/services' on one of the nodes and you get these messages in the log file (the command also randomly segfaults):

    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!

The output of the command displays only the first 50 of the 100 DLM Lock Spaces you have instantiated.

Ran the cat command a few more times just to be a bastard and the kernel Oopsed:

    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    proc_file_read: Apparent buffer overflow!
    Unable to handle kernel paging request at virtual address 31347366
     printing eip:
    e030b6b7
    *pde = 00000000
    Oops: 0000 [#1]
    Modules linked in: loop gnbd lock_gulm lock_nolock lock_dlm dlm cman gfs lock_harness ipv6 parport_pc lp parport autofs4 sunrpc e1000 floppy sg microcode dm_mod uhci_hcd ehci_hcd button battery asus_acpi ac ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
    CPU:    0
    EIP:    0060:[<e030b6b7>]    Not tainted
    EFLAGS: 00010282   (2.6.7)
    EIP is at search_bucket+0x17/0x70 [dlm]
    eax: df2a6000   ebx: 31347366   ecx: 00000018   edx: 0000017e
    esi: df765738   edi: 00000002   ebp: 00000018   esp: db25fdb0
    ds: 007b   es: 007b   ss: 0068
    Process dlm_recvd (pid: 4509, threadinfo=db25e000 task=dac90b30)
    Stack: e031cbc3 00000000 d4070d7d 00000002 df765738 00000002 00000018 e030b742
           0000017e 00000246 35736667 206e7520 d4070d7d 00000002 d4070d08 df765738
           00000000 e031b08a 00000018 00000246 20202020 0a226131 ff000000 d4070d08
    Call Trace:
     [<e030b742>] dlm_dir_remove+0x32/0xf0 [dlm]
     [<e031b08a>] _release_rsb+0x11a/0x2a0 [dlm]
     [<e030d2a1>] dlm_unlock_stage2+0xd1/0x1a0 [dlm]
     [<e030f8d3>] process_cluster_request+0x243/0xd30 [dlm]
     [<c02b0e98>] inet_recvmsg+0x48/0x70
     [<c026bd0c>] sock_recvmsg+0xbc/0xc0
     [<e0313a53>] midcomms_process_incoming_buffer+0x173/0x250 [dlm]
     [<c0136af3>] __alloc_pages+0x2f3/0x340
     [<c026bd0c>] sock_recvmsg+0xbc/0xc0
     [<e0311721>] receive_from_sock+0x141/0x300 [dlm]
     [<c0117e67>] recalc_task_prio+0x97/0x190
     [<e031261b>] process_sockets+0x7b/0xa0 [dlm]
     [<e031288e>] dlm_recvd+0x9e/0xf0 [dlm]
     [<e03127f0>] dlm_recvd+0x0/0xf0 [dlm]
     [<c010429d>] kernel_thread_helper+0x5/0x18
    Code: 8b 0b 89 0c 24 0f 18 01 90 8d 04 d0 39 c3 74 23 89 44 24 04

    <1>Unable to handle kernel paging request at virtual address 36312038
     printing eip:
    c013b7c2
    *pde = 00000000
    Oops: 0000 [#2]
    Modules linked in: loop gnbd lock_gulm lock_nolock lock_dlm dlm cman gfs lock_harness ipv6 parport_pc lp parport autofs4 sunrpc e1000 floppy sg microcode dm_mod uhci_hcd ehci_hcd button battery asus_acpi ac ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
    CPU:    0
    EIP:    0060:[<c013b7c2>]    Not tainted
    EFLAGS: 00010203   (2.6.7)
    EIP is at put_page+0x2/0x90
    eax: 36312038   ebx: 00000001   ecx: df2a7880   edx: 36312038
    esi: ddf32c80   edi: 00000000   ebp: df494680   esp: db23ddf8
    ds: 007b   es: 007b   ss: 0068
    Process cman_comms (pid: 4412, threadinfo=db23c000 task=dac905b0)
    Stack: c026f39f ddf32c80 db23df6c c026f3c8 00000000 c026f483 ddf32c80 ddf32c80
           db23df6c ddf32c80 ddf32c80 c02aa1ca 00000018 db23de48 00000018 ddf32c90
           db23df4c df4947ac 00000018 00000040 c0334140 db23df6c db23df6c c02b0e98
    Call Trace:
     [<c026f39f>] skb_release_data+0x6f/0x90
     [<c026f3c8>] kfree_skbmem+0x8/0x20
     [<c026f483>] __kfree_skb+0xa3/0x140
     [<c02aa1ca>] udp_recvmsg+0x20a/0x290
     [<c02b0e98>] inet_recvmsg+0x48/0x70
     [<c026bd0c>] sock_recvmsg+0xbc/0xc0
     [<c0118897>] __wake_up_common+0x37/0x70
     [<c026ed5c>] sock_def_readable+0x5c/0x60
     [<e02ea9dd>] send_to_userport+0x1d/0x520 [cman]
     [<e02ea155>] receive_message+0x85/0xf0 [cman]
     [<e02ea319>] cluster_kthread+0x159/0x2d0 [cman]
     [<c0105c12>] ret_from_fork+0x6/0x14
     [<c0118850>] default_wake_function+0x0/0x10
     [<e02ea1c0>] cluster_kthread+0x0/0x2d0 [cman]
     [<c010429d>] kernel_thread_helper+0x5/0x18
    Code: 8b 00 a9 00 00 08 00 75 47 8b 02 f6 c4 08 75 2e 8b 02 89 d1

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Now uses seq_file for /proc/cluster/services.

    Checking in proc.c;
    /cvs/cluster/cluster/cman-kernel/src/proc.c,v  <--  proc.c
    new revision: 1.2; previous revision: 1.1
    done
    Checking in sm_misc.c;
    /cvs/cluster/cluster/cman-kernel/src/sm_misc.c,v  <--  sm_misc.c
    new revision: 1.2; previous revision: 1.1
    done
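For anyone reading this later, here is a rough userspace sketch of why switching to seq_file helps. It is not the code that was checked in and it does not use the kernel API; the record format, the 4096-byte buffer size, and every function name below are assumptions made up for illustration. It contrasts a one-shot handler that must fit the whole listing into one fixed buffer with an iterator that produces one record per call, which is the general idea behind seq_file.

```c
#include <stdio.h>
#include <string.h>

#define ONE_SHOT_BUF 4096  /* assumption: a single page for the whole listing */
#define NUM_SERVICES 100   /* the 100 lock spaces from the report */

/* Hypothetical record format, padded so each line is well under 128 bytes. */
int show_service(char *dst, size_t cap, int idx)
{
    return snprintf(dst, cap, "DLM Lock Space:  \"fs%03d\" %60s\n", idx, "run");
}

/* One-shot style: every record has to fit in one fixed buffer.
 * Returns the number of complete records that fit. */
int one_shot_listing(char *buf, size_t cap)
{
    size_t used = 0;
    int idx;

    buf[0] = '\0';
    for (idx = 0; idx < NUM_SERVICES; idx++) {
        int n = show_service(buf + used, cap - used, idx);
        if (n < 0 || (size_t)n >= cap - used) {
            buf[used] = '\0';  /* discard the truncated record */
            break;
        }
        used += (size_t)n;
    }
    return idx;
}

/* Iterator style: the caller owns the position and asks for one record
 * at a time, so no buffer ever has to hold the full listing.
 * Returns 1 if a record was produced, 0 at the end. */
int iter_next(int *pos, char *rec, size_t cap)
{
    if (*pos >= NUM_SERVICES)
        return 0;
    show_service(rec, cap, *pos);
    (*pos)++;
    return 1;
}

/* Walks the whole list through the iterator and counts the records. */
int iterated_listing(void)
{
    char rec[128];
    int pos = 0;
    int count = 0;

    while (iter_next(&pos, rec, sizeof rec))
        count++;
    return count;
}
```

Calling one_shot_listing() with a ONE_SHOT_BUF-sized buffer stops before all NUM_SERVICES records are written, while iterated_listing() reaches every record. The sketch bounds its writes; the overflow messages and Oopses above suggest the old handler did not.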
Fix verified.
Updating version to the right level in the defects. Sorry for the storm.