Bug 157244

Summary: any attempt to change value in cman/max_retries will now cause a panic
Product: [Retired] Red Hat Cluster Suite
Reporter: Corey Marthaler <cmarthal>
Component: cman
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED NEXTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 4
CC: cluster-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-05-20 16:18:38 UTC

Description Corey Marthaler 2005-05-09 19:10:56 UTC
Description of problem:
In an attempt to get more info on bz 139738, I have been playing around with
this value. However, with the latest build, any attempt to change it panics
the machine.

[root@tank-04 ~]# echo "5" > /proc/cluster/config/cman/max_retries
Unable to handle kernel paging request at virtual address f6ded000
 printing eip:
021b4d38
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: gnbd(U) lock_nolock(U) gfs(U) lock_dlm(U) dlm(U) cman(U)
lock_harness(U) md5 ipv6 parport_pc lp parport autofs4 sunrpc button battery ac
uhci_hcd hw_random e1000 floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod
qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
CPU:    1
EIP:    0060:[<021b4d38>]    Not tainted VLI
EFLAGS: 00010287   (2.6.9-6.37.ELhugemem)
EIP is at simple_strtoul+0x7c/0xbc
eax: f6ded000   ebx: f6ded000   ecx: 0000000a   edx: 7de00f70
esi: 0000000a   edi: 00000000   ebp: 7de00f70   esp: 7de00f5c
ds: 007b   es: 007b   ss: 0068
Process bash (pid: 2877, threadinfo=7de00000 task=7dc647b0)
Stack: 00000002 7c7b4580 00000002 7de00fac 82a2c2df 7ef6e780 7ef6e780 02181acf
       82a3ca30 023195e0 7c7b4580 0215565f 7de00fac f6ded000 7c7b4580 fffffff7
       f6ded000 7de00000 02155729 7de00fac 00000000 00000000 00000000 7de00fc4
Call Trace:
 [<82a2c2df>] cman_config_write_proc+0x12/0x2a [cman]
 [<02181acf>] proc_file_write+0x23/0x27
 [<0215565f>] vfs_write+0xb6/0xe2
 [<02155729>] sys_write+0x3c/0x62
Code: <3>Debug: sleeping function called from invalid context at
include/linux/rwsem.h:43
in_atomic():0[expected: 0], irqs_disabled():1
 [<0211f405>] __might_sleep+0x7d/0x88
 [<0215054b>] rw_vm+0xdb/0x282
 [<021b4d0d>] simple_strtoul+0x51/0xbc
 [<021b4d0d>] simple_strtoul+0x51/0xbc
 [<021509a5>] get_user_size+0x30/0x57
 [<021b4d0d>] simple_strtoul+0x51/0xbc
 [<0210615b>] show_registers+0x115/0x16c
 [<021062f2>] die+0xdb/0x16b
 [<02121b44>] vprintk+0x136/0x14a
 [<0211a8a2>] do_page_fault+0x421/0x5e7
 [<021b4d38>] simple_strtoul+0x7c/0xbc
 [<0216c4d9>] dnotify_parent+0x1b/0x6e
 [<0216bde0>] notify_change+0x1d0/0x1dc
 [<02153bb1>] do_truncate+0x87/0xbc
 [<0216ae43>] get_new_inode_fast+0x7c/0xc3
 [<02160502>] permission+0x41/0x46
 [<0211a481>] do_page_fault+0x0/0x5e7
 [<021b4d38>] simple_strtoul+0x7c/0xbc
 [<82a2c2df>] cman_config_write_proc+0x12/0x2a [cman]
 [<02181acf>] proc_file_write+0x23/0x27
 [<0215565f>] vfs_write+0xb6/0xe2
 [<02155729>] sys_write+0x3c/0x62
 Bad EIP value.
 <0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception
                                                                                    

Version-Release number of selected component (if applicable):
[root@tank-01 ~]# rpm -qa | grep cman
cman-1.0-0.pre33.0
cman-kernheaders-2.6.9-33.0
cman-kernel-hugemem-2.6.9-33.0
[root@tank-01 ~]# uname -ar
Linux tank-01.lab.msp.redhat.com 2.6.9-6.37.ELhugemem #1 SMP Tue Mar 29 15:52:10
EST 2005 i686 i686 i386 GNU/Linux


How reproducible:
Every time.

Comment 1 Christine Caulfield 2005-05-10 08:15:55 UTC
This bug also exists in DLM, under /proc/cluster/config/dlm/*.
The fix is easy and benign; I've checked it into the FC4 branch. Waiting for
approval from Kevin to check it in elsewhere.

Comment 2 Christine Caulfield 2005-05-10 14:43:33 UTC
Checked in on RHEL4 branch:
Checking in cman-kernel/src/proc.c;
/cvs/cluster/cluster/cman-kernel/src/proc.c,v  <--  proc.c
new revision: 1.11.2.3; previous revision: 1.11.2.2
done
Checking in dlm-kernel/src/config.c;
/cvs/cluster/cluster/dlm-kernel/src/config.c,v  <--  config.c
new revision: 1.6.2.1; previous revision: 1.6
done


Comment 3 Corey Marthaler 2005-05-20 16:18:38 UTC
Fix verified.