Bug 509233

Summary: Kernel will not panic with gfs mount option of debug set.
Product: Red Hat Enterprise Linux 5
Reporter: Toure Dunnon <tdunnon>
Component: gfs-kmod
Assignee: Robert Peterson <rpeterso>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: urgent
Version: 5.2
CC: edamato, jkurik, jplans, jruemker, plyons, tao
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-07-30 16:07:19 UTC

Description Toure Dunnon 2009-07-01 20:29:07 UTC
Description of problem:
The kernel does not panic when a GFS file system is mounted with the debug option. This is due to the statement changed in the following diff:

diff -up gfs-kernel-0.1.23/src/gfs/mount.c.org gfs-kernel-0.1.23/src/gfs/mount.c
--- gfs-kernel-0.1.23/src/gfs/mount.c.org       2009-07-01 16:31:51.000000000 -0400
+++ gfs-kernel-0.1.23/src/gfs/mount.c   2009-07-01 16:31:57.000000000 -0400        
@@ -123,7 +123,7 @@ gfs_make_args(char *data_arg, struct gfs                       
                        args->ar_oopses_ok = TRUE;                                 
                                                                                   
                else if (!strcmp(x, "debug")) {                                    
-                       args->ar_oopses_ok = TRUE;                                 
+                       args->ar_oopses_ok = FALSE;                                
                        args->ar_debug = TRUE;                                     
                                                                                   
                } else if (!strcmp(x, "upgrade"))
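
To make the effect of that one assignment easier to see, here is a minimal user-space sketch of the option loop. It is not the GFS source. Only the field names ar_oopses_ok and ar_debug and the option strings "debug" and "upgrade" come from the diff above; struct sketch_args, parse_opts(), the ar_upgrade field, and the patched switch are assumptions made for illustration, and bool stands in for the TRUE/FALSE macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the real GFS args structure. */
struct sketch_args {
    bool ar_oopses_ok;  /* field name taken from the diff */
    bool ar_debug;      /* field name taken from the diff */
    bool ar_upgrade;    /* assumed field for the "upgrade" option */
};

/*
 * Parse a comma-separated option string into *args.
 *
 * patched == false: "debug" behaves like the original line in the diff
 *                   (ar_oopses_ok = TRUE).
 * patched == true:  "debug" behaves like the proposed line
 *                   (ar_oopses_ok = FALSE).
 */
void parse_opts(const char *opts, bool patched, struct sketch_args *args)
{
    char buf[256];
    char *x;

    assert(opts != NULL && args != NULL);

    memset(args, 0, sizeof(*args));

    /* strtok() modifies its input, so work on a bounded local copy. */
    strncpy(buf, opts, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (x = strtok(buf, ","); x != NULL; x = strtok(NULL, ",")) {
        if (!strcmp(x, "debug")) {
            args->ar_oopses_ok = !patched;
            args->ar_debug = true;
        } else if (!strcmp(x, "upgrade")) {
            args->ar_upgrade = true;
        }
        /* Unknown options are ignored in this sketch. */
    }
}
```

Calling parse_opts("debug", false, &a) leaves both ar_oopses_ok and ar_debug set, while parse_opts("debug", true, &a) leaves only ar_debug set, which is the difference the diff is after.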

Version-Release number of selected component (if applicable):
RHEL 5.2, 5.3

How reproducible:
Every time

Steps to Reproduce:
1. Set the kernel sysctl options kernel.panic = 1 and kernel.sysrq = 1.
2. mount -t gfs -o debug /dev/blah /mnt/point
3. cman_tool kill -n node_foo
4. Nothing happens.
  
Actual results:
The logs show that the filesystem withdraws, but the kernel does
not panic.

Expected results:
The kernel panics and the system reboots.

Additional info:

Comment 2 Robert Peterson 2009-07-02 13:43:11 UTC
This seems related to bug #488499, in which the complaint is
that gfs is panicking with -o debug, which hampers debugging.
Here in bug #509233, the complaint is that gfs is NOT panicking with
-o debug, which hampers debugging.  Can I get a copy of the last
several console messages from the recreated problem here?  Maybe
this is a simple case of adding a call to panic() for this bug and
removing a call to panic() for the other bug.  Maybe I should close
this one as a duplicate of that one.

Comment 5 Robert Peterson 2009-07-30 16:07:19 UTC
Some people prefer the system to panic with -o debug and others
prefer it not to panic.  Personally, I think -o debug is a
misleading mount option.  It does not provide the user with any
more debug information unless GFS withdraws or encounters an
assertion error.

Right now, -o debug implies that the system should not panic if
an assert occurs.  I need to clearly understand why the customer
doesn't want that behavior and why they're using -o debug.  If
they want any GFS assert errors to cause a kernel panic, why don't
they just mount without -o debug?

As Dave T. pointed out in bug #488499, there should really be two
mount options with two distinct meanings.  They would cover how
GFS handles the two different kinds of GFS errors: (1) file system
withdraw due to file system inconsistency and (2) assertion errors
caused by run-time problems.  If I were to assign more accurate
names, they would be something like:
(1) -o bug_on_withdraw and (2) -o panic_on_assert.
Since customers are already using -o debug, we can't get rid of it,
so we should assign it to either (1) or (2) above.  I vote for (1).
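
A rough user-space sketch of how that split might look, assuming -o debug is kept as an alias for (1). Only the option names bug_on_withdraw, panic_on_assert, and debug come from the text above; struct err_policy, the two enums, policy_from_opts(), and action_for() are hypothetical names for illustration and do not exist in GFS.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The two kinds of GFS errors described above. */
enum err_kind { ERR_WITHDRAW, ERR_ASSERT };

/* What the sketch decides to do; ACT_DEFAULT means "no special handling". */
enum err_action { ACT_DEFAULT, ACT_BUG, ACT_PANIC };

/* Hypothetical policy: one independent flag per error kind. */
struct err_policy {
    bool bug_on_withdraw;
    bool panic_on_assert;
};

/* Fill *p from a comma-separated mount option string. */
void policy_from_opts(const char *opts, struct err_policy *p)
{
    char buf[256];
    char *x;

    assert(opts != NULL && p != NULL);

    p->bug_on_withdraw = false;
    p->panic_on_assert = false;

    /* strtok() modifies its input, so work on a bounded local copy. */
    strncpy(buf, opts, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (x = strtok(buf, ","); x != NULL; x = strtok(NULL, ",")) {
        /* "debug" is treated as a legacy alias for option (1). */
        if (!strcmp(x, "bug_on_withdraw") || !strcmp(x, "debug"))
            p->bug_on_withdraw = true;
        else if (!strcmp(x, "panic_on_assert"))
            p->panic_on_assert = true;
    }
}

/* Each error kind consults only its own flag. */
enum err_action action_for(const struct err_policy *p, enum err_kind kind)
{
    if (kind == ERR_WITHDRAW)
        return p->bug_on_withdraw ? ACT_BUG : ACT_DEFAULT;
    return p->panic_on_assert ? ACT_PANIC : ACT_DEFAULT;
}
```

With this layout, a user who wants a panic on assertion errors asks for it explicitly, and existing -o debug users only change how a withdraw is handled.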

Since bug #488499 is open to rework this area of the GFS kernel,
I'm going to close this bug as a duplicate of that one.  If the
customer wants a hotfix, that request should be attached to that
bug record.  Also, -o debug was meant primarily for our testing
group.  So perhaps the customer should also clearly lay out why
they are using the -o debug option, what they're trying to avoid
or accomplish by using it, and how we can best make that easier
for them.

*** This bug has been marked as a duplicate of bug 488499 ***