Bug 477635

Summary: If diskdump fails, panic information should be displayed.
Product: Red Hat Enterprise Linux 4
Reporter: Takao Indoh <tindoh>
Component: kernel
Assignee: Takao Indoh <tindoh>
Status: CLOSED ERRATA
QA Contact: Martin Jenner <mjenner>
Severity: high
Docs Contact:
Priority: high
Version: 4.8
CC: anderson, jtluka, lwang, tachibana
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-05-18 19:25:53 UTC
Attachments:
- Patch to add reboot_on_err(kernel) (flags: none)
- Patch to add reboot_on_err(diskdumputils) (flags: none)
- A patch to support halt_on_error option (flags: none)

Description Takao Indoh 2008-12-22 16:23:10 UTC
Description of problem:
If all the following conditions are met, the system is rebooted.
- kernel.panic in /etc/sysctl.conf is set (for example, 10 seconds).
- fallback_on_err is 1.
- Both diskdump and netdump fail (or diskdump fails and netdump is not enabled).

As a result, users who do not use a serial console cannot see any information
about the cause of the panic. Therefore, if diskdump fails, the system should be halted with the panic information (backtrace, etc.) displayed.
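
The conditions above can be restated as a small decision function. This is only a sketch of the behavior described in this report, not the kernel's code; the function name, argument order, and the "other" result are made up for illustration, and a kernel.panic value of 0 is treated as "not set".

```shell
# Sketch of the outcome described in this report (not the kernel's code).
# Arguments (all hypothetical names):
#   $1  kernel.panic timeout in seconds (0 = not set)
#   $2  fallback_on_err (0 or 1)
#   $3  diskdump result: ok | fail
#   $4  netdump result:  ok | fail | disabled
# Prints "reboot" when every condition from the report holds, otherwise
# "other" (this report does not describe the remaining cases).
dump_outcome() (
    panic_timeout=$1 fallback_on_err=$2 diskdump=$3 netdump=$4
    if [ "$panic_timeout" -gt 0 ] &&
       [ "$fallback_on_err" -eq 1 ] &&
       [ "$diskdump" = fail ] &&
       [ "$netdump" != ok ]; then
        echo reboot
    else
        echo other
    fi
)

# Diskdump fails, netdump is not enabled, kernel.panic is 10 seconds.
dump_outcome 10 1 fail disabled
```

With these arguments the function prints "reboot", which is the case where the panic information scrolls away before anyone can read it.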

Version-Release number of selected component (if applicable):
kernel-2.6.9-78.EL

How reproducible:
Always

Steps to Reproduce:
1. service diskdump initialformat; service diskdump start
2. dd if=/dev/zero of=/dev/sdb1 bs=1024 count=10  # I know this is bad, I'm simulating hardware failure
3. echo 5 > /proc/sys/kernel/panic
4. echo c > /proc/sysrq-trigger
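
Because step 2 destroys data and step 4 crashes the kernel, it may help to wrap the steps in a dry-run script. The following is a minimal sketch: the RUN, DUMP_DEV, and PANIC_SECS variables and the helper names are assumptions for illustration, and /dev/sdb1 is simply the dump partition used in this report.

```shell
# Dry-run wrapper around the reproduction steps. By default it only
# prints the commands. Set RUN=1 to execute them (as root, on a
# disposable test machine only).
DUMP_DEV=${DUMP_DEV:-/dev/sdb1}   # dump partition from the report
PANIC_SECS=${PANIC_SECS:-5}       # value written to kernel.panic

# Print one command and, only when RUN=1, execute it.
step() {
    echo "+ $1"
    if [ "${RUN:-0}" = 1 ]; then
        sh -c "$1"
    fi
}

reproduce() {
    step "service diskdump initialformat"
    step "service diskdump start"
    step "dd if=/dev/zero of=$DUMP_DEV bs=1024 count=10"
    step "echo $PANIC_SECS > /proc/sys/kernel/panic"
    step "echo c > /proc/sysrq-trigger"
}

reproduce
```

Run without RUN=1, the script just lists the five commands, each prefixed with "+ ", so the sequence can be reviewed before it is executed for real.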
  
Actual results:
Diskdump fails and the system hangs.
NOTE:
The system hangs because of bz248666. Once bz248666 is fixed, the system no longer hangs; instead, it is rebooted after diskdump fails.

Expected results:
Diskdump fails and the system halts with the panic information displayed.

Additional info:
This problem was originally discussed in bz248666, which covered the following two problems:
- System hangs up after diskdump fails.
- Panic information should be displayed if diskdump fails.
These problems are separable, so this bz was opened for the latter.

Comment 1 Takao Indoh 2008-12-22 16:24:53 UTC
Created attachment 327667 [details]
Patch to add reboot_on_err(kernel)

Comment 2 Takao Indoh 2008-12-22 16:25:30 UTC
Created attachment 327668 [details]
Patch to add reboot_on_err(diskdumputils)

Comment 3 RHEL Program Management 2009-01-05 21:29:50 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 5 Dave Anderson 2009-01-08 17:05:35 UTC
(In reply to comment #2)
> Created an attachment (id=327668) [details]
> Patch to add reboot_on_err(diskdumputils)

If you want to add this patch to the existing RHEL4.8 diskdumputils errata:

  RHEA-2009:8209-01 - diskdumputils documentation update
  http://errata.devel.redhat.com/errata/info/8001

Then a separate Bugzilla with the "diskdumputils" component must
be created.  This BZ is for the "kernel" component only.
  
Then to get the diskdumputils change checked in, the new bugzilla
must adhere to the following:
 	
  http://intranet.corp.redhat.com/ic/intranet/RHEL4CheckinPolicy.html
  RHEL4 CVS Check-in Policy

  As of 11pm EDT on Tuesday March 13, 2007, commits to the RHEL-4 branch
  need to contain at least one Resolves:, Related:, or Reverts: line,
  containing at least one Bugzilla ID or CVE ID in a supported format,
  in the log message or spec file %changelog.

  Update: As of 4pm EDT on Monday August 25, 2008, checkins on the RHEL-4
  branch must reference one or more Bugzillas with the following flag state
  or the commit will be denied:

  (rhel-4.8 == +) or
  (cluster-4.8 == +) or
  (rhel-4.8 == ? and pm_ack == +) or
  (cluster-4.8 == ? and pm_ack == +)
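
The two check-in rules quoted above can be sketched as shell predicates. This is an illustration only, not the actual CVS commit hook: the function names are hypothetical, and the ID patterns (bz#NNNNNN, CVE-YYYY-NNNN) are assumptions, since the policy only says "a supported format".

```shell
# has_tracking_line: reads a log message on stdin and succeeds if some
# line starts with Resolves:, Related: or Reverts: and carries an ID.
# The ID patterns below are assumptions, not the policy's real formats.
has_tracking_line() {
    grep -Eq '^(Resolves|Related|Reverts):.*(bz#[0-9]+|CVE-[0-9]{4}-[0-9]+)'
}

# commit_allowed RHEL_FLAG CLUSTER_FLAG PM_ACK
# Each argument is a flag state: "+", "?", "-" or "" (unset).
# Mirrors the four alternatives listed in the policy.
commit_allowed() (
    rhel=$1 cluster=$2 pm_ack=$3
    [ "$rhel" = "+" ] && exit 0
    [ "$cluster" = "+" ] && exit 0
    [ "$rhel" = "?" ] && [ "$pm_ack" = "+" ] && exit 0
    [ "$cluster" = "?" ] && [ "$pm_ack" = "+" ] && exit 0
    exit 1
)

printf 'Add halt_on_error\nResolves: bz#477635\n' | has_tracking_line &&
    echo "message ok"
commit_allowed "?" "" "+" && echo "flags ok"
```

In this sketch a bug with rhel-4.8 set to "?" passes only when pm_ack is "+", which matches the third alternative in the policy.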

Comment 6 Takao Indoh 2009-01-08 18:02:50 UTC
> Then a separate Bugzilla with the "diskdumputils" component must
> be created.  This BZ is for the "kernel" component only.

Thanks for the information. I am now rewriting the diskdumputils patch and will open a new bz for it.

Comment 7 Takao Indoh 2009-01-08 22:27:15 UTC
Created attachment 328499 [details]
A patch to support halt_on_error option

I have uploaded the latest patch. It has already been posted for review.

Comment 8 Takao Indoh 2009-01-08 22:41:56 UTC
> > Then a separate Bugzilla with the "diskdumputils" component must
> > be created.  This BZ is for the "kernel" component only.
> Thanks for the information. I am now rewriting the diskdumputils patch and will
> open a new bz for it.

I opened bz479337.

Comment 9 Vivek Goyal 2009-01-14 14:23:25 UTC
Committed in 78.28.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 12 errata-xmlrpc 2009-05-18 19:25:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html