Bug 605624

Summary: kdump errors on startup when disabled
Product: Red Hat Enterprise Linux 6 Reporter: Bastien Nocera <bnocera>
Component: kexec-tools Assignee: Cong Wang <amwang>
Status: CLOSED CURRENTRELEASE QA Contact: Chao Ye <cye>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: cye, nhorman, qcai, rkhan, syeghiay
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kexec-tools-2_0_0-119_el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-11-11 14:46:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Proposed patch none

Description Bastien Nocera 2010-06-18 12:34:25 UTC
kexec-tools-2.0.0-75.el6.i686

I get the "boot messages" warning in GNOME straight after first boot because kdump fails, even though kdump was disabled during firstboot. kdump should not report an error when it is disabled.

No kdump initial ramdisk found.	[WARNING]
Rebuilding /boot/initrd-2.6.32-33.el6.i686kdump.img
Starting kdump:	[FAILED]

Comment 2 RHEL Program Management 2010-06-18 13:03:24 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 3 Cong Wang 2010-06-23 02:47:14 UTC
Did you check /var/log/messages?

Comment 4 Bastien Nocera 2010-06-24 08:14:43 UTC
(In reply to comment #3)
> Did you check /var/log/messages?    

What for exactly? This is what happens on the very first boot after installing RHEL6. You wait a long while for kdump to create its new ramdisk before firstboot asks us if we want to use it.

Comment 5 Cong Wang 2010-06-24 08:30:30 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Did you check /var/log/messages?    
> 
> What for exactly? This is what happens on the very first boot after installing
> RHEL6. You wait a long while for kdump to create its new ramdisk before
> firstboot asks us if we want to use it.    

If you mean there is no error, that is incorrect: there are error messages in /var/log/messages.

If you mean the errors should come earlier, before creating a new ramdisk, yeah, this makes sense to me.

Comment 6 Bastien Nocera 2010-06-24 10:04:07 UTC
No, it should be disabled by default, and not make my bootup much slower than it ought to be, especially on *the very first boot*.

Comment 7 Neil Horman 2010-06-24 11:34:16 UTC
No Bastien, it shouldn't be disabled.  One of the requirements for RHEL6 was that kdump be available as early as possible.  The right fix here, as amerigo is trying to do, is to fix whatever error you are encountering.  So please tell him what your error messages are.

Comment 8 Bastien Nocera 2010-06-24 13:59:22 UTC
(In reply to comment #7)
> No bastien, it shouldn't be disabled.  One of the requirements for RHEL6 was
> that kdump be available as early as possible.

Why is it asking me whether to enable it in firstboot then?

> The right fix here, as amerigo
> is trying to do is fix whatever error you are encountering.  So please tell him
> what your error messages are.    

Jun 17 13:46:44 snoogens kdump: No crashkernel parameter specified for running kernel
Jun 17 13:46:46 snoogens kdump: failed to start up

That's on first boot of a clean install of RHEL6 Workstation. The 2 odd minutes it takes to generate the unusable kdump ramdisk make the distro look like nobody actually used it.

Comment 9 Cong Wang 2010-06-29 10:49:21 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > No bastien, it shouldn't be disabled.  One of the requirements for RHEL6 was
> > that kdump be available as early as possible.
> 
> Why is it asking me whether to enable it in firstboot then?
> 
> > The right fix here, as amerigo
> > is trying to do is fix whatever error you are encountering.  So please tell him
> > what your error messages are.    
> 
> Jun 17 13:46:44 snoogens kdump: No crashkernel parameter specified for
> running kernel
> Jun 17 13:46:46 snoogens kdump: failed to start up


This is exactly what I meant in comment #3.

> 
> That's on first boot of a clean install of RHEL6 Workstation. The 2 odd minutes
> it takes to generate the unusable kdump ramdisk make the distro look like
> nobody actually used it.    

Sorry, I still don't quite understand your problem. It seems you have problems with firstboot, rather than mkdumprd?

Comment 10 Bastien Nocera 2010-06-30 09:46:12 UTC
1) Install a recent RHEL6
2) Boot it up the first time
3) Wonder why it takes so long to boot (that's because the kdump script inits a ramdisk that it won't be able to use)
4) Get to firstboot, disable kdump, login
5) Wonder why there are boot errors reported (because kdump couldn't start)

You need to fix the errors in 3) and 5) so that the experience is decent the first time RHEL6 is booted.

Comment 11 Bastien Nocera 2010-06-30 09:47:30 UTC
Adding blockers, as it looks like this is unlikely to be fixed today.

Comment 12 Cong Wang 2010-07-01 03:02:11 UTC
(In reply to comment #10)
> 1) Install a recent RHEL6
> 2) Boot it up the first time
> 3) Wonder why it takes so long to boot (that's because the kdump script inits a
> ramdisk that it won't be able to use)
> 4) Get to firstboot, disable kdump, login
> 5) Wonder why there are boot errors reported (because kdump couldn't start)
> 
> You need to fix the errors in 3) and 5) so that the experience is decent the
> first time RHEL6 is booted.    

So, if kdump can't be started, would letting it fail sooner, i.e. before building the initrd, satisfy you?
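
A minimal sketch of the "fail sooner" ordering, assuming the decision is based on the running kernel's command line: look for a `crashkernel=` parameter first and only go on to the initrd rebuild when it is present. The function names `has_crashkernel_param` and `start_kdump_sketch` are hypothetical and are not taken from the kexec-tools init script; the error text is the one logged in comment #8.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given cmdline file contains a
# crashkernel= parameter. Defaults to the running kernel's /proc/cmdline.
has_crashkernel_param() {
    cmdline_file="${1:-/proc/cmdline}"
    # Match the parameter at the start of the line or after whitespace.
    grep -Eq '(^|[[:space:]])crashkernel=' "$cmdline_file"
}

# Hypothetical start path: bail out before the expensive initrd rebuild.
start_kdump_sketch() {
    cmdline_file="${1:-/proc/cmdline}"
    if ! has_crashkernel_param "$cmdline_file"; then
        echo "No crashkernel parameter specified for running kernel" >&2
        return 1
    fi
    # Placeholder: the real script would rebuild the initrd and load
    # the kdump kernel at this point.
    echo "crashkernel parameter found, continuing with kdump setup"
}
```

Calling `start_kdump_sketch` with no argument checks the live system; passing a file path makes the check easy to try against a saved command line.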

Comment 13 Cong Wang 2010-07-01 06:37:05 UTC
Created attachment 428178 [details]
Proposed patch

Please check whether this patch works for you. I tested it, and it works for me.

Note that the kernel has a bug in /sys/kernel/kexec_crash_size, which displays 1 instead of 0 when no memory is reserved. This is fixed by:
http://post-office.corp.redhat.com/archives/rhkernel-list/2010-June/msg01601.html

Make sure you have that patch applied.
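
As a rough sketch of the kind of check that depends on this value: read the reserved size from /sys/kernel/kexec_crash_size and treat 0 as "nothing reserved". Treating 1 as unreserved as well is an assumption made here to cope with kernels that lack the fix above; it is not confirmed to be what the attached patch does. The function name `crash_memory_reserved` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if crash kernel memory appears to be
# reserved, judging by the size file (bytes reserved for the crash kernel).
crash_memory_reserved() {
    size_file="${1:-/sys/kernel/kexec_crash_size}"
    # No readable file means we cannot confirm a reservation.
    [ -r "$size_file" ] || return 1
    size=$(cat "$size_file")
    # Reject empty or non-numeric content.
    case "$size" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # 0 means nothing reserved; 1 is what an unpatched kernel reports in
    # the same situation (assumption: handle both the same way).
    [ "$size" -gt 1 ]
}
```

A caller could use it as `crash_memory_reserved || exit 1` before doing any work on the initrd.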

Comment 15 Chao Ye 2010-08-19 10:11:44 UTC
Hello Bastien,

Could you verify this bug for us with the latest kexec-tools? It's an RC blocker now.

Thanks.

Comment 16 Chao Ye 2010-08-26 03:30:50 UTC
Reproduced with RHEL6.0-20100622.1_nfs-Workstation-i386:
===============================================================================
[root@hp-dl360g5-01 ~]# rpm -q kernel kexec-tools
kernel-2.6.32-37.el6.i686
kexec-tools-2.0.0-82.el6.i686
-------------------------------------------------------------------------------
First boot, got error message from /var/log/messages:
Aug 25 22:42:40 hp-dl360g5-01 kdump: No crashkernel parameter specified for running kernel
Aug 25 22:42:41 hp-dl360g5-01 kdump: failed to start up
-------------------------------------------------------------------------------
Disable kdump from firstboot, then login, got error message again from /var/log/messages:
Aug 25 22:50:35 hp-dl360g5-01 kdump: No crashkernel parameter specified for running kernel
Aug 25 22:50:35 hp-dl360g5-01 kdump: failed to start up

Verified with RHEL6.0-20100825.n.0_nfs-Workstation-i386
===============================================================================
[root@hp-dl360g5-01 ~]# rpm -q kernel kexec-tools
kernel-2.6.32-67.el6.i686
kexec-tools-2.0.0-143.el6.i686
-------------------------------------------------------------------------------
Aug 25 23:16:39 hp-dl360g5-01 kdump: No crashkernel parameter specified for running kernel
Aug 25 23:16:40 hp-dl360g5-01 dbus: avc:  netlink poll: error 4
-------------------------------------------------------------------------------
Disable kdump from firstboot, then login. No error message about kdump could be found from /var/log/messages.

Change status to VERIFIED.
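
The verification above comes down to checking which kdump lines end up in /var/log/messages after each boot. A minimal sketch of that check, assuming the syslog line format shown in this comment; the function name `kdump_log_lines` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the kdump lines from a syslog-style file.
# The exit status is grep's: 0 if at least one line matched, 1 if none.
kdump_log_lines() {
    log_file="${1:-/var/log/messages}"
    grep -E '[[:space:]]kdump: ' "$log_file"
}
```

With kdump disabled, `kdump_log_lines` is expected to print nothing for the boot in question; narrowing the pattern to `kdump: failed to start up` would flag only the startup failure.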

Comment 17 releng-rhel@redhat.com 2010-11-11 14:46:13 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.