Bug 221272

Summary: "service kdump start" fails on rawhide
Product: [Fedora] Fedora Reporter: William Cohen <wcohen>
Component: kexec-tools Assignee: Neil Horman <nhorman>
Status: CLOSED INSUFFICIENT_DATA QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: rawhide CC: triage
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: bzcl34nup
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-05-07 01:04:14 UTC Type: ---
Attachments:
Description Flags
patched version of kexec-tools to account for extra arch args none

Description William Cohen 2007-01-03 14:31:42 UTC
Description of problem:

When attempting to set up a rawhide machine (both x86_64 and i686) using the
steps listed on http://fedoraproject.org/wiki/FC6KdumpKexecHowTo, the "service
kdump start" fails on the machine. The kexec command prints out a list of
available options. It looks like kexec doesn't understand the arguments being
passed to it from /etc/init.d/kdump.



Version-Release number of selected component (if applicable):
# uname -a
Linux trek.devel.redhat.com 2.6.19-1.2891.fc7PAE #1 SMP Thu Dec 21 11:15:27 EST
2006 i686 i686 i386 GNU/Linux
# rpm -q kexec-tools
kexec-tools-1.101-55.fc7

How reproducible:

Every time


Steps to Reproduce:
1. Follow the steps in http://fedoraproject.org/wiki/FC6KdumpKexecHowTo
2. Enter "service kdump start" command

  
Actual results:

Get the following output from "service kdump start":
# service kdump start
kexec 1.101 released 15 February 2005
Usage: kexec [OPTION]... [kernel]
Directly reboot into a new kernel

 -h, --help           Print this help.
 -v, --version        Print the version of kexec.
 -f, --force          Force an immediate kexec, don't call shutdown.
 -x, --no-ifdown      Don't bring down network interfaces.
                      (if used, must be last option specified)
 -l, --load           Load the new kernel into the current kernel.
 -p, --load-panic     Load the new kernel for use on panic.
 -u, --unload         Unload the current kexec target kernel.
 -e, --exec           Execute a currently loaded kernel.
 -t, --type=TYPE      Specify the new kernel is of this type.
     --mem-min=<addr> Specify the lowest memory addres to load code into.
     --mem-max=<addr> Specify the highest memory addres to load code into.

Supported kernel file types and options: 
multiboot-x86
    --command-line=STRING        Set the kernel command line to STRING.
    --module="MOD arg1 arg2..."  Load module MOD with command-line "arg1..."
                                 (can be used multiple times).
elf-x86
    --command-line=STRING Set the kernel command line to STRING
    --append=STRING       Set the kernel command line to STRING
    --initrd=FILE         Use FILE as the kernel's initial ramdisk.
    --ramdisk=FILE        Use FILE as the kernel's initial ramdisk.
    --args-linux          Pass linux kernel style options
    --args-elf            Pass elf boot notes
bzImage
-d, --debug               Enable debugging to help spot a failure.
    --real-mode           Use the kernels real mode entry point.
    --command-line=STRING Set the kernel command line to STRING.
    --append=STRING       Set the kernel command line to STRING.
    --initrd=FILE         Use FILE as the kernel's initial ramdisk.
    --ramdisk=FILE        Use FILE as the kernel's initial ramdisk.
beoboot-x86
-d, --debug               Enable debugging to help spot a failure.
    --real-mode           Use the kernels real mode entry point.
nbi-x86

Architecture options: 
     --reset-vga               Attempt to reset a standard vga device
     --serial=<port>           Specify the serial port for debug output
     --serial-baud=<buad_rate> Specify the serial port baud rate
     --console-vga             Enable the vga console
     --console-serial          Enable the serial console
     --elf32-core-headers      Prepare core headers in ELF32 format
     --elf64-core-headers      Prepare core headers in ELF64 format

Starting kdump:                                            [FAILED]



Expected results:

Start kdump service.


Additional info:

Comment 1 Neil Horman 2007-01-03 21:43:30 UTC
It appears the problem is that the kernel load routine is different for PAE
kernels than for regular kernels: PAE kernels are being detected as bzImage
kernels, while non-PAE kernels are being detected as elf-x86 kernels. The
underlying cause of the problem is that the option parser for bzImage kernels
doesn't know about the --args-linux option (or several others, for that
matter). I can patch that pretty quickly, but first I want to understand why
the PAE and non-PAE kernels are being detected differently. It seems to me that
PAE shouldn't have an effect on that sort of thing.
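
To illustrate the mismatch described above, here is a minimal sketch that checks a set of arguments against the per-type option lists printed in the usage text in the description. The table is copied from that output; the function name, the sample arguments, and the file paths are hypothetical, and this is not how kexec itself parses options.

```python
# Long options accepted per image type, copied from the kexec 1.101 usage
# text quoted in the description. Generic options (-p, -l, ...) are omitted.
LOADER_OPTIONS = {
    "elf-x86": {"--command-line", "--append", "--initrd", "--ramdisk",
                "--args-linux", "--args-elf"},
    "bzImage": {"--debug", "--real-mode", "--command-line", "--append",
                "--initrd", "--ramdisk"},
}


def unrecognized_options(image_type, args):
    """Return the long options in args that image_type does not list."""
    known = LOADER_OPTIONS[image_type]
    unknown = []
    for arg in args:
        if not arg.startswith("--"):
            continue  # skip short options and the kernel path
        name = arg.split("=", 1)[0]
        if name not in known:
            unknown.append(name)
    return unknown


# Hypothetical argument list, similar in shape to what an init script passes.
sample_args = ["--args-linux", "-p", "--command-line=ro quiet",
               "--initrd=/boot/initrd.img", "/boot/vmlinuz"]

print(unrecognized_options("elf-x86", sample_args))  # []
print(unrecognized_options("bzImage", sample_args))  # ['--args-linux']
```

The same arguments are accepted when the image is treated as elf-x86 and rejected when it is treated as bzImage, which matches the usage dump seen in the report.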

Comment 2 William Cohen 2007-01-03 22:23:07 UTC
Does the x86_64 use the same code? This problem also occurs on x86_64 rawhide.

Comment 3 Neil Horman 2007-01-04 19:55:37 UTC
Created attachment 144834 [details]
patched version of kexec-tools to account for extra arch args

So, it turns out this is happening because of an apparent kernel change, and
kexec is at least trying to do the right thing. RHEL5 vmlinuz binaries are
detected as ELF files: the first four bytes of the file are "\177ELF" (this
applies to all kernels). All rawhide kernel vmlinuz images are bzImages and
are not marked as ELF files. I'm not sure how or why this change was made, but
regardless, it's causing a different load method (common to all x86 and x86_64
arches) to be used. This load method is unaware of the --args-linux option
and, as such, fails to operate properly. The attached RPM fixes that
problem. Unfortunately, on the x86 test system it still fails to load due to
an inability to find sufficient memory to load the kernel. I'm concerned,
however, that this may be due to the limited amount of memory on that system.
So Will, could you please test this kexec-tools package on a rawhide system
of yours with more memory available, to (a) ensure the unrecognized argument
problem is gone and (b) ensure that, given sufficient memory, the vmlinuz file
can be loaded into memory. Thanks!
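
As a rough illustration of the detection difference described above, the sketch below classifies a kernel image by its magic bytes. The ELF check uses the "\177ELF" signature mentioned in this comment. The second check uses the "HdrS" setup-header magic at offset 0x202 and the 0x55 0xAA boot flag at offset 0x1FE from the Linux x86 boot protocol. The function name is hypothetical, and this is not kexec's actual probe code.

```python
ELF_MAGIC = b"\x7fELF"        # "\177ELF": 0x7f followed by 'E', 'L', 'F'
BOOT_FLAG_OFFSET = 0x1FE      # x86 boot protocol: bytes 0x55 0xAA
SETUP_MAGIC_OFFSET = 0x202    # x86 boot protocol: "HdrS" header magic


def classify_kernel_image(data: bytes) -> str:
    """Return 'elf', 'bzImage', or 'unknown' based on magic bytes only."""
    if data[:4] == ELF_MAGIC:
        return "elf"
    boot_flag = data[BOOT_FLAG_OFFSET:BOOT_FLAG_OFFSET + 2]
    setup_magic = data[SETUP_MAGIC_OFFSET:SETUP_MAGIC_OFFSET + 4]
    # Note: older zImage files share this header; the sketch does not
    # distinguish them from bzImage.
    if boot_flag == b"\x55\xaa" and setup_magic == b"HdrS":
        return "bzImage"
    return "unknown"


# Synthetic buffers standing in for real vmlinuz files.
fake_elf = ELF_MAGIC + bytes(60)
fake_bz = bytearray(0x400)
fake_bz[0x1FE:0x200] = b"\x55\xaa"
fake_bz[0x202:0x206] = b"HdrS"

print(classify_kernel_image(fake_elf))        # elf
print(classify_kernel_image(bytes(fake_bz)))  # bzImage
```

To try it on a real image, read the first kilobyte of the file in binary mode and pass it to the function.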

Comment 4 William Cohen 2007-01-04 20:30:29 UTC
I tried the updated kexec-tools on the x86_64 machine. It failed in a different
way than before. /etc/init.d/kdump calls kexec and gets farther than it did
before. I looked at the command line used when kexec runs and got the
following:


# /sbin/kexec --args-linux -p '--command-line=ro root=/dev/VolGroup00/LogVol00
rhgb quiet  irqpoll maxcpus=1' --initrd=/boot/initrd-2.6.19-1.2904.fc7kdump.img
/boot/vmlinuz-2.6.19-1.2904.fc7
Could not find a free area of memory of 9000 bytes...
locate_hole failed

So there is still a problem.
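
For readers unfamiliar with this kind of failure, here is a generic first-fit sketch of searching a list of free memory ranges for an aligned area of a given size. It is only an illustration of the idea; it is not kexec's locate_hole, and the ranges, the alignment, and the function name are assumptions made up for the example.

```python
def find_hole(free_ranges, size, align=0x1000):
    """First-fit search: lowest aligned start where size bytes fit, else None.

    free_ranges is a list of (start, end) pairs with end exclusive.
    """
    for start, end in sorted(free_ranges):
        aligned = (start + align - 1) // align * align  # round start up
        if aligned + size <= end:
            return aligned
    return None


# Hypothetical layouts: one range too small, then one with enough room.
too_small = [(0x1000, 0x3000)]
roomy = too_small + [(0x100000, 0x200000)]

print(find_hole(too_small, 9000))   # None
print(hex(find_hole(roomy, 9000)))  # 0x100000
```

When no listed range can hold the request, the search returns nothing, which is the general situation the error message reports.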



Comment 5 Neil Horman 2007-01-04 21:46:29 UTC
Yes, I told you there might be. What is your crashdump line set to on your
x86_64 box, and how much total system memory do you have in it?

Comment 6 William Cohen 2007-01-04 22:01:23 UTC
The machine has 1GB of memory. The /boot/grub/grub.conf has the following:

title Fedora Core (2.6.19-1.2904.fc7)
        root (hd0,1)
        kernel /vmlinuz-2.6.19-1.2904.fc7 ro root=/dev/VolGroup00/LogVol00 rhgb 
quiet crashkernel=128M@16M

$ free
             total       used       free     shared    buffers     cached
Mem:        883436     200496     682940          0      23788      87344
-/+ buffers/cache:      89364     794072
Swap:      2031608          0    2031608
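
For reference, a minimal sketch of how the crashkernel=128M@16M value above breaks down into a size and a start offset. It handles only the simple size@offset form with optional K/M/G suffixes; other forms of the parameter are not covered, and the function name is hypothetical.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_crashkernel(value):
    """Parse 'size@offset' (e.g. '128M@16M') into (size_bytes, offset_bytes)."""
    m = re.fullmatch(r"(\d+)([KMG]?)@(\d+)([KMG]?)", value)
    if m is None:
        raise ValueError(f"unsupported crashkernel value: {value!r}")
    size = int(m.group(1)) * _UNITS[m.group(2)]
    offset = int(m.group(3)) * _UNITS[m.group(4)]
    return size, offset


size, offset = parse_crashkernel("128M@16M")
# Reserved region is [offset, offset + size).
print(hex(offset), hex(offset + size))  # 0x1000000 0x9000000
```

With this setting, 128 MiB is requested starting at the 16 MiB mark, so the reserved region would end at 144 MiB.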


Comment 7 Neil Horman 2007-01-05 19:00:55 UTC
Found the upstream patch that fixes loading of relocatable bzImages.
Fixed in kexec-tools-1.101-56.fc7.

Comment 8 David Hunter 2007-08-16 03:16:56 UTC
This happens to me on Rawhide with the i386/i686 kernels. How do I fix it?

Comment 9 Neil Horman 2007-08-16 13:27:52 UTC
See comment #7.  You need to run at least kexec-tools-1.101-56.fc7
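
A quick way to check an installed package string (as printed by "rpm -q kexec-tools") against that minimum is sketched below. This is a simplified comparison that assumes a purely numeric version and a release starting with an integer; it is not RPM's own version comparison algorithm, and the function names are hypothetical.

```python
def parse_nvr(nvr):
    """Split 'name-version-release' into (name, version tuple, release int)."""
    name, version, release = nvr.rsplit("-", 2)
    ver = tuple(int(part) for part in version.split("."))
    rel = int(release.split(".", 1)[0])  # '56.fc7' -> 56
    return name, ver, rel


def at_least(installed, required):
    """True if installed is the same package at or above required."""
    name_i, ver_i, rel_i = parse_nvr(installed)
    name_r, ver_r, rel_r = parse_nvr(required)
    if name_i != name_r:
        raise ValueError("package names differ")
    return (ver_i, rel_i) >= (ver_r, rel_r)


REQUIRED = "kexec-tools-1.101-56.fc7"
print(at_least("kexec-tools-1.101-55.fc7", REQUIRED))  # False
print(at_least("kexec-tools-1.101-56.fc7", REQUIRED))  # True
```

The first call uses the version from the original report, which predates the fix.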

Comment 10 Bug Zapper 2008-04-03 18:50:45 UTC
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 11 Bug Zapper 2008-05-07 01:04:12 UTC
This bug has been in NEEDINFO for more than 30 days since feedback was
first requested. As a result we are closing it.

If you can reproduce this bug in the future against a maintained Fedora
version please feel free to reopen it against that version.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp