Description of problem:
When attempting to set up a rawhide machine (both x86_64 and i686) using the steps listed on http://fedoraproject.org/wiki/FC6KdumpKexecHowTo, "service kdump start" fails on the machine. The kexec command prints out a list of available options. It looks like kexec doesn't understand the arguments being passed to it from /etc/init.d/kdump.

Version-Release number of selected component (if applicable):
# uname -a
Linux trek.devel.redhat.com 2.6.19-1.2891.fc7PAE #1 SMP Thu Dec 21 11:15:27 EST 2006 i686 i686 i386 GNU/Linux
# rpm -q kexec-tools
kexec-tools-1.101-55.fc7

How reproducible:
Every time

Steps to Reproduce:
1. Follow the steps in http://fedoraproject.org/wiki/FC6KdumpKexecHowTo
2. Enter the "service kdump start" command

Actual results:
"service kdump start" produces the following output:

# service kdump start
kexec 1.101 released 15 February 2005
Usage: kexec [OPTION]... [kernel]
Directly reboot into a new kernel

 -h, --help           Print this help.
 -v, --version        Print the version of kexec.
 -f, --force          Force an immediate kexec, don't call shutdown.
 -x, --no-ifdown      Don't bring down network interfaces.
                      (if used, must be last option specified)
 -l, --load           Load the new kernel into the current kernel.
 -p, --load-panic     Load the new kernel for use on panic.
 -u, --unload         Unload the current kexec target kernel.
 -e, --exec           Execute a currently loaded kernel.
 -t, --type=TYPE      Specify the new kernel is of this type.
     --mem-min=<addr> Specify the lowest memory addres to load code into.
     --mem-max=<addr> Specify the highest memory addres to load code into.

Supported kernel file types and options:
multiboot-x86
    --command-line=STRING        Set the kernel command line to STRING.
    --module="MOD arg1 arg2..."  Load module MOD with command-line "arg1..."
                                 (can be used multiple times).
elf-x86
    --command-line=STRING Set the kernel command line to STRING
    --append=STRING       Set the kernel command line to STRING
    --initrd=FILE         Use FILE as the kernel's initial ramdisk.
    --ramdisk=FILE        Use FILE as the kernel's initial ramdisk.
    --args-linux          Pass linux kernel style options
    --args-elf            Pass elf boot notes
bzImage
    -d, --debug               Enable debugging to help spot a failure.
        --real-mode           Use the kernels real mode entry point.
        --command-line=STRING Set the kernel command line to STRING.
        --append=STRING       Set the kernel command line to STRING.
        --initrd=FILE         Use FILE as the kernel's initial ramdisk.
        --ramdisk=FILE        Use FILE as the kernel's initial ramdisk.
beoboot-x86
    -d, --debug       Enable debugging to help spot a failure.
        --real-mode   Use the kernels real mode entry point.
nbi-x86
Architecture options:
     --reset-vga               Attempt to reset a standard vga device
     --serial=<port>           Specify the serial port for debug output
     --serial-baud=<buad_rate> Specify the serial port baud rate
     --console-vga             Enable the vga console
     --console-serial          Enable the serial console
     --elf32-core-headers      Prepare core headers in ELF32 format
     --elf64-core-headers      Prepare core headers in ELF64 format
Starting kdump: [FAILED]

Expected results:
The kdump service starts.

Additional info:
It appears the problem is that the kernel load routine is different for PAE kernels than for regular kernels (PAE kernels are being detected as bzImage kernels, while non-PAE kernels are being detected as elf-x86 kernels). The underlying cause of the problem is that the option parser for bzImage kernels doesn't know about the --args-linux option (or several others, for that matter). I can patch that pretty quickly, but first I want to understand why the PAE and non-PAE kernels are getting detected differently. It seems to me that PAE shouldn't have an effect on that sort of thing.
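To make the failure mode concrete, here is a toy model of per-loader option checking. It is only a sketch and not kexec-tools code: the option sets are transcribed from the usage text in this report, and the names `GLOBAL_OPTIONS`, `LOADER_OPTIONS`, and `unknown_options` are hypothetical. It assumes each loader rejects any option it does not list.

```python
# Toy model only: NOT kexec-tools source code.
# Option names are copied from the usage output quoted in this report.
GLOBAL_OPTIONS = {
    "-h", "--help", "-v", "--version", "-f", "--force", "-x", "--no-ifdown",
    "-l", "--load", "-p", "--load-panic", "-u", "--unload", "-e", "--exec",
    "-t", "--type", "--mem-min", "--mem-max",
}

LOADER_OPTIONS = {
    "elf-x86": {"--command-line", "--append", "--initrd", "--ramdisk",
                "--args-linux", "--args-elf"},
    "bzImage": {"-d", "--debug", "--real-mode", "--command-line",
                "--append", "--initrd", "--ramdisk"},
}


def unknown_options(loader, argv):
    """Return the options in argv that the given loader would not recognize."""
    accepted = GLOBAL_OPTIONS | LOADER_OPTIONS[loader]
    unknown = []
    for arg in argv:
        if not arg.startswith("-"):
            continue  # positional argument, e.g. the kernel path
        name = arg.split("=", 1)[0]  # strip "=VALUE" from "--opt=VALUE"
        if name not in accepted:
            unknown.append(name)
    return unknown
```

With an argument list shaped like the one the kdump init script passes, `unknown_options("elf-x86", argv)` comes back empty, while `unknown_options("bzImage", argv)` reports `--args-linux`, which matches the behavior described in this comment.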
Does the x86_64 use the same code? This problem also occurs on x86_64 rawhide.
Created attachment 144834 [details]
patched version of kexec-tools to account for extra arch args

So, it turns out this is happening because of an apparent kernel change, and kexec is at least trying to do the right thing. RHEL5 vmlinuz binaries are detected as ELF files: the first four bytes of the file are "\177ELF" (this applies to all kernels). All rawhide kernel vmlinuz images are bzImages and are not marked as ELF files. I'm not sure how or why this change was made, but regardless, it's causing a different load method (common to all x86 and x86_64 arches) to be used. This load method is unaware of the --args-linux argument and, as such, fails to operate properly. The attached RPM fixes that problem.

Unfortunately, on the x86 test system it still fails to load due to an inability to find sufficient memory to load the kernel. I'm concerned, however, that this may be due to the limited amount of memory on that system. So Will, could you please test this kexec-tools package on a rawhide system of yours with more memory available to (a) ensure the unrecognized argument problem is gone and (b) ensure that, given sufficient memory, the vmlinuz file can be loaded into memory. Thanks!
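For anyone who wants to check which kind of image a given vmlinuz is, a minimal sketch follows. It is not kexec-tools' probing code. The ELF magic is the "\177ELF" prefix mentioned in this comment. The bzImage check is an assumption on my part: it looks for the "HdrS" setup-header signature at offset 0x202 described in the Linux x86 boot protocol. The function names are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"        # "\177ELF" written with a hex escape
BZIMAGE_MAGIC = b"HdrS"       # x86 boot protocol setup-header signature
BZIMAGE_MAGIC_OFFSET = 0x202  # offset of that signature in the image


def classify_kernel_image(data):
    """Classify the leading bytes of a kernel image as elf, bzImage or unknown."""
    if data[:4] == ELF_MAGIC:
        return "elf"
    signature = data[BZIMAGE_MAGIC_OFFSET:BZIMAGE_MAGIC_OFFSET + 4]
    if signature == BZIMAGE_MAGIC:
        return "bzImage"
    return "unknown"


def classify_kernel_file(path):
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as image:
        return classify_kernel_image(image.read(BZIMAGE_MAGIC_OFFSET + 4))
```

Calling `classify_kernel_file` on a /boot/vmlinuz-* path should distinguish the two cases described here, assuming the image follows one of those two layouts.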
I tried the updated kexec-tools on the x86_64 machine. It failed in a different way than before. /etc/init.d/kdump calls kexec and gets farther than it did before. I looked at the command line that kexec is run with and got the following:

# /sbin/kexec --args-linux -p '--command-line=ro root=/dev/VolGroup00/LogVol00 rhgb quiet irqpoll maxcpus=1' --initrd=/boot/initrd-2.6.19-1.2904.fc7kdump.img /boot/vmlinuz-2.6.19-1.2904.fc7
Could not find a free area of memory of 9000 bytes...
locate_hole failed

So there is still a problem.
Yes, I told you there might be. What is your crashdump line set to on your x86_64 box, and how much total system memory does it have?
The machine has 1GB of memory. The /boot/grub/grub.conf has the following:

title Fedora Core (2.6.19-1.2904.fc7)
        root (hd0,1)
        kernel /vmlinuz-2.6.19-1.2904.fc7 ro root=/dev/VolGroup00/LogVol00 rhgb quiet crashkernel=128M@16M

$ free
             total       used       free     shared    buffers     cached
Mem:        883436     200496     682940          0      23788      87344
-/+ buffers/cache:      89364     794072
Swap:      2031608          0    2031608
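For reference, crashkernel=128M@16M asks the kernel to reserve 128 MiB starting at the 16 MiB mark. A small sketch of how that value breaks down is below. It handles only the simple size@offset form with optional K/M/G suffixes, which is an assumption (the kernel accepts other forms), and the helper names are hypothetical.

```python
import re

# Binary multipliers for the optional size suffixes.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_crashkernel(value):
    """Parse 'SIZE@OFFSET' (e.g. '128M@16M') into (size_bytes, offset_bytes)."""
    match = re.fullmatch(r"(\d+)([KMG]?)@(\d+)([KMG]?)", value)
    if match is None:
        raise ValueError("unsupported crashkernel value: %r" % value)
    size = int(match.group(1)) * UNITS[match.group(2)]
    offset = int(match.group(3)) * UNITS[match.group(4)]
    return size, offset


def find_crashkernel(cmdline):
    """Return (size, offset) for the crashkernel= token in cmdline, or None."""
    prefix = "crashkernel="
    for token in cmdline.split():
        if token.startswith(prefix):
            return parse_crashkernel(token[len(prefix):])
    return None
```

For the grub.conf line above, this gives a size of 134217728 bytes and an offset of 16777216 bytes, so the reserved region would end at the 144 MiB mark.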
Found the upstream patch that fixes the ability to load relocatable bzImages. Fixed in -56.fc7.
Happens to me on Rawhide on the i386/i686 kernels. How to fix it?
See comment #7. You need to run at least kexec-tools-1.101-56.fc7
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it. If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.) Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.
This bug has been in NEEDINFO for more than 30 days since feedback was first requested. As a result we are closing it. If you can reproduce this bug in the future against a maintained Fedora version please feel free to reopen it against that version. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp