Description of problem:
kexec can generate either 32-bit or 64-bit ELF core headers, selected with the command line options --elf32-core-headers and --elf64-core-headers. 32-bit ELF headers can't represent all of the physical memory in the system if it is a PAE system with more than 4G of RAM. In that case we need to generate 64-bit ELF headers. Otherwise we can generate 32-bit ELF headers, which have the added advantage that the resulting core can also be opened with gdb. We should probably determine at run time whether the system has more than 4G of RAM and PAE is enabled; if so, generate 64-bit headers, otherwise continue to generate 32-bit core headers.
Version-Release number of selected component (if applicable):
1.101-136.el5
How reproducible:
always
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
If the system has more than 4GB of RAM, I think ELF64 should be chosen regardless of the kernel being non-PAE, because in such case the size of /proc/vmcore is based on /proc/iomem and not on kernel-visible memory size. So please check /proc/iomem for the decision. (Or fix /proc/vmcore?)

    # uname -a
    Linux nec-em2.lab.boston.redhat.com 2.6.18-1.2747.el5 #1 SMP Thu Nov 9 18:55:30 EST 2006 i686 i686 i386 GNU/Linux
    # free
                 total       used       free     shared    buffers     cached
    Mem:       3237916    3034792     203124          0     132768    2721852
    -/+ buffers/cache:     180172    3057744
    Swap:      2040212        160    2040052
    # ls -l vmcore
    -r-------- 1 root root 6307251076 Nov 27 17:03 vmcore
    # cat /proc/iomem |grep RAM
    00000000-0009b3ff : System RAM
    000a0000-000bffff : Video RAM area
    00100000-cff6ffff : System RAM
    100000000-1afffffff : System RAM
Determining which physical memory areas are being used by the kernel is difficult; that's why kexec plays safe and generates the headers for all the physical memory visible in /proc/iomem. /proc/meminfo just gives the total of available memory. It leaves out reserved pages, so we never know which memory areas have been left out. That's why fixing kexec would be tough. In practice, I don't expect a customer to have more physical memory while running a kernel that does not use that memory. So I think looking at /proc/iomem and deciding whether to generate 32-bit or 64-bit headers might be the way forward. In this case it can be done in the kexec code itself instead of in the init scripts.
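The /proc/iomem tally discussed above can be sketched roughly as follows. This is only an illustration, not code from kexec or from any patch attached to this report: the function name total_system_ram_bytes and its optional file argument are hypothetical, and it assumes the "start-end : System RAM" layout with inclusive hexadecimal ranges shown in the output earlier in this report.

```shell
#!/bin/sh
# Hypothetical helper: add up every "System RAM" range in an iomem-style
# listing (default: /proc/iomem) and print the total size in bytes.
total_system_ram_bytes() {
    local file="${1:-/proc/iomem}"
    local total=0 range sep desc start end
    while read -r range sep desc; do
        # Skip anything that is not exactly a "System RAM" entry
        # (for example "Video RAM area").
        [ "$sep" = ":" ] && [ "$desc" = "System RAM" ] || continue
        start=${range%-*}
        end=${range#*-}
        # Ranges are inclusive, so the size is end - start + 1.
        total=$(( total + 0x$end - 0x$start + 1 ))
    done < "$file"
    echo "$total"
}
```

Calling total_system_ram_bytes with no argument reads /proc/iomem on the running system; passing a saved copy of the listing makes it easy to try the arithmetic on another machine's memory map.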
Created attachment 142516
patch to detect need for 32/64 bit core header on x86 systems

I agree that /proc/iomem is the way to make this determination, but I think that computing it from the initscript will be a little cleaner. We already have the --elf[32|64]-core-headers options in kexec, and if we make the determination independently by reading /proc/iomem from get_crash_memory_ranges() or someplace similar, I'm worried that we will cause confusion (e.g. does the command line option override the decision made by tallying /proc/iomem?). If we do this computation in the initscript and then inform kexec of our decision via the command line options, people can see exactly what's going on, and override it easily if they need to. The attached patch performs the computation in the initscript, and I think it looks reasonable. Please give it a spin and let me know what you all think. Thanks!
Doing it in the init scripts is also fine with me. A few queries about the attached patch.

+        /.*RAM.*/ {
+                start = strtonum($1);
+                end = strtonum($2);
+                segmentmem=end-start;
+                totalmem=totalmem+(segmentmem/(1024*1024));

Not all memory segments are necessarily on a 1MB boundary. Many in /proc/iomem are on a 1K boundary. The statement above will truncate these segments.

+        }
+        END {
+                printf "%d", (totalmem+1);
+        }'`
+
+        if [ $MEMSZ -ge 4000 ]

I think you meant 4096 here? I think we can total all the memory and then compare it with 4096*1024*1024.

+        then
+                return 1
+        fi
+        return 0
+}
+
 # Load the kdump kerel specified in /etc/sysconfig/kdump
 # If none is specified, try to load a kdump kernel with the same version
 # as the currently running kernel.
 function load_kdump()
 {
+
+        ARCH=`uname -m`
+        if [ "$ARCH" == "i686" -o "$ARCH" == "i386" ]
+        then
+                need_64bit_headers
+                if [ $? == 1 ]
+                then
+                        KEXEC_ARGS="$KEXEC_ARGS --elf64-core-headers"

What happens if the user has already provided the option --elf32-core-headers in the /etc/sysconfig/kdump file? I guess we should check for the presence of that option, and if the user has enforced a policy regarding ELF headers then we should respect that and not try to do a calculation of our own.
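To make the two arithmetic points above concrete, here is a small sketch. It is not the patch's code: this need_64bit_headers variant takes the byte total as an argument, which is an assumption made for illustration, and it keeps the patch's convention of returning 1 when 64-bit headers are needed along with its -ge comparison.

```shell
#!/bin/sh
# Hypothetical variant: $1 is the total System RAM in bytes.
# Returns 1 when the total reaches 4096*1024*1024 bytes, else 0.
need_64bit_headers() {
    local total_bytes="$1"
    local four_gib=$(( 4096 * 1024 * 1024 ))
    if [ "$total_bytes" -ge "$four_gib" ]; then
        return 1
    fi
    return 0
}

# A segment smaller than 1MB vanishes if each segment is divided
# down to whole megabytes with integer arithmetic before summing.
seg_bytes=$(( 0x0009b3ff - 0x00000000 ))      # first range from the listing
seg_mb=$(( seg_bytes / (1024 * 1024) ))       # integer division gives 0
```

Summing in bytes and comparing once at the end sidesteps both the rounding question and the 4000 versus 4096 confusion.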
Created attachment 142606
new patch to detect need for 32/64 bit core header on x86 systems

Here you go. I think this variant addresses all of your concerns, Vivek. Please take a look, and give it a spin. If you agree, I'll go ahead and check it in. Thanks
Hi Neil, a couple more queries.

+                need_64bit_headers
+                if [ $? == 1 ]
+                then
+                        FOUND_ELF_ARGS=`echo $KDUMP_COMMANDLINE $KDUMP_COMMANDLINE_APPEND | grep elf32-core-headers`
+                        if [ -z "$FOUND_ELF_ARGS" ]

Does -z mean zero? If yes, then are we not checking for the reverse condition? By now you must have found out how poor I am at scripting :-)

+                        then
+                                echo -n "Warning: elf32-core-headers overrides appropriate setting"
+                                warning
+                                echo
+                        fi
+                        KEXEC_ARGS="$KEXEC_ARGS --elf64-core-headers"
+                else
+                        KEXEC_ARGS="$KEXEC_ARGS --elf32-core-headers"

What happens if somebody has specified --elf64-core-headers in the conf file? I think we should ultimately honour what the user has specified in the conf file. At most we can warn if the user has specified --elf32-core-headers and the system requires --elf64-core-headers (because memory is more than 4G). A message something like:

"Warning. Kernel core dump will be truncated as system requires 64bit elf core headers to represent whole of the RAM. But configuration file forces 32bit elf core headers"
Created attachment 142615
another new patch to detect need for 32/64 bit core header on x86 systems

Good catch on the -z issue, Vivek. It should be -n, and I should have a similar clause in the 64-bit case. This new patch should honor what's in the config file over what we determine by computation, and will issue a warning in the event that --elf32-core-headers is specified while we have more than 4GB of RAM. I don't think we need to warn in the case that we generate 64-bit headers with <4GB of RAM, since those cores will still work fine with crash. Let me know what you think
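For readers unsure about the -z versus -n distinction raised above, a minimal sketch follows. The helper name user_forced_elf32 is hypothetical and does not appear in any of the attached patches; only the grep on elf32-core-headers mirrors the quoted code.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the argument string passed as $1
# mentions elf32-core-headers.
user_forced_elf32() {
    local found
    found=$(echo "$1" | grep elf32-core-headers)
    # -n is true when the string is non-empty (grep printed a match).
    # -z is true when the string is empty (grep printed nothing),
    # which is the reverse of the condition wanted here.
    [ -n "$found" ]
}
```

With -z in place of -n, the warning would fire exactly when the user had not forced 32-bit headers, which is the inversion pointed out in the review.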
Neil, it looks good to me.
Neil,

This will break non-i686 arch without KDUMP_COMMANDLINE specified.

-        if [ -z "$KDUMP_COMMANDLINE" ]; then
-                KDUMP_COMMANDLINE=`cat /proc/cmdline`
<snip>
+        if [ "$ARCH" == "i686" -o "$ARCH" == "i386" ]
+        then
+                if [ -z "$KDUMP_COMMANDLINE" ]
+                then
+                        KDUMP_COMMANDLINE=`cat /proc/cmdline`
+                fi
<snip>
+        KDUMP_COMMANDLINE=`echo $KDUMP_COMMANDLINE | sed -e 's/crashkernel=[0-9]\+M@[0-9]\+M//g'`
+        KDUMP_COMMANDLINE="${KDUMP_COMMANDLINE} ${KDUMP_COMMANDLINE_APPEND}"
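One way to avoid the breakage described above is to apply the /proc/cmdline default outside any architecture check. The sketch below is an assumption about how that could look, not the fix that was actually checked in: the function name build_kdump_commandline and its three parameters are hypothetical, and the sed pattern spells the quoted one with [0-9][0-9]* instead of [0-9]\+ for portability.

```shell
#!/bin/sh
# Hypothetical helper.
#   $1: configured KDUMP_COMMANDLINE (may be empty)
#   $2: KDUMP_COMMANDLINE_APPEND
#   $3: file to fall back on when $1 is empty (default: /proc/cmdline)
# Prints the command line for the kdump kernel.
build_kdump_commandline() {
    local cmdline="$1" append="$2" fallback="${3:-/proc/cmdline}"
    # Apply the default on every architecture, not only i386/i686.
    if [ -z "$cmdline" ]; then
        cmdline=$(cat "$fallback")
    fi
    # Drop any crashkernel=XM@YM reservation from the command line.
    cmdline=$(echo "$cmdline" | sed -e 's/crashkernel=[0-9][0-9]*M@[0-9][0-9]*M//g')
    echo "${cmdline} ${append}"
}
```

Keeping the default in one place means the x86-only header logic can be added or removed without touching how the command line is built.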
dang. you're right. Thanks for the catch, I'll fix that before I check it in.
fixed in -141.el5. Thanks all!
Neil,

I tried -141.el5 and found the following script is not correct:

+                        FOUND_ELF_ARGS=`echo $KDUMP_COMMANDLINE $KDUMP_COMMANDLINE_APPEND | grep elf32-core-headers`
+                        if [ -z "$FOUND_ELF_ARGS" ]

You need to grep $KEXEC_ARGS, not $KDUMP_COMMANDLINE{,_APPEND}.
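Putting the review comments together, the agreed behaviour (look in KEXEC_ARGS, honour whatever the user put there, warn only when 32-bit headers are forced on a system that needs 64-bit ones) might be sketched like this. It is an illustration only, not the code shipped in -142: the function name select_elf_header_args and its parameters are hypothetical, and it uses a case match where the patch used grep.

```shell
#!/bin/sh
# Hypothetical helper.
#   $1: current KEXEC_ARGS from the config file
#   $2: 1 if the RAM tally says 64-bit headers are needed, else 0
# Prints the resulting KEXEC_ARGS; warnings go to stderr.
select_elf_header_args() {
    local args="$1" need64="$2"
    case " $args " in
        *" --elf32-core-headers "*)
            # User forced 32-bit headers: honour that, but warn if unsafe.
            if [ "$need64" -eq 1 ]; then
                echo "Warning: core dump may be truncated; system needs 64-bit ELF core headers but configuration forces 32-bit" >&2
            fi
            echo "$args"
            ;;
        *" --elf64-core-headers "*)
            # User forced 64-bit headers: always usable, no warning.
            echo "$args"
            ;;
        *)
            # No user preference: append our own choice.
            if [ "$need64" -eq 1 ]; then
                echo "${args:+$args }--elf64-core-headers"
            else
                echo "${args:+$args }--elf32-core-headers"
            fi
            ;;
    esac
}
```

A caller in the init script would then do something like KEXEC_ARGS=$(select_elf_header_args "$KEXEC_ARGS" 1) after deciding that 64-bit headers are needed.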
you're right. It'll be fixed in -142. Thanks!
kexec-tools-1.101-164.el5 included in 20070208.0.