Bug 217780

Summary: Kexec-tools: Generate either 32 or 64bit elf core headers depending on RAM present
Product: Red Hat Enterprise Linux 5
Reporter: Vivek Goyal <vgoyal>
Component: kexec-tools
Assignee: Neil Horman <nhorman>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 5.0
CC: jnomura
Hardware: i386
OS: Linux
Fixed In Version: 5.0.0
Doc Type: Bug Fix
Last Closed: 2007-02-13 17:28:35 UTC
Attachments:
  patch to detect need for 32/64 bit core header on x86 systems (flags: none)
  new patch to detect need for 32/64 bit core header on x86 systems (flags: none)
  another new patch to detect need for 32/64 bit core header on x86 systems (flags: none)

Description Vivek Goyal 2006-11-29 21:32:48 UTC
Description of problem:

kexec can generate either 32bit or 64bit elf core headers, selected with the
command line options --elf32-core-headers and --elf64-core-headers. 32bit elf
headers can't represent all the physical memory present in the system if it is a
PAE system with more than 4G of RAM. In that case we need to generate
64bit elf headers.

Otherwise we can generate 32bit elf headers, which has the advantage
that the core can also be opened using gdb.

Probably we should determine at run time whether the system has more than 4G of
RAM and PAE is enabled. If so, generate 64bit headers; otherwise continue to
generate 32bit core headers.

Version-Release number of selected component (if applicable):

1.101-136.el5

How reproducible:

always

Comment 1 Jun'ichi Nomura (Red Hat) 2006-11-29 21:55:59 UTC
If the system has more than 4GB of RAM,
I think ELF64 should be chosen even if the kernel is non-PAE,
because in that case the size of /proc/vmcore is based on /proc/iomem
and not on the kernel-visible memory size.
So please check /proc/iomem for the decision.
(Or fix /proc/vmcore?)

# uname -a
Linux nec-em2.lab.boston.redhat.com 2.6.18-1.2747.el5 #1 SMP Thu Nov 9 18:55:30 EST 2006 i686 i686 i386 GNU/Linux
# free
             total       used       free     shared    buffers     cached
Mem:       3237916    3034792     203124          0     132768    2721852
-/+ buffers/cache:     180172    3057744
Swap:      2040212        160    2040052
# ls -l vmcore 
-r-------- 1 root root 6307251076 Nov 27 17:03 vmcore
# cat /proc/iomem |grep RAM
00000000-0009b3ff : System RAM
000a0000-000bffff : Video RAM area
00100000-cff6ffff : System RAM
100000000-1afffffff : System RAM
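
One possible reading of "check /proc/iomem for the decision" is to look for any
System RAM range that ends at or above the 4 GiB boundary, since 32bit elf
headers cannot hold such addresses. The following is a minimal sketch only,
assuming bash; the function name ram_above_4g is hypothetical and is not part
of kexec-tools or of any patch in this bug.

```shell
#!/bin/bash
# Hypothetical helper (not from kexec-tools): succeed if any "System RAM"
# range in an iomem-style file ends at or above 4 GiB (0x100000000).
ram_above_4g() {
    local file=${1:-/proc/iomem} range desc end
    while IFS=: read -r range desc; do
        case $desc in
            *"System RAM"*) ;;
            *) continue ;;
        esac
        # range looks like "00100000-cff6ffff "; keep only the end address
        end=${range##*-}
        end=${end//[[:space:]]/}
        # 16#... makes bash arithmetic read the value as hexadecimal
        if (( 16#$end >= 16#100000000 )); then
            return 0
        fi
    done < "$file"
    return 1
}
```

With the listing above, the last range ends at 1afffffff, so the function
would return success and 64bit headers would be selected.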


Comment 2 Vivek Goyal 2006-11-29 23:16:56 UTC
Determining which physical memory areas are being used by the kernel is
difficult. That's why kexec plays safe and generates headers for all the
physical memory visible in /proc/iomem.

/proc/meminfo just gives the total of available memory. It leaves out reserved
pages, so we never know which memory areas have been left out. That's why
fixing kexec would be tough.

In practice, I don't expect a customer to have more physical memory installed
while running a kernel which does not use that memory.

So I think looking at /proc/iomem and deciding whether to generate 32bit or
64bit headers might be the way forward. In this case it can be done in kexec
code itself instead of doing it in init scripts.

Comment 3 Neil Horman 2006-11-30 20:25:08 UTC
Created attachment 142516 [details]
patch to detect need for 32/64 bit core header on x86 systems

I agree that /proc/iomem is the way to make this determination, but I think
that computing it from the initscript will be a little cleaner.  We already
have the --elf[32|64]-core-headers options in kexec, and if we make the
determination independently by reading /proc/iomem from
get_crash_memory_ranges() or someplace similar, I'm worried that we will cause
confusion (e.g. does the command line option override the decision made by
tallying /proc/iomem?).  If we do this computation in the initscript and then
inform kexec of our decision via the command line options, people can see
exactly what's going on and override it easily if they need to.

The attached patch performs the computation in the initscript, and I think it
looks reasonable.  Please give it a spin and let me know what you all think.
Thanks!

Comment 4 Vivek Goyal 2006-11-30 20:50:12 UTC
Doing it in init scripts is also fine with me. A few queries about the attached
patch.

+	     /.*RAM.*/ {
+		start = strtonum($1);
+		end = strtonum($2);
+		segmentmem=end-start;
+		totalmem=totalmem+(segmentmem/(1024*1024));

Not all memory segments are necessarily aligned on a 1MB boundary. Many in
/proc/iomem are on 1K boundaries. The statement above will truncate these
segments.

+	     }
+	     END {
+		printf "%d", (totalmem+1);
+	     }'`
+
+	if [ $MEMSZ -ge 4000 ]

I think you meant 4096 here? I think we can total all the memory and then
compare with 4096*1024*1024.
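
The suggestion to total all memory first and compare once against
4096*1024*1024 could look roughly like the sketch below. It assumes bash rather
than the awk used in the patch, and the names total_ram_bytes and need_64bit
are hypothetical; this is not the code from any attachment.

```shell
#!/bin/bash
# Hypothetical sketch, not the attached patch: sum all "System RAM"
# ranges of an iomem-style file in bytes, with no per-segment rounding.
total_ram_bytes() {
    local file=${1:-/proc/iomem} range desc start end total=0
    while IFS=: read -r range desc; do
        [[ $desc == *"System RAM"* ]] || continue
        range=${range//[[:space:]]/}
        start=${range%%-*}
        end=${range##*-}
        # iomem ranges are inclusive, hence the +1
        total=$(( total + 16#$end - 16#$start + 1 ))
    done < "$file"
    echo "$total"
}

# Succeed when the byte total reaches 4 GiB.
need_64bit() {
    (( $(total_ram_bytes "$1") >= 4096 * 1024 * 1024 ))
}
```

Keeping the total in bytes means segments that start or end on 1K boundaries
are counted in full, and the threshold is applied only once at the end.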


+	then
+		return 1
+	fi
+	return 0
+}
+
 # Load the kdump kerel specified in /etc/sysconfig/kdump
 # If none is specified, try to load a kdump kernel with the same version
 # as the currently running kernel.
 function load_kdump()
 {
+
+	ARCH=`uname -m`
+	if [ "$ARCH" == "i686" -o "$ARCH" == "i386" ]
+	then
+		need_64bit_headers
+		if [ $? == 1 ]
+		then
+			KEXEC_ARGS="$KEXEC_ARGS --elf64-core-headers"

What happens if user has already provided an option --elf32-core-headers in
/etc/sysconfig/kdump file? I guess we should check for the presence of that
option and if user has enforced a policy regarding elf headers then we should
respect that and not try to do the calculation of our own.



Comment 5 Neil Horman 2006-12-01 19:06:18 UTC
Created attachment 142606 [details]
new patch to detect need for 32/64 bit core header on x86 systems

Here you go.  I think this variant addresses all of your concerns, Vivek.
Please take a look and give it a spin.  If you agree, I'll go ahead and check
it in.  Thanks

Comment 6 Vivek Goyal 2006-12-01 19:30:10 UTC
Hi Neil,

A couple more queries.

+               need_64bit_headers
+               if [ $? == 1 ]
+               then
+                       FOUND_ELF_ARGS=`echo $KDUMP_COMMANDLINE $KDUMP_COMMANDLINE_APPEND | grep elf32-core-headers`
+                       if [ -z "$FOUND_ELF_ARGS" ]

Does -z mean zero? If yes, then aren't we checking for the reverse condition? By
now you must have found out how poor I am at scripting :-)
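
For reference, [ -z STRING ] is true when the string is empty and
[ -n STRING ] is true when it is non-empty. A small demonstration follows, with
illustrative values only (they are not taken from a real /etc/sysconfig/kdump):

```shell
#!/bin/bash
# Illustrative values only.
args="--elf32-core-headers"

# grep prints the matching line, so $found is non-empty when the option
# is present. "|| true" keeps a no-match exit status from aborting
# scripts that run with "set -e".
found=$(echo "$args" | grep elf32-core-headers || true)
missing=$(echo "--other-flag" | grep elf32-core-headers || true)

# -n is true for a non-empty string: the option was found
[ -n "$found" ] && echo "option present"

# -z is true for an empty string: the option was not found
[ -z "$missing" ] && echo "option absent"
```

So a test written with -z takes the branch when grep found nothing, which is
the opposite of "the user specified the option".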


+                       then
+                               echo -n "Warning: elf32-core-headers overrides appropriate setting"
+                               warning
+                               echo
+                       fi
+                       KEXEC_ARGS="$KEXEC_ARGS --elf64-core-headers"
+               else
+                       KEXEC_ARGS="$KEXEC_ARGS --elf32-core-headers"

What happens if somebody has specified --elf64-core-headers in conf file?

I think we should ultimately honour what the user has specified in the conf
file. At most we can warn if the user has specified --elf32-core-headers and the
system requires --elf64-core-headers (because memory is more than 4G).

A message something like:

"Warning. Kernel core dump will be truncated as system requires 64bit elf core
headers to represent whole of the RAM. But configuration file forces 32bit elf
core headers"

Comment 7 Neil Horman 2006-12-01 20:15:28 UTC
Created attachment 142615 [details]
another new patch to detect need for 32/64 bit core header on x86 systems

Good catch on the -z issue, Vivek.  It should be -n, and I should have a
similar clause in the 64 bit case.  This new patch should honor what's in the
config file over what we determine by computation, and will issue a warning in
the event that --elf32-core-headers is specified and we have more than 4GB of
RAM.  I don't think we need to warn in the case that we generate 64 bit headers
on <4GB RAM, since those cores will still work fine with crash.  Let me know
what you think.

Comment 8 Vivek Goyal 2006-12-01 20:59:32 UTC
Neil, it looks good to me.

Comment 9 Jun'ichi NOMURA 2006-12-01 21:40:42 UTC
Neil,
This will break non-i686 arch without KDUMP_COMMANDLINE specified.

-	if [ -z "$KDUMP_COMMANDLINE" ]; then
-		KDUMP_COMMANDLINE=`cat /proc/cmdline`
<snip>
+	if [ "$ARCH" == "i686" -o "$ARCH" == "i386" ]
+	then
+		if [ -z "$KDUMP_COMMANDLINE" ]
+		then
+			KDUMP_COMMANDLINE=`cat /proc/cmdline`
+		fi
<snip>
+	KDUMP_COMMANDLINE=`echo $KDUMP_COMMANDLINE | sed -e 's/crashkernel=[0-9]\+M@[0-9]\+M//g'`
+	KDUMP_COMMANDLINE="${KDUMP_COMMANDLINE} ${KDUMP_COMMANDLINE_APPEND}"
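
The problem is that the default from /proc/cmdline is now set only inside the
i686/i386 branch, so other architectures reach the sed line with an empty
KDUMP_COMMANDLINE. A minimal sketch of one way to keep the default
architecture-independent is shown below. The function build_kdump_cmdline and
its third parameter are hypothetical and exist only to make the sketch
testable; this is not the committed fix.

```shell
#!/bin/bash
# Hypothetical restructuring, not the committed fix: default the command
# line before any architecture-specific logic runs.
build_kdump_cmdline() {
    local cmdline=$1 append=$2 proc_cmdline=${3:-/proc/cmdline}
    # Fall back to the running kernel's command line on every architecture
    if [ -z "$cmdline" ]; then
        cmdline=$(cat "$proc_cmdline")
    fi
    # Drop the crashkernel=XM@YM reservation, as in the quoted sed line
    # (\+ is a GNU sed extension, as used in the original script)
    cmdline=$(echo "$cmdline" | sed -e 's/crashkernel=[0-9]\+M@[0-9]\+M//g')
    echo "${cmdline} ${append}"
}
```

It might be called as
KDUMP_COMMANDLINE=$(build_kdump_cmdline "$KDUMP_COMMANDLINE" "$KDUMP_COMMANDLINE_APPEND")
before the architecture check.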

Comment 10 Neil Horman 2006-12-02 02:20:19 UTC
dang. you're right.  Thanks for the catch, I'll fix that before I check it in.

Comment 11 Neil Horman 2006-12-04 14:54:19 UTC
fixed in -141.el5.  Thanks all!

Comment 12 Jun'ichi NOMURA 2006-12-04 19:28:55 UTC
Neil,

I tried -141.el5 and found the following script is not correct:

+                       FOUND_ELF_ARGS=`echo $KDUMP_COMMANDLINE $KDUMP_COMMANDLINE_APPEND | grep elf32-core-headers`
+                       if [ -z "$FOUND_ELF_ARGS" ]

You need to grep $KEXEC_ARGS, not $KDUMP_COMMANDLINE{,_APPEND}.
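
Putting the conclusions of this thread together (look at KEXEC_ARGS, honour
whatever the user configured, warn only when 32bit headers are forced on a
system that needs 64bit ones), the selection logic might be sketched as
follows. The function choose_elf_headers and its need64 argument are
hypothetical, and this is not the code that shipped in kexec-tools.

```shell
#!/bin/bash
# Hypothetical sketch of the policy agreed in this thread; not the
# shipped initscript. need64 is 1 when RAM requires 64bit headers.
choose_elf_headers() {
    local kexec_args=$1 need64=$2
    case " $kexec_args " in
        *" --elf32-core-headers "*)
            # User forced 32bit headers: honour it, but warn if unsafe
            if [ "$need64" = 1 ]; then
                echo "Warning: configuration forces 32bit elf core headers; core dump will be truncated" >&2
            fi
            printf '%s\n' "$kexec_args" ;;
        *" --elf64-core-headers "*)
            # User forced 64bit headers: always safe, no warning needed
            printf '%s\n' "$kexec_args" ;;
        *)
            # No user preference: pick based on the computed requirement
            if [ "$need64" = 1 ]; then
                printf '%s\n' "${kexec_args:+$kexec_args }--elf64-core-headers"
            else
                printf '%s\n' "${kexec_args:+$kexec_args }--elf32-core-headers"
            fi ;;
    esac
}
```

For example, KEXEC_ARGS=$(choose_elf_headers "$KEXEC_ARGS" 1) leaves an
explicit --elf32-core-headers in place and prints the warning on stderr.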


Comment 13 Neil Horman 2006-12-04 19:49:57 UTC
you're right.  It'll be fixed in -142.  Thanks!


Comment 14 Jay Turner 2007-02-13 17:28:35 UTC
kexec-tools-1.101-164.el5 included in 20070208.0.