Bug 607400 - UV support: kexec command: extend for large cpu count and memory
Summary: UV support: kexec command: extend for large cpu count and memory
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kexec-tools   
Version: 6.0
Hardware: All
OS: Linux
Target Milestone: rc
Target Release: 6.1
Assignee: Cong Wang
QA Contact: Chao Ye
Depends On: 619426 650298
Blocks: 580566 645474
Reported: 2010-06-24 02:14 UTC by George Beshers
Modified: 2013-09-30 02:18 UTC (History)

Fixed In Version: kexec-tools-2_0_0-172_el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2011-05-19 14:15:15 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
Tar bz2 file of patches and a series file. (3.71 KB, text/plain)
2011-03-08 21:18 UTC, George Beshers

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0736 normal SHIPPED_LIVE kexec-tools bug fix update 2011-05-18 18:09:18 UTC

Description George Beshers 2010-06-24 02:14:10 UTC

I haven't checked whether these fixes are already in the package;
they went upstream very recently, so they might be.


Description of problem:
A couple of fixes are needed to the kexec command to make dumps work on UV.

First, the MAX_MEMORY_RANGES limit of 64 is too small for a very large NUMA
machine (a 512-processor SGI UV, for example).
Second, a temporary workaround (hack) in load_crashdump_segments() assumes
that 16K is sufficient for the crashdump ELF header. That is too small for a
machine with a large CPU count, because a PT_NOTE is created in the ELF header
for each CPU.
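
To make the size problem concrete, here is a rough estimate of how the ELF core header grows. This is only a sketch: the helper names are hypothetical, it counts just one PT_NOTE program header per CPU and one PT_LOAD per memory range, and it ignores any extra headers and alignment padding that kexec adds, so real headers are somewhat larger.

```c
#include <stddef.h>

/* Sizes fixed by the ELF64 format. */
#define ELF64_EHDR_SIZE 64u   /* sizeof(Elf64_Ehdr) */
#define ELF64_PHDR_SIZE 56u   /* sizeof(Elf64_Phdr) */

/* The fixed size the old workaround assumed was always enough. */
#define OLD_ELFCOREHDR_SIZE (16u * 1024u)

/*
 * Hypothetical helper (not part of kexec-tools): lower bound on the
 * ELF core header size, assuming one PT_NOTE per CPU and one PT_LOAD
 * per crash memory range.
 */
static size_t elfcorehdr_min_size(size_t nr_cpus, size_t nr_ranges)
{
	return ELF64_EHDR_SIZE + ELF64_PHDR_SIZE * (nr_cpus + nr_ranges);
}

/* Returns 1 if the estimated header still fits in the old 16K buffer. */
static int fits_in_16k(size_t nr_cpus, size_t nr_ranges)
{
	return elfcorehdr_min_size(nr_cpus, nr_ranges) <= OLD_ELFCOREHDR_SIZE;
}
```

Under these assumptions, 512 CPUs and 64 memory ranges need at least 64 + 56 * 576 = 32320 bytes, about twice the 16K the workaround allowed.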

The first patch looks like this:

Index: kexec-tools-2.0.1/kexec/arch/i386/kexec-x86.h
--- kexec-tools-2.0.1.orig/kexec/arch/i386/kexec-x86.h
+++ kexec-tools-2.0.1/kexec/arch/i386/kexec-x86.h
@@ -1,7 +1,7 @@
 #ifndef KEXEC_X86_H
 #define KEXEC_X86_H

-#define MAX_MEMORY_RANGES 64
+#define MAX_MEMORY_RANGES 1024

 enum coretype {
        CORE_TYPE_UNDEF = 0,
Index: kexec-tools-2.0.1/kexec/arch/x86_64/crashdump-x86_64.c
--- kexec-tools-2.0.1.orig/kexec/arch/x86_64/crashdump-x86_64.c
+++ kexec-tools-2.0.1/kexec/arch/x86_64/crashdump-x86_64.c
@@ -268,6 +268,9 @@ static int exclude_region(int *nr_ranges
        int i, j, tidx = -1;
        struct memory_range temp_region;
+       temp_region.start = 0;
+       temp_region.end = 0;
+       temp_region.type = 0;

        for (i = 0; i < (*nr_ranges); i++) {
                unsigned long long mstart, mend;
@@ -403,6 +406,7 @@ static int delete_memmap(struct memory_r
                                memmap_p[i].end = addr - 1;
                                temp_region.start = addr + size;
                                temp_region.end = mend;
+                               temp_region.type = memmap_p[i].type;
                                operation = 1;
                                tidx = i;
@@ -580,7 +584,7 @@ int load_crashdump_segments(struct kexec
                                unsigned long max_addr, unsigned long min_base)
        void *tmp;
-       unsigned long sz, elfcorehdr;
+       unsigned long sz, bufsz, memsz, elfcorehdr;
        int nr_ranges, align = 1024, i;
        struct memory_range *mem_range, *memmap_p;

@@ -613,9 +617,10 @@ int load_crashdump_segments(struct kexec
        /* Create elf header segment and store crash image data. */
        if (crash_create_elf64_headers(info, &elf_info,
                                       crash_memory_range, nr_ranges,
-                                      &tmp, &sz,
+                                      &tmp, &bufsz,
                                       ELF_CORE_HEADER_ALIGN) < 0)
                return -1;
+       /* the size of the elf headers allocated is returned in 'bufsz' */

        /* Hack: With some ld versions (GNU ld version 20030523),
         * vmlinux program headers show a gap of two pages between bss segment
@@ -624,9 +629,15 @@ int load_crashdump_segments(struct kexec
         * elf core header segment to 16K to avoid being placed in such gaps.
         * This is a makeshift solution until it is fixed in kernel.
-       elfcorehdr = add_buffer(info, tmp, sz, 16*1024, align, min_base,
+       if (bufsz < (16*1024))
+               /* bufsize is big enough for all the PT_NOTE's and PT_LOAD's */
+               memsz = 16*1024;
+               /* memsz will be the size of the memory hole we look for */
+       else
+               memsz = bufsz;
+       elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
                                                        max_addr, -1);
-       if (delete_memmap(memmap_p, elfcorehdr, sz) < 0)
+       if (delete_memmap(memmap_p, elfcorehdr, memsz) < 0)
                return -1;
        cmdline_add_memmap(mod_cmdline, memmap_p);
        cmdline_add_elfcorehdr(mod_cmdline, elfcorehdr);
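
Read in isolation, the sizing rule in the last hunk above keeps the old 16K minimum for the memory hole but lets it grow when the generated headers are larger. A minimal sketch, using a hypothetical helper name that does not exist in kexec-tools:

```c
#include <stddef.h>

/* Minimum hole size kept from the original workaround. */
#define ELFCOREHDR_MIN_MEMSZ (16u * 1024u)

/*
 * Hypothetical helper mirroring the if/else in the patch: bufsz is the
 * size of the generated ELF headers, and the return value is the size
 * of the memory hole to reserve (memsz in the patch).
 */
static unsigned long elfcorehdr_memsz(unsigned long bufsz)
{
	return bufsz < ELFCOREHDR_MIN_MEMSZ ? ELFCOREHDR_MIN_MEMSZ : bufsz;
}
```

The same memsz value is then passed to delete_memmap(), so the region removed from the memory map matches the hole that was actually reserved.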

The second patch prevents some rather verbose kexec grumbling:

Index: kexec-tools/kexec/firmware_memmap.c
--- kexec-tools.orig/kexec/firmware_memmap.c
+++ kexec-tools/kexec/firmware_memmap.c
@@ -161,6 +161,8 @@ static int parse_memmap_entry(const char
                range->type = RANGE_RAM;
        else if (strcmp(type, "ACPI Tables") == 0)
                range->type = RANGE_ACPI;
+       else if (strcmp(type, "Unusable memory") == 0)
+               range->type = RANGE_RESERVED;
        else if (strcmp(type, "reserved") == 0)
                range->type = RANGE_RESERVED;
        else if (strcmp(type, "Unusable memory") == 0)

Both have been applied upstream.
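
For illustration, the effect of the second patch can be sketched as a small string-to-type mapping. The enum and function names here are hypothetical stand-ins, not the kexec-tools definitions; only the type strings and the "Unusable memory" to reserved mapping come from the patch above.

```c
#include <string.h>

/* Hypothetical stand-ins for the kexec RANGE_* type constants. */
enum sketch_range_type {
	SKETCH_RANGE_RAM,
	SKETCH_RANGE_ACPI,
	SKETCH_RANGE_RESERVED
};

/*
 * Map a firmware memmap type string to a range type.
 * Returns 0 on success, or -1 for an unrecognized string, which is
 * the case that would make the caller complain.
 */
static int sketch_parse_type(const char *type, enum sketch_range_type *out)
{
	if (strcmp(type, "System RAM") == 0)
		*out = SKETCH_RANGE_RAM;
	else if (strcmp(type, "ACPI Tables") == 0)
		*out = SKETCH_RANGE_ACPI;
	else if (strcmp(type, "Unusable memory") == 0)
		*out = SKETCH_RANGE_RESERVED;	/* the case the patch adds */
	else if (strcmp(type, "reserved") == 0)
		*out = SKETCH_RANGE_RESERVED;
	else
		return -1;
	return 0;
}
```

With the added branch, an "Unusable memory" entry is treated like any other reserved range instead of falling through to the unknown-type path.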

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Actual results:

Expected results:

Additional info:

Comment 2 RHEL Product and Program Management 2010-06-24 02:32:51 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for

Comment 4 Marizol Martinez 2010-07-08 15:31:02 UTC
George -- Per your comment in the description, please verify and update this BZ accordingly. Thanks!

Comment 5 Cong Wang 2010-07-09 08:05:24 UTC
George, please either provide the patch as an attachment or give me the upstream commit IDs. Please don't inline the patch in the BZ; it is unusable that way.

Also, have you tested it?

Comment 6 George Beshers 2010-07-20 16:30:53 UTC

Sorry, our internal bug system doesn't have the attachment capability
and I did a cut-and-paste.

We found another problem in the kernel with kdump.
I am planning on testing this on a 5 TB system tomorrow (7/21).


Comment 8 George Beshers 2010-07-29 13:42:57 UTC

I ran across a couple of completely different bugs while testing this.

Also, we are making a large (1024-core) system available
to Red Hat on Tuesdays.  It did not happen this last Tuesday
because of a problem booting the system.


Comment 11 Cong Wang 2010-08-09 04:32:30 UTC
commit 4b4b2a533e218e287ab4aed25678434ad938309e
Author: Cliff Wickman <cpw@sgi.com>
Date:   Wed Jun 16 08:36:09 2010 -0500

    kexec: extend for large cpu count and memory
commit 26ed909df48ea3db3f7395713a9c68c94d091032
Author: Cliff Wickman <cpw@sgi.com>
Date:   Thu Jun 17 11:37:06 2010 -0500

    kexec: Unusable memory range type

Are the above two commits all that we need? It seems I am still missing some other commit.

Comment 12 George Beshers 2010-08-09 16:31:48 UTC
Hi Amerigo,

I believe that those are the only two patches we need,
although we can't realistically dump a full 5 TB when
actually taking a dump.  Our suggestion is to set the debug
level to 31, which should provide a great deal of useful
information if there is a problem in the field with RHEL 6.

In any case, SGI is making a large system available to Red Hat
from this evening until early Wednesday morning.  I am hoping
to find time in that period to test kdump.


Comment 13 Cong Wang 2010-08-10 10:27:58 UTC
George, Okay, we already use '-d 31' by default now.
I am waiting for your testing result. Thanks!
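
For readers unfamiliar with the option: assuming the '-d 31' above is makedumpfile's dump level, the value is a bitmask of page classes to leave out of the dump, as documented in makedumpfile(8). The sketch below decodes it; the macro and function names are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* Page classes that a makedumpfile dump level can exclude. */
#define DL_ZERO_PAGES    0x01	/* pages filled with zeros */
#define DL_CACHE         0x02	/* non-private cache pages */
#define DL_CACHE_PRIVATE 0x04	/* private cache pages */
#define DL_USER_DATA     0x08	/* user process data pages */
#define DL_FREE_PAGES    0x10	/* free pages */

/* Count how many page classes a given dump level excludes. */
static int excluded_classes(int dump_level)
{
	int count = 0;
	int bit;

	for (bit = DL_ZERO_PAGES; bit <= DL_FREE_PAGES; bit <<= 1)
		if (dump_level & bit)
			count++;
	return count;
}

/* Print the excluded classes, one per line. */
static void describe_dump_level(int dump_level, FILE *out)
{
	if (dump_level & DL_ZERO_PAGES)
		fprintf(out, "zero pages\n");
	if (dump_level & DL_CACHE)
		fprintf(out, "cache pages\n");
	if (dump_level & DL_CACHE_PRIVATE)
		fprintf(out, "private cache pages\n");
	if (dump_level & DL_USER_DATA)
		fprintf(out, "user data pages\n");
	if (dump_level & DL_FREE_PAGES)
		fprintf(out, "free pages\n");
}
```

Level 31 sets all five bits, so every excludable class is dropped, which is what keeps a dump of a multi-terabyte machine to a manageable size.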

Comment 14 Cong Wang 2010-08-11 10:16:41 UTC
I built a test package:

Comment 17 George Beshers 2010-08-18 14:17:37 UTC
The makedumpfile command worked with our modified kexec based on 2.0.1.
However, the modified kexec did not work.

I am currently on my third patch to try to fix the problem.


Comment 18 George Beshers 2010-08-19 18:38:47 UTC
To clarify the situation:

I asked another SGI engineer for help with this patch.
The patch does work, but against a later version of the
kexec-tools from upstream.

It was my mistake to pass the patch along without
personally testing it.  I worked this last weekend
to try to fix the patch.


Comment 19 Linda Wang 2010-08-20 02:28:41 UTC
Thank you for testing the package, and for the clarification.

So, which patch(es) from upstream kexec-tools are missing,
other than the two patches listed in comment #11 above?

Comment 20 George Beshers 2010-08-20 12:37:49 UTC
I have requested help from another SGI engineer with this
and will be careful to test the patched rpm on the 1024-core,
5 TB machine that we make available to Red Hat on a weekly basis.


Comment 21 Marizol Martinez 2010-08-20 13:13:53 UTC
George -- I believe Linda's question in comment #19 is still outstanding. Could you please update this BZ with the specific patches the upstream version has vs. RH's? Thanks!

Comment 22 George Beshers 2011-02-25 21:37:55 UTC

We finally found the problem with kexec-tools and the e820
table -- it manifested itself as a memory corruption in
the running kernel.

I am currently cleaning up the patchset -- the last patch
is upstream.


Comment 24 George Beshers 2011-03-08 21:18:23 UTC
Created attachment 483023 [details]
Tar bz2 file of patches and a series file.

Apart from a few comment cleanups, this is what was built


I have verified that this works on a number of UV systems.

The file is a bzip2 tar file of a quilt patches directory.


Comment 25 Cong Wang 2011-03-15 09:02:32 UTC
OK, I finally got the tarball. One question: are all these patches upstream?

I do appreciate that the attached patches are against the latest RHEL-6 kexec-tools; this should save me a lot of time handling conflicts. Anyway, I will try them to see if that is the case. :)


Comment 26 gbeshers 2011-03-15 17:15:21 UTC
Hi Amerigo,

I added Cliff Wickman to the CC list.  He indicated that
they all were and I found most of them.  A few had been
partially applied and IIRC one I was unsure about because
some of the code had been rewritten and moved.

Let me know when you are ready to test and I will grab
a big system.


Comment 28 Cong Wang 2011-03-16 12:23:22 UTC
Thanks, George.

I see some problems:

1. Not all commits match your patchset description, e.g. in kexec_segs_ranges,

Backport of commit 563ee341d950f2fae0ba6608d70c19eb647ff943
and commit 7b325f8528d230e50a0c3841a3ac587dea2200e2
just for crashdump-x86_64.c which doesn't exist upstream.

Neither of them matches that patch.

2. For 100823.kcore_header_patch, we probably need to backport my patch:

commit 1100580b05e3fdfe648d9be8617d962b11f4b88b
Author: Amerigo Wang <amwang@redhat.com>
Date:   Thu Mar 3 00:10:43 2011 +0800

    get the backup area dynamically

Anyway, I will build a kexec-tools package with all of your patches except 100823.kcore_header_patch, plus the backport of 1100580b05e3fdfe648d9be8617d962b11f4b88b for you to test.

Comment 29 Cong Wang 2011-03-16 13:04:00 UTC
George, please help to test this one:


Comment 30 Cong Wang 2011-03-16 13:20:50 UTC
Hmm, please use this one instead:

Comment 31 George Beshers 2011-03-18 19:52:07 UTC
Hi Amerigo,

Interestingly enough, the x86_64 rpm as built fails,
but if I rebuild the source rpm on the system I am testing
(I was trying to locate the problem), then it does work.

Possibly a problem with the Brew root?


Comment 32 Cong Wang 2011-03-21 05:47:37 UTC
Oh, maybe. I made the srpm locally and sent it to Brew to build. Anyway, I took all the patches. Please try


to see if this rpm is okay.


Comment 35 gbeshers 2011-03-23 13:35:15 UTC
Seems to be OK, but I haven't tested on
a 2-rack system yet.


Comment 36 errata-xmlrpc 2011-05-19 14:15:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

