Bug 607400 - UV support: kexec command: extend for large cpu count and memory
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kexec-tools
Version: 6.0
Hardware: All Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: 6.1
Assigned To: Cong Wang
QA Contact: Chao Ye
Depends On: 619426 650298
Blocks: 580566 645474
Reported: 2010-06-23 22:14 EDT by George Beshers
Modified: 2013-09-29 22:18 EDT (History)

See Also:
Fixed In Version: kexec-tools-2_0_0-172_el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 10:15:15 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Tar bz2 file of patches and a series file. (3.71 KB, text/plain)
2011-03-08 16:18 EST, George Beshers


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0736 normal SHIPPED_LIVE kexec-tools bug fix update 2011-05-18 14:09:18 EDT

Description George Beshers 2010-06-23 22:14:10 EDT
David,

I didn't actually check as these went upstream very recently,
so they might be in the package already.

George


Description of problem:
A couple fixes are needed to the kexec command to make dumps work on UV.

The MAX_MEMORY_RANGES of 64 is too small for a very large NUMA machine.
(A 512 processor SGI UV, for example.)
And fix a temporary workaround (hack) in load_crashdump_segments() that
assumes that 16k is sufficient for the size of the crashdump elf header.
This is too small for a machine with a large cpu count. A PT_NOTE is created
in the elf header for each cpu.

This first patch looks like this:

Index: kexec-tools-2.0.1/kexec/arch/i386/kexec-x86.h
===================================================================
--- kexec-tools-2.0.1.orig/kexec/arch/i386/kexec-x86.h
+++ kexec-tools-2.0.1/kexec/arch/i386/kexec-x86.h
@@ -1,7 +1,7 @@
 #ifndef KEXEC_X86_H
 #define KEXEC_X86_H

-#define MAX_MEMORY_RANGES 64
+#define MAX_MEMORY_RANGES 1024

 enum coretype {
         CORE_TYPE_UNDEF = 0,
Index: kexec-tools-2.0.1/kexec/arch/x86_64/crashdump-x86_64.c
===================================================================
--- kexec-tools-2.0.1.orig/kexec/arch/x86_64/crashdump-x86_64.c
+++ kexec-tools-2.0.1/kexec/arch/x86_64/crashdump-x86_64.c
@@ -268,6 +268,9 @@ static int exclude_region(int *nr_ranges
 {
        int i, j, tidx = -1;
        struct memory_range temp_region;
+       temp_region.start = 0;
+       temp_region.end = 0;
+       temp_region.type = 0;

        for (i = 0; i < (*nr_ranges); i++) {
                unsigned long long mstart, mend;
@@ -403,6 +406,7 @@ static int delete_memmap(struct memory_r
                                memmap_p[i].end = addr - 1;
                                temp_region.start = addr + size;
                                temp_region.end = mend;
+                               temp_region.type = memmap_p[i].type;
                                operation = 1;
                                tidx = i;
                                break;
@@ -580,7 +584,7 @@ int load_crashdump_segments(struct kexec
                                unsigned long max_addr, unsigned long min_base)
 {
        void *tmp;
-       unsigned long sz, elfcorehdr;
+       unsigned long sz, bufsz, memsz, elfcorehdr;
        int nr_ranges, align = 1024, i;
        struct memory_range *mem_range, *memmap_p;

@@ -613,9 +617,10 @@ int load_crashdump_segments(struct kexec
        /* Create elf header segment and store crash image data. */
        if (crash_create_elf64_headers(info, &elf_info,
                                       crash_memory_range, nr_ranges,
-                                      &tmp, &sz,
+                                      &tmp, &bufsz,
                                       ELF_CORE_HEADER_ALIGN) < 0)
                return -1;
+       /* the size of the elf headers allocated is returned in 'bufsz' */

        /* Hack: With some ld versions (GNU ld version 2.14.90.0.4 20030523),
         * vmlinux program headers show a gap of two pages between bss segment
@@ -624,9 +629,15 @@ int load_crashdump_segments(struct kexec
         * elf core header segment to 16K to avoid being placed in such gaps.
         * This is a makeshift solution until it is fixed in kernel.
         */
-       elfcorehdr = add_buffer(info, tmp, sz, 16*1024, align, min_base,
+       if (bufsz < (16*1024))
+               /* bufsize is big enough for all the PT_NOTE's and PT_LOAD's */
+               memsz = 16*1024;
+               /* memsz will be the size of the memory hole we look for */
+       else
+               memsz = bufsz;
+       elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
                                                        max_addr, -1);
-       if (delete_memmap(memmap_p, elfcorehdr, sz) < 0)
+       if (delete_memmap(memmap_p, elfcorehdr, memsz) < 0)
                return -1;
        cmdline_add_memmap(mod_cmdline, memmap_p);
        cmdline_add_elfcorehdr(mod_cmdline, elfcorehdr);


and the other to prevent some rather verbose kexec grumbling:

Index: kexec-tools/kexec/firmware_memmap.c
===================================================================
--- kexec-tools.orig/kexec/firmware_memmap.c
+++ kexec-tools/kexec/firmware_memmap.c
@@ -161,6 +161,8 @@ static int parse_memmap_entry(const char
                range->type = RANGE_RAM;
        else if (strcmp(type, "ACPI Tables") == 0)
                range->type = RANGE_ACPI;
+       else if (strcmp(type, "Unusable memory") == 0)
+               range->type = RANGE_RESERVED;
        else if (strcmp(type, "reserved") == 0)
                range->type = RANGE_RESERVED;
        else if (strcmp(type, "Unusable memory") == 0)

Both have been applied upstream.



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 2 RHEL Product and Program Management 2010-06-23 22:32:51 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 4 Marizol Martinez 2010-07-08 11:31:02 EDT
George -- Per your comment in the description, please verify and update this BZ accordingly. Thanks!
Comment 5 Cong Wang 2010-07-09 04:05:24 EDT
George, please either provide the patch as an attachment or give me the upstream commit IDs. Please don't inline the patch in the BZ; it is unusable that way.

Also, have you tested it?
Comment 6 George Beshers 2010-07-20 12:30:53 EDT
Amerigo,

Sorry, our internal bug system doesn't have the attachment capability
and I did a cut-and-paste.

We found another problem in the kernel with kdump.
I am planning on testing this on a 5 TB system tomorrow (7/21).

George
Comment 8 George Beshers 2010-07-29 09:42:57 EDT
Amerigo,

I ran across a couple of completely different bugs testing this.

Also, we are making a large (1024-core) system available
to Red Hat on Tuesdays.  It did not happen this last Tuesday
because of a problem booting the system.

George
Comment 11 Cong Wang 2010-08-09 00:32:30 EDT
commit 4b4b2a533e218e287ab4aed25678434ad938309e
Author: Cliff Wickman <cpw@sgi.com>
Date:   Wed Jun 16 08:36:09 2010 -0500

    kexec: extend for large cpu count and memory
    
-----------
commit 26ed909df48ea3db3f7395713a9c68c94d091032
Author: Cliff Wickman <cpw@sgi.com>
Date:   Thu Jun 17 11:37:06 2010 -0500

    kexec: Unusable memory range type
    
-----------

Are the above two commits all that we need? It seems I may still be missing some other commit.
Comment 12 George Beshers 2010-08-09 12:31:48 EDT
Hi Amerigo,

I believe that those are the only two patches we need,
although in practice we can't really dump a full
5 TB.  Our suggestion is to set the debug level to 31, which
should provide a great deal of useful information if there
is a problem in the field with RHEL 6.

In any case, SGI is making a large system available to Red Hat
from this evening until early Wednesday morning.  I am hoping to find
time in that period to test kdump.

George
Comment 13 Cong Wang 2010-08-10 06:27:58 EDT
George, Okay, we already use '-d 31' by default now.
I am waiting for your testing result. Thanks!
Comment 14 Cong Wang 2010-08-11 06:16:41 EDT
I built a test package:
https://brewweb.devel.redhat.com/taskinfo?taskID=2674836
Comment 17 George Beshers 2010-08-18 10:17:37 EDT
The makedumpfile command worked with our modified kexec based on 2.0.1.
However, the modified kexec did not work.

I am currently on my third patch to try to fix the problem.

George
Comment 18 George Beshers 2010-08-19 14:38:47 EDT
To clarify the situation:

I asked another SGI engineer for help with this patch.
The patch does work, but against a later version of the
kexec-tools from upstream.

It was my mistake to pass the patch along without
personally testing it.  I worked this last weekend
to try to fix the patch.

George
Comment 19 Linda Wang 2010-08-19 22:28:41 EDT
Thank you for testing the package, and for the clarification.

So, which patch(es) from upstream kexec-tools are missing,
other than the two patches listed in comment #11 above?
Comment 20 George Beshers 2010-08-20 08:37:49 EDT
I have requested help from another SGI engineer with this
and will be careful to test the patched rpm on the 1024-core,
5 TB machine that we make available to Red Hat on a weekly basis.

George
Comment 21 Marizol Martinez 2010-08-20 09:13:53 EDT
George -- I believe Linda's question in comment #19 is still outstanding. Could you please update this BZ with the specific patches the upstream version has vs. RH's? Thanks!
Comment 22 George Beshers 2011-02-25 16:37:55 EST

We finally found the problem with kexec-tools and the e820
table -- it manifested itself as a memory corruption in
the running kernel.

I am currently cleaning up the patchset -- the last patch
is upstream.

George
Comment 24 George Beshers 2011-03-08 16:18:23 EST
Created attachment 483023 [details]
Tar bz2 file of patches and a series file.

Apart from a few comment cleanups, this is what was built:

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=3164707

I have verified that this works on a number of UV systems.

The file is a bzip2 tar file of a quilt patches directory.

George
Comment 25 Cong Wang 2011-03-15 05:02:32 EDT
OK, I finally got the tarball. One question: are all of these patches upstream?

And I do appreciate that the attached patches are against the latest RHEL-6 kexec-tools; that would save me a lot of time resolving conflicts. Anyway, I will check whether that is the case. :)

Thanks.
Comment 26 gbeshers 2011-03-15 13:15:21 EDT
Hi Amerigo,

I added Cliff Wickman to the CC list.  He indicated that
they all were and I found most of them.  A few had been
partially applied and IIRC one I was unsure about because
some of the code had been rewritten and moved.

Let me know when you are ready to test and I will grab
a big system.

George
Comment 28 Cong Wang 2011-03-16 08:23:22 EDT
Thanks, George.

I see some problems:

1. Not all commits match the descriptions in your patchset, e.g. in kexec_segs_ranges,

Backport of commit 563ee341d950f2fae0ba6608d70c19eb647ff943
and commit 7b325f8528d230e50a0c3841a3ac587dea2200e2
just for crashdump-x86_64.c which doesn't exist upstream.

Neither of them matches that patch.

2. For 100823.kcore_header_patch, probably we need to backport my patch

commit 1100580b05e3fdfe648d9be8617d962b11f4b88b
Author: Amerigo Wang <amwang@redhat.com>
Date:   Thu Mar 3 00:10:43 2011 +0800

    get the backup area dynamically

Anyway, I will build a kexec-tools package with all of your patches except 100823.kcore_header_patch, plus the backport of 1100580b05e3fdfe648d9be8617d962b11f4b88b for you to test.
Comment 29 Cong Wang 2011-03-16 09:04:00 EDT
George, please help to test this one:
https://brewweb.devel.redhat.com/taskinfo?taskID=3181998

Thanks!
Comment 30 Cong Wang 2011-03-16 09:20:50 EDT
Hmm, please use this one instead:
https://brewweb.devel.redhat.com/taskinfo?taskID=3182054
Comment 31 George Beshers 2011-03-18 15:52:07 EDT
Hi Amerigo,

Interestingly enough, the x86_64 rpm from Brew fails,
but if I rebuild the source rpm on the system I am testing
(I was trying to locate the problem), then it does work.

Possibly a problem with the Brew root?

George
Comment 32 Cong Wang 2011-03-21 01:47:37 EDT
Oh, maybe. I made the srpm locally and sent it to Brew to build. Anyway, I have taken all the patches. Please try

https://brewweb.devel.redhat.com/buildinfo?buildID=159954

to see if this rpm is okay.

Thanks.
Comment 35 gbeshers 2011-03-23 09:35:15 EDT
Seems to be OK, but I haven't tested on
a 2 rack system yet.

George
Comment 36 errata-xmlrpc 2011-05-19 10:15:15 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0736.html
