| Summary: | [RHEL6.1] Kdump kernel panics on specific host running 2.6.32-131.0.10.el6 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | PaulB <pbunyan> |
| Component: | kernel | Assignee: | Vivek Goyal <vgoyal> |
| Status: | CLOSED DUPLICATE | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.1 | CC: | amwang, anderson, arozansk, dzickus, jburke, jstancek, lwoodman, nhorman, pbunyan |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-07-29 15:11:29 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
Description
PaulB
2011-04-29 19:04:56 UTC
Larry, does this sound like some sort of free pages list corruption in the kdump kernel?

Paul, is this bug reproducible?

> BUG: unable to handle kernel paging request at ffffea0000deddb0

The crash happens when the page structure at ffffea0000deddb0
is accessed by the list_del(&page->lru) call below:
    static inline
    struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
                                            int migratetype)
    {
            unsigned int current_order;
            struct free_area *area;
            struct page *page;

            /* Find a page of the appropriate size in the preferred list */
            for (current_order = order; current_order < MAX_ORDER; ++current_order) {
                    area = &(zone->free_area[current_order]);
                    if (list_empty(&area->free_list[migratetype]))
                            continue;

                    page = list_entry(area->free_list[migratetype].next,
                                            struct page, lru);
                    list_del(&page->lru);
                    rmv_page_order(page);
                    area->nr_free--;
                    expand(zone, page, order, current_order, area, migratetype);
                    return page;
            }

            return NULL;
    }
It's a vmemmap'd page structure address. The vmemmap region
starts at ffffea0000000000, so ffffea0000deddb0 would reference
the page at 0xdeddb0/sizeof(struct page):
crash> eval deddb0 / 56
hexadecimal: 3fad0
decimal: 260816
octal: 775320
binary: 0000000000000000000000000000000000000000000000111111101011010000
crash> eval 260816 * 4k
hexadecimal: 3fad0000 (1043264KB)
decimal: 1068302336
octal: 7753200000
binary: 0000000000000000000000000000000000111111101011010000000000000000
crash> eval 3fad0000 / 1m
hexadecimal: 3fa
decimal: 1018
octal: 1772
binary: 0000000000000000000000000000000000000000000000000000001111111010
crash>
And physical address 3fad0000 is just below the 1GB physical address boundary.
I'm not clear on what all the "memmap=" parameters are for, but
the one that reads memmap=261488K@33404K would seemingly imply
that it was crashkernel=256M@32M. So there shouldn't be any
pages up at the 1GB region AFAIK.
But I may be completely wrong, because I don't know what all of the
other "memmap=" parameters are for. Are they also regions that are used
by the second kernel?
Vivek, see the bottom of comment #1 for a reproducer. -pbunyan

(In reply to comment #5)
> And physical address 3fad0000 is almost at the 1GB physical address.
>
> I'm not clear on what all the "memmap=" parameters are for, but
> the one that reads memmap=261488K@33404K would seemingly imply
> that it was crashkernel=256M@32M. So there shouldn't be any
> pages up at the 1GB region AFAIK.
>
> But I may be completely wrong, because I don't know that all of the
> other "memmap=" are there for. Are they also regions that are used
> by the second kernel?

Dave, you are right that it looks like we used 256M@32M. It is interesting that we are trying to access a physical page at 1GB, which is not even present in the memmap passed to the second kernel.

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 00000000000a0000 (usable)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fe8cc00 (usable)
 BIOS-e820: 000000003fe8cc00 - 000000003fe8ec00 (ACPI NVS)
 BIOS-e820: 000000003fe8ec00 - 000000003fe90c00 (ACPI data)
 BIOS-e820: 000000003fe90c00 - 0000000040000000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
 BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
last_pfn = 0x3fe8c
max_arch_pfn = 0x400000000
user-defined physical RAM map:
 user: 0000000000000000 - 0000000000001000 (reserved)
 user: 0000000000001000 - 00000000000a0000 (usable)
 user: 00000000000f0000 - 0000000000100000 (reserved)
 user: 000000000209f000 - 0000000011ffb000 (usable)
 user: 000000003fe8cc00 - 000000003fe90c00 (ACPI data)
 user: 000000003fe90c00 - 0000000040000000 (reserved)
 user: 00000000e0000000 - 00000000f0000000 (reserved)
 user: 00000000fec00000 - 00000000fed00400 (reserved)
 user: 00000000fed20000 - 00000000feda0000 (reserved)
 user: 00000000fee00000 - 00000000fef00000 (reserved)
 user: 00000000ffb00000 - 0000000100000000 (reserved)

Notice that the BIOS memory map says there is physical memory at 1GB. We should have marked it as reserved (by kexec-tools) in the user-defined memory map, but that does not seem to be the case, which sounds a little bit fishy.

I see the following memmap entries which ask the second kernel to mark some ranges as reserved:

memmap=1469K$1047107K memmap=262144K$3670016K memmap=1025K$4173824K
memmap=512K$4174976K memmap=1024K$4175872K memmap=5120K$4189184K

So we have not asked the second kernel to mark that memory as reserved; that's why it is not marked as reserved. It might not necessarily be a bug: Neil mentioned that marking some ranges as reserved was introduced primarily because some ACPI data or other data was present there on some HP systems, and one would generally not put that data in regular RAM. So while it might be desirable to fix this for regular RAM as well, it is not necessarily a bug. The bigger question first is to figure out why we are putting a page from outside the reserved region on a free list in the second kernel.

> Dave, you are right that it looks like we used 256M@32M
I wonder if it's reproducible on the primary kernel by passing "mem=288M"
on the boot command line?
(In reply to comment #8)
> I wonder if it's reproducible on the primary kernel by passing "mem=288M"
> on the boot command line?

I booted the first kernel with mem=256M and it boots fine, so reserving 256MB does not seem to be an issue. I am trying to reproduce the issue, but it seems to fail in different ways. This time it failed because it can't find root in the second kernel.

Initalizing network drop monitor service
md: Waiting for all devices to be available before autodetect
md: If you don't use raid, use raid=noautodetect
md: Autodetecting RAID arrays.
md: Scanned 0 and added 0 devices.
md: autorun ...
md: ... autorun DONE.
VFS: Cannot open root device "mapper/vg_dellpesc142001-lv_root" or unknown-block(0,0)
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
Pid: 1, comm: swapper Not tainted 2.6.32-131.0.12.el6.i686 #1
Call Trace:
 [<c2821fde>] ? panic+0x42/0xf9
 [<c2a60dba>] ? mount_block_root+0x1ce/0x263
 [<c2a60ff4>] ? prepare_namespace+0x14b/0x191
 [<c25265ef>] ? sys_access+0x1f/0x30
 [<c2a6043b>] ? kernel_init+0x227/0x235
 [<c2a60214>] ? kernel_init+0x0/0x235
 [<c240a03f>] ? kernel_thread_helper+0x7/0x10

The original issue was reported against 2.6.32-131.0.10.el6.x86_64. I would recommend staying with the kernel and arch that this issue was reported against; by changing both the kernel and the arch you may be hitting a new issue that we have not seen yet.

(In reply to comment #7)
> Notice that BIOS memory map says that there is physical memory at 1GB. We
> should have marked it as reserved (by kexec-tools) in user defined memory
> map but it does not seem to be the case. So that sounds little bit fishy.

Yes, this sounds like a bug.

> I see following memmap entries which ask second kernel to mark some ranges as
> reserved.
>
> memmap=1469K$1047107K memmap=262144K$3670016K memmap=1025K$4173824K
> memmap=512K$4174976K memmap=1024K$4175872K memmap=5120K$4189184K
>
> So we have not asked second kernel to mark some memory as reserved. That's why
> it is not marked as reserved.

I would like to see some debugging information on that machine. Please enable DEBUG (-DDEBUG) and recompile the kexec-tools srpm; let's see what we get.

> The bigger question first would be to figure out why are we putting a page out
> of reserved region in free list in second kernel.

This is interesting. I think the kernel might consider this memory as RAM? Otherwise we don't have this bug, right?

(In reply to comment #12)
> This is interesting, I think the kernel might consider these memory as RAM?
> otherwise we don't have this bug, right?

Er, we use memmap=exactmap in the second kernel, so the kernel should only use what we specified via memmap=.

This bug seems to be a duplicate of bug 690301. There we also see some VM data structure corruption, with a trace:

 [<ffffffff814de462>] ? do_general_protection+0x152/0x160
 [<ffffffff814ddc35>] ? general_protection+0x25/0x30
 [<ffffffff81272f20>] ? list_del+0x10/0xa0
 [<ffffffff8111cc85>] ? __rmqueue+0xc5/0x490
 [<ffffffff8111eb08>] ? get_page_from_freelist+0x598/0x820

I am not closing it as a duplicate of that bz yet.

Yes, probably. I am still trying to get some debugging info on that machine.

Ok, I freshly installed this system and reproduced the issue the very first time I tried it. I had reserved 128MB of physical memory at the 32MB physical address, which means the second kernel should not have accessed any physical memory beyond 160MB. Uploading the dmesg.

Created attachment 500661 [details]
console logs of the crash of second kernel
Did another test where I enabled the "bootmem_debug" and "debug" kernel command line options in the kdump kernel. Looking at the bootmem debug output, it looks like the bootmem allocator released the right amount of memory. There are also two WARN() messages in __list_add() which tell us that some list is corrupted. I think the bootmem allocator did its job right; things got corrupted later (while freeing some slab/slub cache, etc.). Attaching the boot log.

Created attachment 500676 [details]
boot logs with "bootmem_debug" and "debug" kernel command line options
So to me it looks like the freelist somehow got corrupted. How we got there, no clue yet.

Vivek, that still doesn't address Dave's concern. Can you add "mminit_loglevel=4" too?