| Summary: | crash fails to analyse vmcore on mustang because makedumpfile filters/excludes required pages | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Emma Wu <xiawu> | |
| Component: | kexec-tools | Assignee: | Pratyush Anand <panand> | |
| Status: | CLOSED DUPLICATE | QA Contact: | Kernel General QE <kernel-general-qe> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 7.3 | CC: | anderson, panand, qzhao | |
| Target Milestone: | rc | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1369808 (view as bug list) | Environment: | ||
| Last Closed: | 2016-08-24 15:17:49 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1369808 | |||
|
Comment 2
Dave Anderson
2016-08-23 14:09:16 UTC
I reconfigured /etc/kdump.conf from -d31 to -d1, which will only exclude
zero-filled pages:
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:22:18]# grep core_collector /etc/kdump.conf | grep makedumpfile
core_collector makedumpfile -l --message-level 1 -d 1
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:22:18]#
The resultant vmcore was a much more reasonable size of 160MB:
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:22:18]# du -sh vmcore
160M vmcore
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:22:18]#
And the crash utility comes up with no problem:
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:22:18]# crash /usr/lib/debug/lib/modules/4.5.0-4.el7.aarch64/vmlinux vmcore
crash 7.1.5-1.el7
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/4.5.0-4.el7.aarch64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 8 [OFFLINE: 7]
DATE: Tue Aug 23 10:21:58 2016
UPTIME: 00:04:57
LOAD AVERAGE: 0.01, 0.07, 0.05
TASKS: 199
NODENAME: apm-mustang-ev3-06.lab.eng.rdu.redhat.com
RELEASE: 4.5.0-4.el7.aarch64
VERSION: #1 SMP Fri Aug 19 08:40:01 EDT 2016
MACHINE: aarch64 (unknown Mhz)
MEMORY: 8 GB
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 2427
COMMAND: "bash"
TASK: fffffe00be730e00 [THREAD_INFO: fffffe0155f38000]
CPU: 1
STATE: TASK_RUNNING (SYSRQ)
crash>
So there is a problem with makedumpfile's filtering of pages from the vmcore.
This time, with makedumpfile -d9, filtering zero-pages and user pages,
faulty filtering can be seen:
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:50:50]# crash -d1 vmcore | grep dump_level
dump_level: 9 (0x9) (DUMP_EXCLUDE_ZERO|DUMP_EXCLUDE_USER_DATA)
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:50:50]# du -sh vmcore
118M vmcore
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-14:50:50]# crash /usr/lib/debug/lib/modules/4.5.0-4.el7.aarch64/vmlinux vmcore
crash 7.1.5-1.el7
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
please wait... (gathering task table data)
crash: page excluded: kernel virtual address: fffffe015c0d0000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c110000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe00d41d8600 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d6800 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f3a400 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c714000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0155cb5900 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00bc393b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f33b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c394000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c477700 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00bc390000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00d41d9500 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f7700 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0db300 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015b8d1d00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f3d100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c47d100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d3b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00bca90000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0155cb0e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f3b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c6f0000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c474a00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156cb5900 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c118000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe00d41db300 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155cbd100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d8600 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156cb6800 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c71c000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0156a9c200 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c39c000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c479500 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d0e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0154598000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0155f36800 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155b94000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c470e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f9500 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0dd100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156cb0000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c471d00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d5900 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f39500 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156cbc200 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f0e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c710000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0155cb4a00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156cb3b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f35900 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c390000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015e118000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c476800 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f4a00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156cbd100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00bc397700 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562fd100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0da400 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f3c200 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d2c00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c6f0e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0def00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f32c00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c473b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c114000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0154594000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0155f34a00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156a9ef00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155cbb300 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f0000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00bf614000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c0d7700 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe00bca9c000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe00bc610000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c718000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe00bc396800 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c47ef00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c398000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0156a93b00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c478600 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562fa400 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c470000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155c10000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe00d41d0e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155b7c200 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562fb300 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0dc200 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155d18000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe01562f8600 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d4a00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f38600 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156a9b300 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f3ef00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0156a9e000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155b98000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe015c475900 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c11c000 type: "fill_thread_info"
crash: page excluded: kernel virtual address: fffffe0155f30e00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe01562f6800 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d9500 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f3b300 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0d1d00 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155f37700 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe0155b7d100 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c0de000 type: "fill_task_struct"
crash: page excluded: kernel virtual address: fffffe015c472c00 type: "fill_task_struct"
please wait... (determining panic task)
WARNING: active task fffffe015c0cff00 on cpu 7 not found in PID hash
crash: page excluded: kernel virtual address: fffffe015c0d0000 type: "fill_task_struct"
KERNEL: /usr/lib/debug/lib/modules/4.5.0-4.el7.aarch64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 8 [OFFLINE: 7]
DATE: Tue Aug 23 10:50:30 2016
UPTIME: 00:02:24
LOAD AVERAGE: 0.11, 0.11, 0.05
TASKS: 115
NODENAME: apm-mustang-ev3-06.lab.eng.rdu.redhat.com
RELEASE: 4.5.0-4.el7.aarch64
VERSION: #1 SMP Fri Aug 19 08:40:01 EDT 2016
MACHINE: aarch64 (unknown Mhz)
MEMORY: 8 GB
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 2434
COMMAND: "bash"
TASK: fffffe0155f2d200 [THREAD_INFO: fffffe015e144000]
CPU: 1
STATE: TASK_RUNNING (SYSRQ)
crash>
Pranand,
One last test, this time with makedumpfile -d7, filtering zero-pages,
private-cache and non-private-cache pages, crash comes up OK:
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-15:06:18]# crash -d1 vmcore | grep dump_level
dump_level: 7 (0x7) (DUMP_EXCLUDE_ZERO|DUMP_EXCLUDE_CACHE|DUMP_EXCLUDE_CACHE_PRI)
[root@apm-mustang-ev3-06 127.0.0.1-2016-08-23-15:06:18]# crash /usr/lib/debug/lib/modules/4.5.0-4.el7.aarch64/vmlinux vmcore
crash 7.1.5-1.el7
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/4.5.0-4.el7.aarch64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 8 [OFFLINE: 7]
DATE: Tue Aug 23 11:05:58 2016
UPTIME: 00:00:55
LOAD AVERAGE: 0.56, 0.18, 0.06
TASKS: 198
NODENAME: apm-mustang-ev3-06.lab.eng.rdu.redhat.com
RELEASE: 4.5.0-4.el7.aarch64
VERSION: #1 SMP Fri Aug 19 08:40:01 EDT 2016
MACHINE: aarch64 (unknown Mhz)
MEMORY: 8 GB
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 2437
COMMAND: "bash"
TASK: fffffe0155cc1e00 [THREAD_INFO: fffffe015b7f0000]
CPU: 1
STATE: TASK_RUNNING (SYSRQ)
crash>
So it looks to be a problem recognizing user-space pages.
I would guess that the problem has to do with the pages being
inadvertently marked as hugetlbfs pages here:
    /*
     * Exclude the data page of the user process.
     *  - anonymous pages
     *  - hugetlbfs pages
     */
    else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
        && (isAnon(mapping) || isHugetlb(compound_dtor))) {
        pfn_counter = &pfn_user;
    }
Note that the 4.5.0-4.el7 kernel does have these components in its
page structure:
crash> struct page
... [ cut ] ...
    struct {
        unsigned long compound_head;
        unsigned int compound_dtor;
        unsigned int compound_order;
    };
...
But unlike upstream kernels, which have this segment in their
crash_save_vmcoreinfo_init() function:
...
VMCOREINFO_OFFSET(page, _mapcount);
VMCOREINFO_OFFSET(page, private);
+ VMCOREINFO_OFFSET(page, compound_dtor);
+ VMCOREINFO_OFFSET(page, compound_order);
+ VMCOREINFO_OFFSET(page, compound_head);
VMCOREINFO_OFFSET(pglist_data, node_zones);
VMCOREINFO_OFFSET(pglist_data, nr_zones);
#ifdef CONFIG_FLAT_NODE_MEM_MAP
VMCOREINFO_OFFSET(pglist_data, node_mem_map);
#endif
...
The 4.5.0-4.el7 kernel does not save the compound_dtor,
compound_order and compound_head offsets:
...
VMCOREINFO_OFFSET(page, _mapcount);
VMCOREINFO_OFFSET(page, private);
VMCOREINFO_OFFSET(pglist_data, node_zones);
VMCOREINFO_OFFSET(pglist_data, nr_zones);
#ifdef CONFIG_FLAT_NODE_MEM_MAP
VMCOREINFO_OFFSET(pglist_data, node_mem_map);
#endif
...
So it seems likely that the isHugetlb(compound_dtor) check
in makedumpfile.c may be mistakenly returning TRUE.
(In reply to Dave Anderson from comment #7)
> I would guess that the problem has to do with the pages being
> inadvertently marked as hugetlbfs pages here:
Thanks for digging it out.
Yes, we need the following two commits in the RHELSA kernel:
d7f53518f713 kexec: export OFFSET(page.compound_head) to find out compound tail page
8639a847b0e1 kexec: update VMCOREINFO for compound_order/dtor
I will clone this bz for the kernel component and will send patches for that.
*** This bug has been marked as a duplicate of bug 1369808 ***
Hi Pratyush,
I think we don't need to CLOSE this bug as a duplicate; just changing the
component & sub component would also do the job. (I understand that
bz1369313 and bz1369808 are the same problem.)
However, it doesn't matter. This is just my personal thought.
--
Thanks,
Qiao
(In reply to Qiao Zhao from comment #10)
> Hi Pratyush,
>
> I think we don't need to CLOSE this bug as a duplicate; just changing the
> component & sub component would also do the job. (I understand that
> bz1369313 and bz1369808 are the same problem.)
> However, it doesn't matter. This is just my personal thought.
I thought it did not allow changing the component. But I noticed
"Click to list all components", and we can change that.
I agree with your view. I will take care in the future. |