Bug 1756206
Summary: | [Intel 8.2 Feature] Crystal Ridge - Sub-section memory hotplug support [kexec-tools part] | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Baoquan He <bhe> |
Component: | kexec-tools | Assignee: | Baoquan He <bhe> |
Status: | CLOSED ERRATA | QA Contact: | Emma Wu <xiawu> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 8.2 | CC: | anderson, ruyang, xiawu |
Target Milestone: | rc | Keywords: | FutureFeature |
Target Release: | 8.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | kexec-tools-2.0.20-7.el8 | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-04-28 16:43:23 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1724969, 1732733 |
Description
Baoquan He
2019-09-27 05:18:56 UTC
> Description of problem:
> kernel commit 326e1b8f83a4 ("mm/sparsemem: introduce a SECTION_IS_EARLY
> flag") added the flag to the mem_section->section_mem_map value, and it caused
> makedumpfile to fail with an error like the following:
>
> readmem: Can't convert a virtual address(fffffc97d1000000) to physical address.
> readmem: type_addr: 0, addr:fffffc97d1000000, size:32768
> __exclude_unnecessary_pages: Can't read the buffer of struct page.
> create_2nd_bitmap: Can't exclude unnecessary pages.
>
> Upstream makedumpfile has fixed it with commit:
> commit 7bdb468c2c99 ("[PATCH] Increase SECTION_MAP_LAST_BIT to 4")
>
> The above kernel commit 326e1b8f83a4 will be backported to the rhel8.2
> kernel, so the makedumpfile commit needs to be backported as well. Otherwise
> vmcore dumping will fail.

Your tests were run with crash-7.2.6-2.el8:

> crash 7.2.6-2.el8 ...
> crash> bt -F
> PID: 6544 TASK: c0000000fd0d4900 CPU: 5 COMMAND: "runtest.sh"
> #0 [c0000007f3a8f8b0] crash_kexec at c000000000251cd0
>     c0000007f3a8f8b0: c0000007f3a8f8f0 0000000000000000
>     c0000007f3a8f8c0: crash_kexec+128 c0000007f3a8fa50
>     c0000007f3a8f8d0: c0000007f3a8f8f0 0000000000000000
>     c0000007f3a8f8e0: 000000000000000b c0000007f3a8fa50
> #1 [c0000007f3a8f8f0] oops_end at c000000000029c78
>     c0000007f3a8f8f0: c0000007f3a8f970 0000000000000005
>     c0000007f3a8f900: oops_end+392 000000000000000b
>     c0000007f3a8f910: c0000007f3a8f970 0000000000000001
>     c0000007f3a8f920: bt: invalid kernel virtual address: ffffffff01fe0118 type: "page.slab"
>
> crash> kmem -s
> CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME
> kmem: page_to_nid: invalid page: f000000001c9bb00
> kmem: kmalloc-8k(593:restraintd.service): cannot gather relevant slab data
> c0000007f8591800 8192 ? ? ? 256k kmalloc-8k(593:restraintd.service)
> c0000007dd09de00 16384 0 0 0 512k kmalloc-16k(593:restraintd.service)
> kmem: page_to_nid: invalid page: f000000001c99580
> kmem: kmalloc-8(593:restraintd.service): cannot gather relevant slab data
> c0000007f8592400 8 ? ? ? 64k kmalloc-8(593:restraintd.service)
> kmem: page_to_nid: invalid page: f000000001f99d40
> kmem: pde_opener(593:restraintd.service): cannot gather relevant slab data
> c0000007f76c6600 40 ? ? ? 64k pde_opener(593:restraintd.service)
> kmem: page_to_nid: invalid page: f000000001bb1800
> kmem: kmalloc-4k(593:restraintd.service): cannot gather relevant slab data
> c0000007f8596000 4096 ? ? ? 128k kmalloc-4k(593:restraintd.service)
> kmem: page_to_nid: invalid page: f000000001fdaa40
> kmem: kmalloc-32(593:restraintd.service): cannot gather relevant slab data
> c0000007dd097200 32 ? ? ? 64k kmalloc-32(593:restraintd.service)
> kmem: page_to_nid: invalid page: f000000001efcc00
> kmem: kmalloc-rcl-128(593:restraintd.service): cannot gather relevant slab data
> ...

Support for the Linux 5.3-rc1 SECTION_IS_EARLY bit was addressed in crash-7.2.7 in this patch from Kazuhito Hagio, although the error symptoms he described were different:

commit e1df72964f8a583000e6cb74e54f8efbab6721ac
Author: Dave Anderson <anderson>
Date: Fri Jul 26 14:31:33 2019 -0400

Fix for the "kmem -n" option on Linux 5.3-rc1 and later kernels that contain commit 326e1b8f83a4318b09033ef754f40c785aed5e68, titled "mm/sparsemem: introduce a SECTION_IS_EARLY flag". Without the patch, mem_map addresses containing the flag in bit 3 incorrectly show it as part of the virtual address; with the patch, the option displays the new "E" state flag.
(k-hagio.nec.com)

Here is crash-7.2.7.el8 running the two commands above on the supplied vmcore from comment #3:

crash> bt -F
PID: 6544 TASK: c0000000fd0d4900 CPU: 5 COMMAND: "runtest.sh"
#0 [c0000007f3a8f8b0] crash_kexec at c000000000251cd0
    c0000007f3a8f8b0: [thread_stack(593:restraintd.service)] 0000000000000000
    c0000007f3a8f8c0: crash_kexec+128 [thread_stack(593:restraintd.service)]
    c0000007f3a8f8d0: [thread_stack(593:restraintd.service)] 0000000000000000
    c0000007f3a8f8e0: 000000000000000b [thread_stack(593:restraintd.service)]
#1 [c0000007f3a8f8f0] oops_end at c000000000029c78
    c0000007f3a8f8f0: [thread_stack(593:restraintd.service)] 0000000000000005
    c0000007f3a8f900: oops_end+392 000000000000000b
    c0000007f3a8f910: [thread_stack(593:restraintd.service)] 0000000000000001
    c0000007f3a8f920: [kmalloc-4k] 0000000000000f7f
    c0000007f3a8f930: [thread_stack(593:restraintd.service)] nvram_pstore_info+16
    c0000007f3a8f940: 0000000000000000 0000000000000000
    c0000007f3a8f950: 0000000000000000 0000000000000063
    c0000007f3a8f960: 000000000000000b [thread_stack(593:restraintd.service)]
#2 [c0000007f3a8f970] bad_page_fault at c00000000007bb6c
    c0000007f3a8f970: [thread_stack(593:restraintd.service)] 0000000000000007
    c0000007f3a8f980: bad_page_fault+268 0000000000000063
    c0000007f3a8f990: [thread_stack(593:restraintd.service)] sysrq_handle_crash+40
    c0000007f3a8f9a0: 0000000000000000 00126a8dfa540c7f
    c0000007f3a8f9b0: 0000000000000166 0000000000000007
    c0000007f3a8f9c0: 0000000000000007 0000000000000001
    c0000007f3a8f9d0: suppress_printk console_printk
#3 [c0000007f3a8f9e0] handle_page_fault at c00000000000a720
    c0000007f3a8f9e0: [thread_stack(593:restraintd.service)] c000000028222288
    c0000007f3a8f9f0: handle_page_fault+52 [selinux_inode_security]
    c0000007f3a8fa00: msg_print_text+216 00000001109ad558
    c0000007f3a8fa10: [thread_stack(593:restraintd.service)] [proc_dir_entry]
    c0000007f3a8fa20: 0000000000000117 000000000000000f
    c0000007f3a8fa30: [thread_stack(593:restraintd.service)] 0044b82fa09b5a53
    c0000007f3a8fa40: 7265677368657265 20c49ba5e353f7cf
    c0000007f3a8fa50: __handle_sysrq+228 [thread_stack(593:restraintd.service)]
    c0000007f3a8fa60: .TOC. 0000000000000063
    c0000007f3a8fa70: c0000007ffc0cf90 c0000007ffc94668
    c0000007f3a8fa80: 00126a8dfa53faec 0000000000000165
    c0000007f3a8fa90: 0000000000000007 0000000000000001
    c0000007f3a8faa0: 0000000000000000 0000000000000000
    c0000007f3a8fab0: sysrq_handle_crash c000000007fa8a00
    c0000007f3a8fac0: 0000000040000000 00000001109a9788
    c0000007f3a8fad0: 00000001109a9714 0000000110946638
    c0000007f3a8fae0: 00000001108def10 00000001109ad558
    c0000007f3a8faf0: 00000100185059d0 0000000000000001
    c0000007f3a8fb00: 0000000110959370 00007ffffc355324
    c0000007f3a8fb10: 00007ffffc355320 sysrq_crash_op
    c0000007f3a8fb20: 0000000000000000 0000000000000007
    c0000007f3a8fb30: 0000000000000000 0000000000000063
    c0000007f3a8fb40: suppress_printk console_printk
    c0000007f3a8fb50: sysrq_handle_crash+40 8000000000009033
    c0000007f3a8fb60: slb_miss_common+228 sysrq_handle_crash
    c0000007f3a8fb70: __handle_sysrq+228 000000000000000f
    c0000007f3a8fb80: 0000000028222282 0000000000000000
    c0000007f3a8fb90: 0000000000000300 0000000000000000
    c0000007f3a8fba0: 0000000042000000 0000000000000000
    c0000007f3a8fbb0: [thread_stack(593:restraintd.service)] console_sem
    c0000007f3a8fbc0: 0000000000000001 0000000000000000
    c0000007f3a8fbd0: [thread_stack(593:restraintd.service)] 000000000000000f
    c0000007f3a8fbe0: irq_work_queue+156 log_buf_len
    c0000007f3a8fbf0: [thread_stack(593:restraintd.service)] 0000000000002000
    c0000007f3a8fc00: vprintk_emit+416 000000000000002c
    c0000007f3a8fc10: 0000000110959370 00007ffffc355324
    c0000007f3a8fc20: 00007ffffc355320 sysrq_crash_op
    c0000007f3a8fc30: 0000000000000000 0000000000000007
    c0000007f3a8fc40: 0000000000000000 kallsyms_token_index+13840
    c0000007f3a8fc50: c0000000011ccfa0 c0000000011d0fa0
    c0000007f3a8fc60: [thread_stack(593:restraintd.service)] 0000000000002000
    c0000007f3a8fc70: vprintk_func+116 [thread_stack(593:restraintd.service)]
    c0000007f3a8fc80: [thread_stack(593:restraintd.service)] 0000000000000100
    c0000007f3a8fc90: [thread_stack(593:restraintd.service)] 0000000000000063
    c0000007f3a8fca0: suppress_printk console_printk
    c0000007f3a8fcb0: [thread_stack(593:restraintd.service)] [dentry(465:systemd-hostnamed.service)]
    c0000007f3a8fcc0: printk+64 [names_cache]
Data Access [300] exception frame:
R0: c00000000083be84 R1: c0000007f3a8fcd0 R2: c000000001717300
R3: 0000000000000063 R4: c0000007ffc0cf90 R5: c0000007ffc94668
R6: 00126a8dfa53faec R7: 0000000000000165 R8: 0000000000000007
R9: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000
R12: c00000000083aeb0 R13: c000000007fa8a00 R14: 0000000040000000
R15: 00000001109a9788 R16: 00000001109a9714 R17: 0000000110946638
R18: 00000001108def10 R19: 00000001109ad558 R20: 00000100185059d0
R21: 0000000000000001 R22: 0000000110959370 R23: 00007ffffc355324
R24: 00007ffffc355320 R25: c00000000161aad8 R26: 0000000000000000
R27: 0000000000000007 R28: 0000000000000000 R29: 0000000000000063
R30: c000000001752374 R31: c0000000015c4b18
NIP: c00000000083aed8 MSR: 8000000000009033 OR3: c000000000008934
CTR: c00000000083aeb0 LR: c00000000083be84 XER: 000000000000000f
CCR: 0000000028222282 MQ: 0000000000000000 DAR: 0000000000000000
DSISR: 0000000042000000 Syscall Result: 0000000000000000
[NIP : sysrq_handle_crash+40]
[LR : __handle_sysrq+228]
#4 [c0000007f3a8fcd0] sysrq_handle_crash at c00000000083aed8
    c0000007f3a8fcd0: [thread_stack(593:restraintd.service)] [dentry(97:sysroot.mount)]
    c0000007f3a8fce0: __handle_sysrq+200 .TOC.
    c0000007f3a8fcf0: [thread_stack(593:restraintd.service)] kallsyms_token_index+576416
    c0000007f3a8fd00: 000000000000000f 0000000053203a71
    c0000007f3a8fd10: textbuf.49030+2 00000007fea40000
    c0000007f3a8fd20: sysrq_handler+24 moom_work
    c0000007f3a8fd30: [thread_stack(593:restraintd.service)] 00000001109aaf84
    c0000007f3a8fd40: 0000000000000002 000001001848e8c0
    c0000007f3a8fd50: 00000000fb06c200 0000000000000002
    c0000007f3a8fd60: fffffffffffffffb 0000000000000002
#5 [c0000007f3a8fd70] write_sysrq_trigger at c00000000083c608
    c0000007f3a8fd70: [thread_stack(593:restraintd.service)] selinux_hooks+2360
    c0000007f3a8fd80: write_sysrq_trigger+104 0000000000000000
    c0000007f3a8fd90: [thread_stack(593:restraintd.service)] [proc_dir_entry]
#6 [c0000007f3a8fda0] proc_reg_write at c0000000005b3aa4
    c0000007f3a8fda0: [thread_stack(593:restraintd.service)] 000000000026e502
    c0000007f3a8fdb0: proc_reg_write+132 .TOC.
    c0000007f3a8fdc0: 0000000000000000 [cred_jar(593:restraintd.service)]
#7 [c0000007f3a8fdd0] sys_write at c0000000004d9738
    c0000007f3a8fdd0: [thread_stack(593:restraintd.service)] 00007fffaaf21858
    c0000007f3a8fde0: sys_write+296 .TOC.
    c0000007f3a8fdf0: 0000000000000000 00000100184ff060
    c0000007f3a8fe00: do_syscall_trace_enter+404 000001001848e8c0
    c0000007f3a8fe10: 0000000000000002 00007fffaaf21858
    c0000007f3a8fe20: 000001001848e8c0 0000000000000002
#8 [c0000007f3a8fe30] system_call at c00000000000b388
System Call [c00] exception frame:
R0: 0000000000000004 R1: 00007ffffc355100 R2: 00007fffaaf27300
R3: 0000000000000001 R4: 000001001848e8c0 R5: 0000000000000002
R6: 0000000000000010 R7: 00007fffaad83af4 R8: 0000000000000000
R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000
R12: 0000000000000000 R13: 00007fffab03acd0 R14: 0000000040000000
R15: 00000001109a9788 R16: 00000001109a9714 R17: 0000000110946638
R18: 00000001108def10 R19: 00000001109ad558 R20: 00000100185059d0
R21: 0000000000000001 R22: 0000000110959370 R23: 00007ffffc355324
R24: 00007ffffc355320 R25: 00000001109aaf84 R26: 0000000000000002
R27: 000001001848e8c0 R28: 0000000000000002 R29: 00007fffaaf21858
R30: 000001001848e8c0 R31: 0000000000000002
NIP: 00007fffaae380f4 MSR: 800000000000f033 OR3: 0000000000000001
CTR: 0000000000000000 LR: 00007fffaadb28e4 XER: 0000000000000000
CCR: 0000000048222282 MQ: 0000000000000000 DAR: 0000010018488540
DSISR: 000000000a000000 Syscall Result: 0000000000000000
crash>
crash> kmem -s
CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME
c0000007f8591800 8192 0 32 1 256k kmalloc-8k(593:restraintd.service)
c0000007dd09de00 16384 0 0 0 512k kmalloc-16k(593:restraintd.service)
c0000007f8592400 8 0 8192 1 64k kmalloc-8(593:restraintd.service)
c0000007f76c6600 40 0 8190 5 64k pde_opener(593:restraintd.service)
c0000007f8596000 4096 0 32 1 128k kmalloc-4k(593:restraintd.service)
c0000007dd097200 32 0 10240 5 64k kmalloc-32(593:restraintd.service)
c0000007eda8f900 128 20 4096 8 64k kmalloc-rcl-128(593:restraintd.service)
c0000007eda85d00 96 29 5456 8 64k kmalloc-rcl-96(593:restraintd.service)
c0000000ff2e9000 752 0 258 3 64k shmem_inode_cache(593:restraintd.service)
c0000007dac21b00 64 4357 12288 12 64k kmalloc-rcl-64(593:restraintd.service)
c0000007dd09ae00 65536 0 64 8 512k kmalloc-64k(593:restraintd.service)
c0000007dac2ed00 1112 8 448 8 64k signal_cache(593:restraintd.service)
c0000007dac20900 2088 3 240 8 64k sighand_cache(593:restraintd.service)
c0000007dac20c00 768 3 680 8 64k files_cache(593:restraintd.service)
c0000007dac23600 1024 1 512 8 64k kmalloc-1k(593:restraintd.service)
c0000007dac2bd00 192 1 2728 8 64k kmalloc-192(593:restraintd.service)
c0000007dd091800 752 0 0 0 64k shmem_inode_cache(586:crond.service)
c0000007dd09cf00 80 0 0 0 64k task_delay_info(586:crond.service)
c0000007eda8cf00 1664 0 312 8 64k UDPv6(593:restraintd.service)
c0000000ff2e6600 1408 0 368 8 64k UDP(593:restraintd.service)
c0000007f859b400 80 9 6552 8 64k task_delay_info(593:restraintd.service)
c0000007f859b700 16384 9 256 8 512k thread_stack(593:restraintd.service)
c0000007f8594200 6016 9 168 8 128k task_struct(593:restraintd.service)
c0000007f859c600 2392 0 52 2 64k TCP(593:restraintd.service)
c0000007f8596c00 2544 1 200 8 64k TCPv6(593:restraintd.service)
c0000007eda83f00 720 0 0 0 64k proc_inode_cache(586:crond.service)
c0000007dac28d00 576 3211 3808 34 64k radix_tree_node(593:restraintd.service)
... [ cut ] ...
c0000007fc01e400 256 1687 2816 11 64k kmalloc-256
c0000007fc01e700 192 1875 6479 19 64k kmalloc-192
c0000007fc01ea00 128 4316 7680 15 64k kmalloc-128
c0000007fc01ed00 96 3108 9548 14 64k kmalloc-96
c0000007fc01f000 64 52083 58368 57 64k kmalloc-64
c0000007fc01f300 32 438892 446464 218 64k kmalloc-32
c0000007fc01f600 16 114898 135168 33 64k kmalloc-16
c0000007fc01f900 8 382570 442368 54 64k kmalloc-8
c0000007fc01fc00 64 985 8192 8 64k kmem_cache_node
c0000007fc010000 744 985 1190 14 64k kmem_cache
crash>

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1783
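
For readers who want to see why the root cause named at the top of this report corrupts page addresses, the masking problem can be sketched in a few lines. This is an illustration only, not code from makedumpfile or crash. It assumes the mask is formed as ~(SECTION_MAP_LAST_BIT - 1), that SECTION_MAP_LAST_BIT was 1 << 3 before the fix and 1 << 4 after it (matching the upstream commit title "Increase SECTION_MAP_LAST_BIT to 4"), and it uses a made-up section_mem_map value. The helper names are hypothetical.

```python
# Illustrative sketch; constants and helper names are assumptions, not makedumpfile source.
MASK64 = (1 << 64) - 1

SECTION_IS_EARLY = 1 << 3  # the new flag lives in bit 3, per the crash commit message


def section_map_mask(last_bit_shift):
    """Return a 64-bit mask that clears all flag bits below 1 << last_bit_shift."""
    return ~((1 << last_bit_shift) - 1) & MASK64


def decode_mem_map(section_mem_map, last_bit_shift):
    """Strip the flag bits from an encoded section_mem_map value."""
    return section_mem_map & section_map_mask(last_bit_shift)


# Hypothetical encoded value: an aligned address with flag bits 0 through 3 set.
base = 0xFFFFEA0000000000
encoded = base | 0x7 | SECTION_IS_EARLY

old = decode_mem_map(encoded, 3)  # pre-fix mask: bit 3 survives
new = decode_mem_map(encoded, 4)  # post-fix mask: bit 3 is cleared

print(f"old mask result: {old:#018x}")
print(f"new mask result: {new:#018x}")
```

With the narrower mask the flag in bit 3 stays in the decoded value, so every struct page address derived from it is off by 8 bytes. That is consistent with the unreadable struct page buffers that makedumpfile reported and with the invalid page addresses in the crash-7.2.6 output.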