Bug 2214235
| Summary: | RHEL-8.6: [4.18.0-448 and early kernels] crashkernel can not reserve memory randomly on AWS aarch64 platform | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Pingfan Liu <piliu> |
| Component: | kernel | Assignee: | Pingfan Liu <piliu> |
| kernel sub component: | Kexec-kdump | QA Contact: | Jie Li <jieli> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | Sujata Kurup <skurup> |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | ruyang, skurup, yiyan |
| Version: | 8.6 | Keywords: | Tracking, Triaged |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Known Issue | |
| Doc Text: |
.Memory allocation for `kdump` fails on the 64-bit ARM architectures
On certain 64-bit ARM-based systems, the firmware uses a non-contiguous memory allocation method that reserves memory at scattered locations. Consequently, because no sufficiently large contiguous block of memory remains available, the crash kernel cannot reserve memory for the `kdump` mechanism.
To work around this problem, install the kernel version provided by RHEL 8.8 and later. The latest version of RHEL supports the `fallback` dump capture mechanism that helps to find a suitable memory region in the described scenario.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-07-13 06:28:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Pingfan Liu
2023-06-12 10:20:16 UTC
I am opening this bug on behalf of Juan Abia <jabia>, who can add more detail later.

Description of problem:
Sometimes when booting aarch64 images (in my particular case, validating CCSP AWS images), kdump is not operational. After investigation by Pingfan Liu, we realized this only happens if the image kernel version is lower than 4.18.0-449.el8.

Version-Release number of selected component (if applicable):
kexec-tools-2.0.20-69.el8_6.1.aarch64

How reproducible:
There is a low probability of hitting this bug.

Steps to Reproduce:
1. Boot an aarch64 image with a kernel version lower than 4.18.0-449
2. Run "kdumpctl status"

Actual results:
kdump: Kdump is not operational

Expected results:
kdump: Kdump is operational

Additional info:
journalctl kernel:
-- Logs begin at Tue 2023-05-30 07:44:43 UTC, end at Tue 2023-05-30 08:22:40 UTC. --
May 30 07:44:43 localhost kernel: Booting Linux on physical CPU 0x0000000000 [0x413fd0c1]
May 30 07:44:43 localhost kernel: Linux version 4.18.0-372.57.1.el8_6.aarch64 (mockbuild.eng.bos.redhat.com) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)) #1 SMP Thu May 11 07:27:41 EDT 2023
May 30 07:44:43 localhost kernel: efi: Getting EFI parameters from FDT:
May 30 07:44:43 localhost kernel: efi: EFI v2.70 by EDK II
May 30 07:44:43 localhost kernel: efi: SMBIOS=0x7bed0000 SMBIOS 3.0=0x7beb0000 ACPI=0x786e0000 ACPI 2.0=0x786e0014 MEMATTR=0x7a75c018 RNG=0x7bfdef98 MEMRESERVE=0x7857c698
May 30 07:44:43 localhost kernel: efi: seeding entropy pool
May 30 07:44:43 localhost kernel: Using crashkernel=auto, the size chosen is a best effort estimation.
May 30 07:44:43 localhost kernel: cannot allocate crashkernel (size:0x1c000000)
May 30 07:44:43 localhost kernel: ACPI: Early table checksum verification disabled
May 30 07:44:43 localhost kernel: ACPI: RSDP 0x00000000786E0014 000024 (v02 AMAZON)

This issue is limited to the aarch64 platform. The root cause appears to be that the firmware allocates and occupies memory at random, scattered locations, so no contiguous memory chunk large enough for the crashkernel reservation is left below 4GB. Starting with kernel version 4.18.0-449.el8, the crashkernel reservation supports a fallback mode and can find a suitable region that crosses or lies above the 4GB boundary. Users who hit this issue should update to kernel 4.18.0-449.el8 or later. |
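For anyone triaging a fleet of images, the condition described in this report can be sketched as a small Python check. This is only an illustration, not part of kexec-tools or kdumpctl: the helper names (`parse_release`, `may_be_affected`, `reservation_failed`) are hypothetical, the 4.18.0-449 threshold is taken from the description in this report, and the comparison is assumed to be meaningful only for RHEL 8 style `4.18.0-N` kernel releases.

```python
import platform
import re

# Assumption: first RHEL 8 kernel build with the crashkernel fallback,
# as stated in this bug report (4.18.0-449.el8).
FIXED_BUILD = (4, 18, 0, 449)


def parse_release(release: str) -> tuple:
    """Turn '4.18.0-372.57.1.el8_6.aarch64' into (4, 18, 0, 372)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(group) for group in match.groups())


def may_be_affected(release: str, machine: str) -> bool:
    """True for aarch64 kernels older than the build that added the fallback."""
    return machine == "aarch64" and parse_release(release) < FIXED_BUILD


def reservation_failed(kernel_log: str) -> bool:
    """True if the boot log contains the failure message quoted in this report."""
    return "cannot allocate crashkernel" in kernel_log


if __name__ == "__main__":
    # Check the running system; the log text would come from e.g. `journalctl -k`.
    release, machine = platform.release(), platform.machine()
    print(f"{release} on {machine}: may be affected = "
          f"{may_be_affected(release, machine)}")
```

A host is only hitting this bug when both conditions hold: the kernel predates the fix and the boot log actually shows the failed reservation. Because the failure occurs with low probability, an old kernel alone does not mean kdump is broken on a given boot.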