Bug 2253165

Summary:	kdump kernel failed to boot up because a big memory chunk is reserved
Product:	[Fedora] Fedora	Reporter:	Baoquan He <bhe>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	rawhide	CC:	acaringi, adscvr, airlied, alciregi, bskeggs, hdegoede, hpa, jarod, josef, kernel-maint, linville, masami256, mchehab, nixuser, ptalbert, steved
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2024-07-15 03:31:21 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Baoquan He 2023-12-06 10:32:31 UTC

1. Please describe the problem:
CKI reported a failure on the Beaker machine hp-z210-01.ml3.eng.bos.redhat.com. See the CKI report below:
https://datawarehouse.cki-project.org/kcidb/tests/10508330

In that failure, crashkernel=256M was set and the reservation succeeded in the 1st kernel. However, the
kdump kernel failed to boot when it started to run the init process. With crashkernel=320M, the kdump kernel booted successfully and vmcore dumping succeeded too.

After adding "rd.memdebug=4 memblock=debug" to the kdump kernel cmdline, memblock shows one big chunk of reserved memory of about 122M (reserved[0x2] in the log below). I don't know where it comes from. I suspect the firmware stole that chunk from system memory, causing the kdump kernel to run out of memory (OOM).


[Tue Dec  5 22:32:38 2023] DMI: Hewlett-Packard HP Z210 Workstation/1587h, BIOS J51 v01.20 09/16/2011
[Tue Dec  5 22:32:38 2023] tsc: Fast TSC calibration using PIT
[Tue Dec  5 22:32:38 2023] tsc: Detected 3092.940 MHz processor
[Tue Dec  5 22:32:38 2023] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[Tue Dec  5 22:32:38 2023] e820: remove [mem 0x000a0000-0x000fffff] usable
[Tue Dec  5 22:32:38 2023] last_pfn = 0x61000 max_arch_pfn = 0x400000000
[Tue Dec  5 22:32:38 2023] MTRR map: 4 entries (3 fixed + 1 variable; max 23), built from 10 variable MTRRs
[Tue Dec  5 22:32:38 2023] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
[Tue Dec  5 22:32:38 2023] x2apic: enabled by BIOS, switching to x2apic ops
[Tue Dec  5 22:32:38 2023] found SMP MP-table at [mem 0x000f4b80-0x000f4b8f]
[Tue Dec  5 22:32:38 2023] memblock_reserve: [0x00000000000f4b80-0x00000000000f4b8f] smp_scan_config+0xca/0x150
[Tue Dec  5 22:32:38 2023] memblock_reserve: [0x00000000000f4b90-0x00000000000f4e4b] smp_scan_config+0x13a/0x150
[Tue Dec  5 22:32:38 2023] memblock_reserve: [0x000000005f600000-0x000000005f610fff] setup_arch+0xd84/0xf10
[Tue Dec  5 22:32:38 2023] memblock_add: [0x0000000000001000-0x000000000008f7ff] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] memblock_add: [0x000000004d0e00b0-0x0000000060ff81cf] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] memblock_add: [0x0000000060ff81d0-0x0000000060ff81ff] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] memblock_add: [0x0000000060ff8200-0x0000000060ffffff] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] MEMBLOCK configuration:
[Tue Dec  5 22:32:38 2023]  memory size = 0x0000000013fae750 reserved size = 0x0000000007b7cc50
[Tue Dec  5 22:32:38 2023]  memory.cnt  = 0x2
[Tue Dec  5 22:32:38 2023]  memory[0x0] [0x0000000000001000-0x000000000008efff], 0x000000000008e000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  memory[0x1] [0x000000004d0e1000-0x0000000060ffffff], 0x0000000013f1f000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved.cnt  = 0x5
[Tue Dec  5 22:32:38 2023]  reserved[0x0]       [0x0000000000000000-0x000000000000ffff], 0x0000000000010000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x1]       [0x000000000008f400-0x00000000000fffff], 0x0000000000070c00 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x2]       [0x0000000057b16000-0x000000005f610fff], 0x0000000007afb000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x3]       [0x0000000060ff81d0-0x0000000060ff821f], 0x0000000000000050 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x4]       [0x0000000060ffe000-0x0000000060ffefff], 0x0000000000001000 bytes flags: 0x0
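
To make the size of the suspicious chunk easy to check, here is a minimal sketch that parses the "MEMBLOCK configuration" lines above and reports the largest reserved region. It is only an illustration for reading this dump: the helper names (parse_regions, largest_reserved) are hypothetical, and the regular expression assumes the exact line layout shown in this report.

```python
import re
from collections import namedtuple

# Lines copied from the MEMBLOCK configuration dump in this report.
LOG = """\
[Tue Dec  5 22:32:38 2023]  memory size = 0x0000000013fae750 reserved size = 0x0000000007b7cc50
[Tue Dec  5 22:32:38 2023]  memory[0x0] [0x0000000000001000-0x000000000008efff], 0x000000000008e000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  memory[0x1] [0x000000004d0e1000-0x0000000060ffffff], 0x0000000013f1f000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x0]       [0x0000000000000000-0x000000000000ffff], 0x0000000000010000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x1]       [0x000000000008f400-0x00000000000fffff], 0x0000000000070c00 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x2]       [0x0000000057b16000-0x000000005f610fff], 0x0000000007afb000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x3]       [0x0000000060ff81d0-0x0000000060ff821f], 0x0000000000000050 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x4]       [0x0000000060ffe000-0x0000000060ffefff], 0x0000000000001000 bytes flags: 0x0
"""

Region = namedtuple("Region", "kind index start end size")

# Assumed layout: "<kind>[<idx>] [<start>-<end>], <size> bytes"
REGION_RE = re.compile(
    r"(memory|reserved)\[(0x[0-9a-f]+)\]\s+"
    r"\[(0x[0-9a-f]+)-(0x[0-9a-f]+)\], (0x[0-9a-f]+) bytes"
)
TOTALS_RE = re.compile(
    r"memory size = (0x[0-9a-f]+) reserved size = (0x[0-9a-f]+)"
)


def parse_regions(text):
    """Return every memory/reserved region found in a memblock dump."""
    regions = []
    for kind, idx, start, end, size in REGION_RE.findall(text):
        regions.append(
            Region(kind, int(idx, 16), int(start, 16), int(end, 16), int(size, 16))
        )
    return regions


def largest_reserved(regions):
    """Return the biggest region of kind 'reserved'."""
    return max((r for r in regions if r.kind == "reserved"), key=lambda r: r.size)


if __name__ == "__main__":
    regions = parse_regions(LOG)
    big = largest_reserved(regions)
    reserved_sum = sum(r.size for r in regions if r.kind == "reserved")
    print(f"largest: reserved[{big.index:#x}] "
          f"[{big.start:#x}-{big.end:#x}] = {big.size / 2**20:.2f} MiB")
    print(f"sum of reserved regions: {reserved_sum:#x}")
```

On the dump above, the largest entry is reserved[0x2] at roughly 122.98 MiB, and the five reserved regions add up to the reported "reserved size" of 0x7b7cc50, so that single chunk accounts for nearly all of the reservation.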


2. What is the Version-Release number of the kernel:


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


6. Are you running any modules that are not shipped directly with Fedora's kernel?:


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Reproducible: Always

Comment 1 Baoquan He 2024-07-15 03:31:21 UTC

As I recall, this has been fixed. Closing.