
Bug 1950885

Summary: Disable CMA in kdump 2nd kernel
Product: Red Hat Enterprise Linux 9
Reporter: Dave Young <ruyang>
Component: kexec-tools
Assignee: ltao
Status: CLOSED CURRENTRELEASE
QA Contact: Jie Li <jieli>
Severity: medium
Docs Contact:
Priority: medium
Version: 9.0
CC: bhe, cye, dhildenb
Target Milestone: beta
Keywords: Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: kexec-tools-2.0.22-2.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-07 21:50:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1945002
Bug Blocks: 1951392

Description Dave Young 2021-04-19 04:06:21 UTC
According to our discussion about https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1023

kexec-tools needs to disable CMA on the kdump kernel cmdline. Otherwise the kdump kernel may run out of memory.

For example, if one sets crashkernel=1G but cma=512M is also in effect, the kdump kernel will only have 512M left for everything else. So we should strip the inherited cma= parameter that comes from the 1st kernel's grub cmdline. We may also need to explicitly set cma=0, since David said s390/ppc has some internal logic to set CMA.
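
A minimal sketch of the strip-and-override idea described above, written in Python purely for illustration. It is not the kexec-tools implementation. The function name `disable_cma` and its `params` argument are hypothetical, and the naive whitespace split assumes no quoted values containing spaces.

```python
def disable_cma(cmdline, params=("cma",)):
    """Drop inherited `<param>=...` tokens and append `<param>=0` for each name.

    Hypothetical helper for illustration only; kexec-tools itself may do this
    differently.
    """
    kept = [
        tok for tok in cmdline.split()
        # Compare only the parameter name (the part before the first '=').
        if tok.split("=", 1)[0] not in params
    ]
    # Explicitly disable each parameter so arch-internal defaults do not apply.
    kept.extend(f"{name}=0" for name in params)
    return " ".join(kept)


print(disable_cma("ro console=ttyS0,115200 cma=512M crashkernel=1G"))
# prints: ro console=ttyS0,115200 crashkernel=1G cma=0
```

Passing `params=("cma", "hugetlb_cma")` would treat both parameters the same way.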

------------original bug for kernel enabling cma on x86--------------
This bug was initially created as a copy of Bug #1945002

I am copying this bug because: 



We want to enable/support CMA in RHEL9 on x86-64 (and eventually aarch64). Enabling CMA mostly involves enabling CONFIG_CMA and adding RHEL-specific warnings that CMA areas were defined / CMA allocations happened.

As one example, CMA will be required to eventually support RDMA<->GPU p2pdma via dma-buf. As another example, CMA will be useful for more reliable runtime allocation of gigantic pages.

CMA is already enabled in ARK for s390x and ppc64le. For aarch64, 64k base page size currently implies a minimum CMA area size of 512 MiB (due to large pageblock order), which could be problematic on smaller machines.
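
As a rough check of the 512 MiB figure above, the arithmetic can be sketched as follows. This assumes the minimum CMA area size equals one pageblock and that the pageblock order is 13 with 64 KiB base pages on aarch64. Both are assumptions, not values stated in this bug.

```python
PAGE_SIZE = 64 * 1024      # 64 KiB base page size, from the text above
PAGEBLOCK_ORDER = 13       # assumption: pageblock order for 64 KiB pages

# One pageblock spans 2**PAGEBLOCK_ORDER base pages.
min_cma_bytes = PAGE_SIZE << PAGEBLOCK_ORDER
min_cma_mib = min_cma_bytes // (1024 * 1024)

print(min_cma_mib)  # prints: 512
```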

Comment 2 David Hildenbrand 2021-04-19 09:00:33 UTC
IIRC, to build the kdump ("crashkernel") cmdline we are not using the cmdline of the original kernel (I did a quick experiment and it looks like that is indeed the behavior); we only base our cmdline on the original cmdline in the case of ordinary kexec.

However, on ppc64, it might make sense to specify "kvm_cma_resv_ratio=0" for the kdump kernel, because the default in the kernel is set to "5%".

Comment 6 ltao 2021-04-27 02:10:54 UTC
When CMA is enabled on the 1st kernel cmdline, we get the following lines in vmcore-dmesg.txt
after triggering a kernel crash:

[    0.009622] cma: Reserved 512 MiB at 0x000000011fc00000
[    0.009625] hugetlb_cma: reserve 1024 MiB, up to 1024 MiB per node
[    0.009626] cma: Reserved 1024 MiB at 0x0000000040000000
[    0.009628] hugetlb_cma: reserved 1024 MiB on node 0
[    0.009631] Reserving 256MB of memory at 2800MB for crashkernel (System RAM: 4095MB)
...skipping...
[    0.031134] Kernel command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.11.15-300.fc34.x86_64 root=UUID=ff9e6951-a257-4fc8-9788-de2fa1779103 ro rootflags=subvol=root console=ttyS0,115200 cma=512M hugetlb_cma=1G crashkernel=256M
...skipping...
[    0.053021] Memory: 2160092K/4193888K available (14345K kernel code, 3471K rwdata, 9740K rodata, 2540K init, 5512K bss, 460672K reserved, 1572864K cma-reserved)

In the 2nd kernel dmesg (kexec-dmesg.log), cma= should be set to 0 on the cmdline, and cma-reserved should be 0K:

[Mon Apr 26 21:19:19 2021] Kernel command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.11.15-300.fc34.x86_64 ro rootflags=subvol=root console=ttyS0,115200 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd cma=0 hugetlb_cma=0 disable_cpu_apicid=0 trace_buf_size=1 acpi_rsdp=0xf61e0 elfcorehdr=3128692K
[Mon Apr 26 21:19:19 2021] Memory: 195848K/262124K available (14345K kernel code, 3471K rwdata, 9740K rodata, 2540K init, 5512K bss, 66020K reserved, 0K cma-reserved)
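
The check above can be automated with a short script. This is only a sketch in Python, assuming the "Memory:" line keeps the format shown in the two logs. The helper name `cma_reserved_kib` is hypothetical.

```python
import re

# Matches the trailing "<N>K cma-reserved)" field of the kernel "Memory:" line.
MEMORY_RE = re.compile(r"Memory: .*?(\d+)K cma-reserved\)")


def cma_reserved_kib(dmesg_text):
    """Return the cma-reserved size in KiB, or None if no Memory: line matches."""
    match = MEMORY_RE.search(dmesg_text)
    return int(match.group(1)) if match else None


first_kernel = (
    "[    0.053021] Memory: 2160092K/4193888K available (14345K kernel code, "
    "3471K rwdata, 9740K rodata, 2540K init, 5512K bss, 460672K reserved, "
    "1572864K cma-reserved)"
)
second_kernel = (
    "[Mon Apr 26 21:19:19 2021] Memory: 195848K/262124K available (14345K "
    "kernel code, 3471K rwdata, 9740K rodata, 2540K init, 5512K bss, "
    "66020K reserved, 0K cma-reserved)"
)

print(cma_reserved_kib(first_kernel))   # prints: 1572864
print(cma_reserved_kib(second_kernel))  # prints: 0
```

The 1st kernel value, 1572864K, is 1536 MiB, which matches cma=512M plus hugetlb_cma=1G. The 2nd kernel value of 0 is what the verification expects.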

Comment 16 Dave Young 2022-02-09 05:14:33 UTC
*** Bug 2051877 has been marked as a duplicate of this bug. ***