Bug 2104445
Summary: | RHEL9.1: in low memory conditions, page_frag_alloc may corrupt the memory.
---|---
Product: | Red Hat Enterprise Linux 9
Reporter: | Maurizio Lombardi <mlombard>
Component: | kernel
Assignee: | Maurizio Lombardi <mlombard>
kernel sub component: | Memory Management
QA Contact: | Li Wang <liwan>
Status: | CLOSED ERRATA
Severity: | unspecified
Priority: | unspecified
CC: | chuhu, ddutile, liwan
Version: | 9.1
Keywords: | Bugfix, Patch, Triaged
Target Milestone: | rc
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | kernel-5.14.0-178.el9
Last Closed: | 2023-05-09 07:58:06 UTC
Type: | Bug

Maybe I found the issue: in __page_frag_cache_refill(), if the page allocation with order=3 fails, it retries with order=0, thus allocating a 4096-byte cache page. If fragsize is > 4096, this will corrupt the memory.

It looks like page_frag_alloc() is in general unsafe for fragsize > PAGE_SIZE; I wonder why this condition is not enforced in the code.

Will work on a patch right now. I guess the solution is to check that nc->size (the cache size) is big enough for fragsize; otherwise page_frag_alloc() should return NULL to prevent memory corruption.

This patch should solve the problem:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4dc0d333279f..fdd8d671876a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5550,6 +5550,8 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 		/* reset page count bias and offset to start of new frag */
 		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
 		offset = size - fragsz;
+		if (unlikely(offset < 0))
+			return NULL;
 	}
 
 	nc->pagecnt_bias--;

(In reply to Maurizio Lombardi from comment #3)
> This patch should solve the problem:

Tested, it seems to work. I improved it to avoid leaking cache pages. This is the current version:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4dc0d333279f..c6b40b85c55d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5544,12 +5544,17 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 		/* if size can vary use size else just use PAGE_SIZE */
 		size = nc->size;
 #endif
-		/* OK, page count is 0, we can safely set it */
-		set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
-
 		/* reset page count bias and offset to start of new frag */
 		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
 		offset = size - fragsz;
+		if (unlikely(offset < 0)) {
+			free_the_page(page, compound_order(page));
+			nc->va = NULL;
+			return NULL;
+		}
+
+		/* OK, page count is 0, we can safely set it */
+		set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
 	}
 
 	nc->pagecnt_bias--;

Patch submitted to upstream:
https://lore.kernel.org/linux-mm/CAFL455nFxcrpezZENBHhMe_D7mE9N_v9mN9YjYQr1Z=-E3inug@mail.gmail.com/T/#ma152b5bbb5cd4749bc15854fd205beec02fa8679

Patch picked up by Andrew Morton
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable

(In reply to Maurizio Lombardi from comment #8)
> Patch picked up by Andrew Morton
>
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable

Correct link:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-unstable&id=6309f8daaef315140c8ffdd2492563973e8d42d5

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2458
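
The offset arithmetic discussed in the comments above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the names model_frag_cache, model_reset_offset and model_frag_alloc are hypothetical, and the sizes assume x86_64 with 4 KiB pages and an order-3 (32 KiB) preferred cache. It shows that `offset = size - fragsz` goes negative once the cache has fallen back to a single page and fragsz exceeds it, and that returning NULL in that case avoids handing out an address below the start of the page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86_64 values: 4 KiB pages, order-3 (32 KiB) preferred cache. */
#define MODEL_PAGE_SIZE      4096
#define MODEL_CACHE_MAX_SIZE (MODEL_PAGE_SIZE << 3)

/* Hypothetical, simplified stand-in for struct page_frag_cache. */
struct model_frag_cache {
	unsigned char *va; /* start of the backing buffer */
	int size;          /* 32768 normally, 4096 after the order-0 fallback */
};

/*
 * Offset of the first fragment after the cache is reset, mirroring the
 * "offset = size - fragsz" line in the patch context. A negative result
 * means the fragment would start before the beginning of the page.
 */
static int model_reset_offset(const struct model_frag_cache *nc,
			      unsigned int fragsz)
{
	assert(nc->size > 0);
	return nc->size - (int)fragsz;
}

/* Behaviour with the check from the patch: refuse fragments that do not fit. */
static unsigned char *model_frag_alloc(const struct model_frag_cache *nc,
				       unsigned int fragsz)
{
	int offset = model_reset_offset(nc, fragsz);

	if (offset < 0)
		return NULL; /* without this check the result would be va - 1 */
	return nc->va + offset;
}
```

With a 32 KiB cache a 4097-byte fragment gets offset 28671; with the 4 KiB fallback cache the same request yields offset -1, which the model rejects by returning NULL.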
Created attachment 1894892
test kernel module

Description of problem:

Calling page_frag_alloc() with a fragsize > 4096 (on x86_64) corrupts the memory if the system is in OOM conditions, and the kernel will crash when calling page_frag_free().
I was not able to make the kernel crash with fragsize <= 4096.

Steps to Reproduce:

I prepared a simple kernel module. It requires 2 parameters: the first one is the amount of memory you want to allocate with page_frag_alloc(), the second one is the size of the fragment.

I tested it on a machine with ~7 GB of free memory.

Example of output:

3 GB of memory will be used with frag size = 1024 bytes. No issue:

#insmod oomk.ko memory_size_gb=3 fragsize=1024
[ 177.875107] Test begins, memory size = 3 fragsize = 1024
[ 177.974538] Test completed!

10 GB of memory, 1024-byte frag. Page allocation failure, but the kernel handles it and doesn't crash:

#insmod oomk.ko memory_size_gb=10 fragsize=1024
[ 215.104801] Test begins, memory size = 10 fragsize = 1024
[ 215.227854] insmod: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 215.230231] CPU: 1 PID: 1738 Comm: insmod Kdump: loaded Tainted: G OE --------- --- 5.14.0-124.kpq0.el9.x86_64 #1
[ 215.232344] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 215.233523] Call Trace:
[ 215.234001] dump_stack_lvl+0x34/0x44
[ 215.234894] warn_alloc+0x134/0x160
[ 215.235592] __alloc_pages_slowpath.constprop.0+0x809/0x840
[ 215.236687] ? get_page_from_freelist+0xc6/0x500
[ 215.237569] __alloc_pages+0x1fa/0x230
[ 215.238381] page_frag_alloc_align+0x16c/0x1a0
[...]
[ 215.315722] allocation number 7379888 failed!
[ 215.426227] Test completed!

4 GB, 4097-byte frag. No issues:

#insmod oomk.ko memory_size_gb=4 fragsize=4097
[ 417.268821] Test begins, memory size = 4 fragsize = 4097
[ 417.343840] Test completed!

10 GB, 4097-byte frag. Kernel crashes:

#insmod oomk.ko memory_size_gb=10 fragsize=4097
[ 623.461505] BUG: Bad page state in process insmod pfn:10a80c
[ 623.462634] page:000000000654dc14 refcount:0 mapcount:0 mapping:000000007a56d6cd index:0x0 pfn:0x10a80c
[ 623.464401] memcg:ffff900343a5b501
[ 623.465058] aops:0xffff9003409e5d38 with invalid host inode 00003524480055f0
[ 623.466394] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 623.467632] raw: 0017ffffc0000000 dead000000000100 dead000000000122 ffff900346cf2900
[ 623.469069] raw: 0000000000000000 0000000000100010 00000000ffffffff ffff900343a5b501
[ 623.470521] page dumped because: page still charged to cgroup
[...]
[ 626.632838] general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP PTI
[ 626.633913] ------------[ cut here ]------------
[ 626.639981] CPU: 0 PID: 722 Comm: agetty Kdump: loaded Tainted: G B OE --------- --- 5.14.0-124.kpq0.el9.x86_64 #1
[ 626.640923] WARNING: CPU: 1 PID: 22 at mm/slub.c:4566 __ksize+0xc4/0xe0
[ 626.645018] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 626.645021] RIP: 0010:___slab_alloc+0x1b7/0x5c0
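
The four test runs above are consistent with a simple count of how many fragments fit in one cache page. The sketch below is an illustration only: frags_per_cache and bytes_handed_out are hypothetical helper names, the 32768 and 4096 cache sizes assume x86_64 (order-3 cache, order-0 fallback), and the byte total assumes every allocation before the reported failure succeeded and ignores per-page slack.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole fragments of fragsz bytes that fit in one cache page. */
static unsigned int frags_per_cache(unsigned int cache_size, unsigned int fragsz)
{
	assert(fragsz > 0);
	return cache_size / fragsz;
}

/* Payload bytes handed out after nr_allocs successful allocations. */
static uint64_t bytes_handed_out(uint64_t nr_allocs, unsigned int fragsz)
{
	return nr_allocs * (uint64_t)fragsz;
}
```

For 1024-byte fragments this gives 32 per 32 KiB cache and 4 per 4 KiB fallback page, so the fallback is still usable. For 4097-byte fragments it gives 7 per 32 KiB cache but 0 per 4 KiB page, which matches the crash appearing only once order-3 allocations start failing. Likewise, 7379888 allocations of 1024 bytes amount to 7557005312 bytes, roughly the ~7 GB of free memory reported for the test machine.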