Bug 2302746 - prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) not enabled in kernel; should return ENOSYS, not EINVAL
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2024-08-04 20:52 UTC by Steve
Modified: 2024-08-12 16:25 UTC (History)

Last Closed: 2024-08-12 16:25:17 UTC
Type: Bug
Links
Red Hat Bugzilla 2254434 (Private: 0, Priority: unspecified, Status: CLOSED, Last Updated: 2024-10-02 11:18:53 UTC): SELinux is preventing chrome from using the 'execheap' accesses on a process.

Description Steve 2024-08-04 20:52:17 UTC
1. Please describe the problem:

prctl() returns EINVAL with what seem to be valid arguments.

This snippet is from an strace log of the C reproducer posted in comment 1:

mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f07c583f000
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, 0x7f07c583f000, 4096, "test-mem-1") = -1 EINVAL (Invalid argument)

2. What is the Version-Release number of the kernel:
6.11.0-0.rc1.20240802gitc0ecd6388360.20.fc41.x86_64

3. Did it work previously in Fedora?

No. Reproduced with:

6.9.12-200.fc40.x86_64
5.17.5-300.fc36.x86_64

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:

Yes, C program reproducer to follow.

5. Does this problem occur with the latest Rawhide kernel?

Yes.

6. Are you running any modules that are not shipped directly with Fedora's kernel?

There are no non-Fedora packages in any of the test configurations.

First observed while analyzing:

Bug 2254434 - SELinux is preventing chrome from using the 'execheap' accesses on a process.

See the attachments for strace logs captured while running chromium.

Comment 1 Steve 2024-08-04 20:59:36 UTC
C program reproducer.

Compile with:

$ cc -o prctl-test-1 prctl-test-1.c

$ cat prctl-test-1.c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/prctl.h>

int main(void)
{
    void *mem_pointer = NULL;
    //void *mem_pointer = (void *)0x7f0000000000;  /* optional address hint */
    size_t mem_len = 0x1000;
    const char *mem_name = "test-mem-1";
    int ret;

    /* Reserve one private anonymous mapping to attach the name to. */
    mem_pointer = mmap(mem_pointer, mem_len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (mem_pointer == MAP_FAILED) {
        fprintf(stderr, "mmap->MAP_FAILED: %s\n", strerror(errno));
        exit(1);
    }

    /* prctl() is variadic; the kernel reads each argument as an unsigned long. */
    ret = prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, (unsigned long)mem_pointer,
                (unsigned long)mem_len, (unsigned long)mem_name);

    if (ret == -1) {
        fprintf(stderr, "prctl: %s\n", strerror(errno));
        exit(1);
    }

    exit(0);
}
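
On a kernel built with CONFIG_ANON_VMA_NAME, a successfully named region is expected to appear in /proc/<pid>/maps as "[anon:<name>]". The following is a minimal sketch of a check that could follow a successful prctl() call. The helper name anon_name_visible() is hypothetical and not part of the reproducer above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan /proc/self/maps for an "[anon:<name>]" entry.
 * Returns 1 if found, 0 if not found, -1 if the check could not be run.
 */
int anon_name_visible(const char *name)
{
    char tag[128];
    char line[512];
    FILE *fp;
    int found = 0;
    int n;

    /* Build the tag the kernel prints for a named anonymous mapping. */
    n = snprintf(tag, sizeof(tag), "[anon:%s]", name);
    if (n < 0 || (size_t)n >= sizeof(tag))
        return -1;

    fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strstr(line, tag) != NULL) {
            found = 1;
            break;
        }
    }

    fclose(fp);
    return found;
}
```

Calling anon_name_visible("test-mem-1") right after the prctl() call in the reproducer would return 0 on the Fedora kernels listed in this report, because the call fails and no name is attached.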

Comment 2 Steve 2024-08-04 21:03:59 UTC
Here is an strace snippet from Bug 2254434, Attachment 2040642 ("strace log for process 12501"):

mmap(0x1b9e22116000, 524288, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x1b9e22116000
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, 0x1b9e22116000, 524288, "v8") = -1 EINVAL (Invalid argument)

Comment 3 Steve 2024-08-04 23:02:11 UTC
According to the F41 man page, PR_SET_VMA(2const), this kernel feature was added in Linux 5.17, yet the config files show that Fedora has not enabled it in either the 5.17 kernel or the current Rawhide kernel:

$ fgrep CONFIG_ANON_VMA_NAME /boot/config-5.17.5-300.fc36.x86_64 /boot/config-6.11.0-0.rc1.20240802gitc0ecd6388360.20.fc41.x86_64
/boot/config-5.17.5-300.fc36.x86_64:# CONFIG_ANON_VMA_NAME is not set
/boot/config-6.11.0-0.rc1.20240802gitc0ecd6388360.20.fc41.x86_64:# CONFIG_ANON_VMA_NAME is not set

If the feature is not enabled in the kernel, prctl() should return ENOSYS, not EINVAL:

       ENOSYS          Function not implemented (POSIX.1-2001).

See: errno (3)            - number of last error

For the record, the feature was added in 2022:

mm: add a field to store names for private anonymous memory
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.17&id=9a10064f5625d5572c3626c1516e0bebc6c9fe9b

Comment 4 Steve 2024-08-04 23:55:26 UTC
For the record, chromium makes 40 prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) calls during startup.

Procedure:

$ strace -ff -o cb-1.strace /usr/bin/chromium-browser

Quit chromium.

$ fgrep 'prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME' cb-1.strace.* | wc -l
40

They are distributed as follows:

$ fgrep 'prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME' cb-1.strace.* | cut -d ' ' -f 5 | sort | uniq -c | sort -n
      8 "tls-mmap-allocator")
     12 "partition_alloc")
     20 "v8")

Comment 5 Steve 2024-08-05 00:00:01 UTC
Jakob: How important are these prctl() calls to chromium?

Comment 6 Steve 2024-08-05 05:03:04 UTC
There is a bigger problem.

There appear to be 81 prctl() subcommands:

$ whatis -w PR_\* | wc -l
81

(Run that command in F41, which has a separate man page for each of them.)

Apparently, there isn't a specific prctl() subcommand that lists which other prctl() subcommands are actually available at runtime:

$ whatis -w PR_\* | less

And the list keeps growing:

"Add prctl to allow userlevel TDX hypercalls"

[PATCH v2 0/2] Support userspace hypercalls for TDX
Fri, 26 Jul 2024
https://lkml.org/lkml/2024/7/26/899

Tested with:

man-pages-6.9.1-2.fc41.noarch

Comment 7 Jakob Kummerow 2024-08-12 13:04:28 UTC
#5: I don't know how to answer that question. I have no particular familiarity with these calls.
As you know, the calling code intentionally doesn't check the return value of that prctl() call; it prefers to silently ignore any failure there. That matches the intuition that assigning a human-readable name to a reserved memory region isn't "load-bearing"; it's just occasionally handy for debugging and analysis. In light of this, it seems to me that Chromium doesn't care whether the kernel returns EINVAL, ENOSYS, or any other error code here.

Comment 8 Steve 2024-08-12 16:25:17 UTC
(In reply to Jakob Kummerow from comment #7)
Thanks for your detailed reply.

I am closing this as not a bug, since the chromium code doesn't care what prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) returns.

That particular prctl() operation is occasionally useful as a debugging aid when it is enabled in the kernel.*

As a side note, Linus has a kernel dictum: "We do not break userspace!"
https://lkml.org/lkml/2012/12/23/75

Changing an error return code for a syscall could do that, so this bug can't be fixed anyway.

* Presumably, chromium developers could build a Fedora test kernel with CONFIG_ANON_VMA_NAME enabled if they really needed it for debugging.

For comparison, the Ubuntu kernel has it enabled:

$ grep CONFIG_ANON_VMA_NAME /boot/config-6.8.0-39-generic 
CONFIG_ANON_VMA_NAME=y

$ uname -a
Linux lm-test-3 6.8.0-39-generic #39~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jul 10 15:35:09 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

