Bug 2218682

Summary:	bpf_jit_limit hit again - copy_seccomp() fix
Product:	Red Hat Enterprise Linux 9	Reporter:	Peter Hunt <pehunt>
Component:	kernel	Assignee:	Viktor Malik <vmalik>
kernel sub component:	BPF	QA Contact:	Ziqian SUN (Zamir) <zsun>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	unspecified
Priority:	high	CC:	acme, asavkov, bhu, cye, jbenc, kcarcia, ldoskova, thoiland, travi, vmalik, wking, ykaliuta, zsun
Version:	9.2	Keywords:	Triaged, ZStream
Target Milestone:	rc	Flags:	pehunt: needinfo- pm-rhel: mirror+
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	kernel-5.14.0-342.el9	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	2219567 2226945 (view as bug list)		Environment:
Last Closed:	2023-11-07 08:48:41 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	2226945

Description Peter Hunt 2023-06-29 20:02:18 UTC

Description of problem:
Sibling to https://issues.redhat.com/browse/OCPBUGS-15043

Similar to https://bugzilla.redhat.com/show_bug.cgi?id=2140163, except that the issue is hit again in OpenShift 4.13.3, which uses kernel 5.14.0-284.16.1.el9_2, as well as in OpenShift 4.12.15 (which uses kernel 4.18.0-372.52.1.el8_6).

I can't find a bug for the 9.2 version, so I'm not sure whether it's fixed already, but I'd like one to track this regardless. I also wonder whether a comparatively newer patch (https://lore.kernel.org/bpf/20230321170925.74358-1-kuniyu@amazon.com/) is relevant, and what the status of its backport is.

Version-Release number of selected component (if applicable):
5.14.0-284.16.1.el9_2
4.18.0-372.52.1.el8_6

How reproducible:
from the issue above

```
Currently, they are unable to spin up the pods on this specific worker.
Although they do not see any resource overcommitment in the node describe.
```

Steps to Reproduce:
1. This worker node has 228 running pods
2.
3.
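
For anyone checking a node in this state: the limit named in the summary is exposed as the `net.core.bpf_jit_limit` sysctl. Below is a minimal Python sketch for reading it together with `net.core.bpf_jit_enable`. It assumes it is run directly on the worker node, and the limit file may not be readable without root. The function name `read_bpf_jit_settings` is only illustrative and not part of any tool mentioned in this bug.

```python
from pathlib import Path

# Location of the BPF JIT sysctls under procfs.
SYSCTL_DIR = Path("/proc/sys/net/core")


def read_bpf_jit_settings(sysctl_dir=SYSCTL_DIR):
    """Return the BPF JIT sysctls as ints; unreadable entries map to None."""
    settings = {}
    for name in ("bpf_jit_enable", "bpf_jit_limit"):
        try:
            text = (Path(sysctl_dir) / name).read_text()
            settings[name] = int(text.strip())
        except (OSError, ValueError):
            # Missing file, insufficient permissions, or unexpected content.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_bpf_jit_settings().items():
        print(f"net.core.{name} = {value}")
```

This only reports the configured limit. It does not show how much JIT memory is currently charged against it.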

Actual results:

```
When the customer started a deployment and that deployment tried to run pods on worker number 6, those pods entered "CreateContainerError". Upon checking the events of these pods, all showed the error:

"runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524".
```
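
A note on the number in that message: errno 524 has no entry in the userspace errno headers, which is why runc prints it bare. In the kernel source (`include/linux/errno.h`) it is the kernel-internal code `ENOTSUPP`. A small Python sketch for pulling the code out of such an event message and naming it follows. The helper names are hypothetical, and the lookup table covers only the one code relevant here.

```python
import errno
import re

# Kernel-internal errno values that userspace headers (and Python's errno
# module) do not define. Only the code seen in this bug is listed.
KERNEL_INTERNAL_ERRNOS = {524: "ENOTSUPP"}


def extract_errno(message):
    """Return the integer after 'errno ' in a log message, or None."""
    match = re.search(r"\berrno (\d+)\b", message)
    return int(match.group(1)) if match else None


def errno_name(code):
    """Map an errno value to its symbolic name, if known."""
    return KERNEL_INTERNAL_ERRNOS.get(code) or errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    event = ("runc create failed: unable to start container process: "
             "unable to init seccomp: error loading seccomp filter into kernel: "
             "error loading seccomp filter: errno 524")
    code = extract_errno(event)
    print(code, errno_name(code))
```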

Expected results:

from the jira
```
Container should not be stuck on this error since this issue was addressed in 4.12.2 errata earlier

https://issues.redhat.com/browse/OCPBUGS-2637
https://issues.redhat.com/browse/RUN-1668
https://access.redhat.com/errata/RHBA-2023:0568
-> OCPBUGS-6981 - error 524 from seccomp(2) when trying to load filter [rhel-8.6.0.z]
```

Interestingly, the kernel version listed in the bug above is newer than the one in 4.12.15, even though the bug apparently went through errata. It may be an RHCOS packaging problem, but I wanted to open a bug here to track the el9 version as well.

Additional info:

Comment 1 Viktor Malik 2023-07-03 10:51:45 UTC

Hi Peter,

The patch [1] that you mention above looks like it could resolve the issue. It has recently been backported to CentOS Stream 9 as part of our regular BPF subsystem rebase and will appear in RHEL 9.3. So, if we confirm that it is the necessary fix, we will need to backport it to the 8.6 and 9.2 z-streams.

I crafted a Brew build for 9.2z with [1] included, so that we can test it:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=53691694

@Peter if you or someone else could use this to check whether it resolves the problem, that would be great. I'll post it to the original OCP Jira issue, too.

FWIW, the 5.14.0-284.16.1.el9_2 kernel also suffers from a memleak introduced by upstream commit [2]. There is already a fix for it [3], so we should backport that one to the 9.2 z-stream, too.
But since this issue also appears on 4.18.0-372.52.1.el8_6, which doesn't have [2], I'm fairly confident that we'll need to backport [1] anyway.

[1] https://github.com/torvalds/linux/commit/10ec8ca8ec1a2f04c4ed90897225231c58c124a7
[2] https://github.com/torvalds/linux/commit/3a15fb6ed92cb32b0a83f406aa4a96f28c9adbc3
[3] https://github.com/torvalds/linux/commit/a1140cb215fa13dcec06d12ba0c3ee105633b7c4

Comment 2 Viktor Malik 2023-07-03 20:44:05 UTC

I also crafted a Brew build for 8.6z with the mentioned fix included, in case it helps with testing:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=53692906

Comment 5 Viktor Malik 2023-07-04 11:21:45 UTC

Since the same issue likely affects RHEL 8, too, I created a copy of this bug for it: bz#2219567.

Comment 27 Viktor Malik 2023-07-11 11:37:02 UTC

Since this bug is for RHEL9, I'm going to use it to backport the memleak fix a1140cb215fa ("seccomp: Move copy_seccomp() to no failure path.") into 9.3.

Comment 42 errata-xmlrpc 2023-11-07 08:48:41 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6583
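
To check a node against the "Fixed In Version" field of this bug (kernel-5.14.0-342.el9), a rough Python sketch is shown below. It compares only the numeric parts of the kernel release string and is not RPM's own version comparison. It also cannot tell whether a z-stream build with a lower number (for example in the 5.14.0-284.x series) carries a backport, so treat a negative result there as inconclusive. The function names are illustrative.

```python
import platform
import re


def parse_kernel_release(release):
    """Split e.g. '5.14.0-284.16.1.el9_2' into ((5, 14, 0), (284, 16, 1))."""
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el\d+", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version, build = match.groups()
    return (tuple(int(part) for part in version.split(".")),
            tuple(int(part) for part in build.split(".")))


def is_at_least(running, fixed):
    """Naive numeric comparison of two RHEL-style kernel release strings."""
    return parse_kernel_release(running) >= parse_kernel_release(fixed)


if __name__ == "__main__":
    fixed = "5.14.0-342.el9"  # from the "Fixed In Version" field of this bug
    running = platform.release()
    try:
        print(running, ">=", fixed, ":", is_at_least(running, fixed))
    except ValueError as err:
        print(err)
```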