
Bug 2054705

Summary: [tracker] nf_reinject calls nf_queue_entry_free on an already freed entry->state
Product: OpenShift Container Platform
Reporter: Juan Luis de Sousa-Valadas <jdesousa>
Component: RHCOS
Assignee: Micah Abbott <miabbott>
Status: CLOSED ERRATA
QA Contact: Aashish Radhakrishnan <aaradhak>
Severity: medium
Priority: medium
Version: 4.7
CC: dornelas, ekovsky, fminafra, jkaur, jligon, miabbott, mrussell, nsharma, nstielau, rbolling, rhcos-triage, travier
Target Release: 4.11.0
Hardware: x86_64
OS: Linux
Doc Type: No Doc Update
Clones: 2062349
Last Closed: 2022-08-10 10:49:51 UTC
Type: Bug
Bug Depends On: 2009786
Bug Blocks: 2062349, 2062351, 2062352, 2062354

Description Juan Luis de Sousa-Valadas 2022-02-15 14:28:42 UTC
OCP Version at Install Time:  N/A
RHCOS Version at Install Time: N/A
OCP Version after Upgrade (if applicable): 4.7
RHCOS Version after Upgrade (if applicable): 8.4
Platform: baremetal
Architecture: x86_64


> What are you trying to do? What is your use case?

Customer is using AquaSec on OpenShift, which is supported: https://catalog.redhat.com/software/containers/aquasec/aquasec/59798ba0ac3db93fa86114e5


> What happened? What went wrong or what did you expect?

When the customer enables one security feature related to networking, the kernel panics due to a bug in netfilter. Aqua has not told me exactly what the bug is, but they told me that it is fixed in RHEL 8.5 with kernel 4.18.0-348.

They also pointed to this bug: https://bugs.rockylinux.org/show_bug.cgi?id=170

Unfortunately the bug doesn't say much and I haven't been pointed to a specific change.

> What are the steps to reproduce your issue? Please try to reduce these steps to something that can be reproduced with a single RHCOS node.

Install AquaSec and enable network security features

Comment 1 Timothée Ravier 2022-02-15 14:35:30 UTC
RHCOS 4.7 will stay on RHEL 8.4. If there is a bug in the kernel in 8.4 then we need a kernel backtrace and we can ask the kernel team to backport a fix to RHEL 8.4.

Comment 11 Micah Abbott 2022-03-04 13:57:01 UTC
It looks like this problem has been root-caused to https://bugzilla.redhat.com/show_bug.cgi?id=2009786

I'm going to update the summary of the BZ and set up the Depends On field accordingly.


In terms of delivery into OCP/RHCOS 4.7, the 8.4.z request on 2009786 needs to be acked, then once the 8.4.z BZ is created, we can start tracking the delivery of the fix.

Assuming we get the 8.4.z fix, it will be included in an RHCOS 4.7 build the day it is released to RHEL customers. It then typically takes 1-2 weeks for the RHCOS build to be included in an OCP z-stream payload.

Comment 14 Micah Abbott 2022-03-09 15:33:28 UTC
In order to properly track this through the OCP build/release process, this BZ is being retargeted for 4.11 (the current version under development).

I'll create clones for 4.10/4.9/4.8/4.7 to track inclusion of the RHEL fix from BZ#2009786 in those releases.

Comment 15 Micah Abbott 2022-03-09 15:52:49 UTC
I've created the following clones for all the z-stream releases down to 4.7.z

4.10.z - https://bugzilla.redhat.com/show_bug.cgi?id=2062349
4.9.z - https://bugzilla.redhat.com/show_bug.cgi?id=2062351 
4.8.z - https://bugzilla.redhat.com/show_bug.cgi?id=2062352 
4.7.z - https://bugzilla.redhat.com/show_bug.cgi?id=2062354

Comment 16 Micah Abbott 2022-03-09 16:01:11 UTC
*** Bug 2009745 has been marked as a duplicate of this bug. ***

Comment 17 Micah Abbott 2022-06-14 13:27:03 UTC
RHCOS 4.11 has rebased to RHEL 8.6; the fixed package `kernel-4.18.0-372.9.1.el8` first appeared in 411.86.202206062029-0
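
For readers unfamiliar with the build identifier above, here is a minimal sketch of how a string such as `411.86.202206062029-0` could be split into its parts. It assumes the layout `<OCP stream><.RHEL release>.<UTC build timestamp>-<respin>`, which matches the versions quoted in this bug (411.86 for OCP 4.11 on RHEL 8.6) but is an assumption, not a documented contract. The names `RhcosVersion` and `parse_rhcos_version` are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple


class RhcosVersion(NamedTuple):
    ocp_stream: str    # e.g. "4.11"
    rhel_release: str  # e.g. "8.6"
    built: datetime    # build timestamp, taken as YYYYMMDDHHMM
    respin: int        # trailing counter after the dash


# Assumed layout: one-digit major version followed by the minor version,
# for both the OCP and the RHEL component.
_PATTERN = re.compile(r"^(\d)(\d+)\.(\d)(\d+)\.(\d{12})-(\d+)$")


def parse_rhcos_version(version: str) -> RhcosVersion:
    """Split an RHCOS build identifier into its assumed components."""
    match = _PATTERN.match(version)
    if match is None:
        raise ValueError(f"unrecognised RHCOS version: {version!r}")
    ocp_major, ocp_minor, rhel_major, rhel_minor, stamp, respin = match.groups()
    return RhcosVersion(
        ocp_stream=f"{ocp_major}.{ocp_minor}",
        rhel_release=f"{rhel_major}.{rhel_minor}",
        built=datetime.strptime(stamp, "%Y%m%d%H%M"),
        respin=int(respin),
    )


if __name__ == "__main__":
    print(parse_rhcos_version("411.86.202206062029-0"))
```

Under that assumption, the build named in this comment parses as OCP stream 4.11 on RHEL 8.6, built on 2022-06-06.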

Comment 19 Aashish Radhakrishnan 2022-06-17 15:44:06 UTC
Verified:

sh-4.4# rpm -q kernel 
kernel-4.18.0-372.9.1.el8.x86_64

sh-4.4# rpm-ostree status
State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:50772552e45a9e42287cc479bd5ecad826c136ae716f19623c963a9a122f84c0
              CustomOrigin: Managed by machine-config-operator
                   Version: 411.86.202206131434-0 (2022-06-13T14:37:45Z)

  e0746a6268898ad4761a04d8c531ee3a45250866d5c62bfeb8b0efc008ffb8e9
                   Version: 411.85.202205101201-0 (2022-05-10T12:05:02Z)
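
The manual check above could be scripted roughly as follows. This is a simplified sketch: it compares only the leading numeric fields of the kernel name-version-release string and is not a reimplementation of RPM's version comparison. It is also only meaningful within the RHEL 8.6 kernel stream named in comment 17, since a fix backported to an older z-stream would carry a lower release number. The helper names `kernel_key` and `has_fix` are hypothetical.

```python
import re
from typing import Tuple

# Fixed package as reported in comment 17 of this bug.
FIXED_KERNEL = "kernel-4.18.0-372.9.1.el8"


def kernel_key(nvr: str) -> Tuple[int, ...]:
    """Return the leading numeric fields of a kernel NVR as a tuple.

    "kernel-4.18.0-372.9.1.el8.x86_64" -> (4, 18, 0, 372, 9, 1)
    Parsing stops at the first non-numeric field (here "el8").
    """
    if nvr.startswith("kernel-"):
        nvr = nvr[len("kernel-"):]
    fields = []
    for part in re.split(r"[.-]", nvr):
        if not part.isdigit():
            break
        fields.append(int(part))
    return tuple(fields)


def has_fix(installed: str, fixed: str = FIXED_KERNEL) -> bool:
    """True if the installed kernel is at or above the fixed build."""
    return kernel_key(installed) >= kernel_key(fixed)


if __name__ == "__main__":
    # Value taken from the `rpm -q kernel` output above.
    print(has_fix("kernel-4.18.0-372.9.1.el8.x86_64"))
```

On a node, the input string would come from `rpm -q kernel`, as in the transcript above.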

Comment 21 errata-xmlrpc 2022-08-10 10:49:51 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069