Bug 1977984 - [nm-cloud-setup] [AWS] default route interferes with other subnets on the VM (making containers unreachable) [rhel-8.5]
Summary: [nm-cloud-setup] [AWS] default route interferes with other subnets on the VM...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: NetworkManager
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Thomas Haller
QA Contact: Desktop QE
URL:
Whiteboard:
Duplicates: 1995503
Depends On:
Blocks: 1998570 2007341
 
Reported: 2021-06-30 20:34 UTC by Russell Teague
Modified: 2021-11-25 01:51 UTC
CC List: 27 users

Fixed In Version: NetworkManager-1.32.10-4.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1998570 2007341
Environment:
Last Closed: 2021-11-09 19:30:32 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github liangxiao1/os-tests/blob/dba621751bc4c97355f213c8750990e55adb9a31/os_tests/tests/test_network_test.py#L401 0 None None None 2021-11-25 01:51:48 UTC
Red Hat Issue Tracker RHELPLAN-94569 0 None None None 2021-08-24 08:57:59 UTC
Red Hat Product Errata RHSA-2021:4361 0 None None None 2021-11-09 19:31:12 UTC
freedesktop.org Gitlab NetworkManager NetworkManager merge_requests 974 0 None None None 2021-09-01 17:52:04 UTC

Description Russell Teague 2021-06-30 20:34:38 UTC
Description of problem:
After scaling up RHEL8 worker nodes and removing RHCOS nodes from the cluster, e2e-tests fail due to TargetDown.

openshift-sdn Alert Details:
16.67% of the sdn/sdn targets in openshift-sdn namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.

All SDN pods are reporting as running.
All resources on the cluster appear to be functioning normally.
Only the alert is any indication of an issue.


Version-Release number of selected component (if applicable): 4.9


How reproducible:
This is happening on all CI jobs using RHEL8 workers.
PR for adding RHEL8 worker support to CI:
https://github.com/openshift/release/pull/19190

Example job failure:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/19190/rehearse-19190-pull-ci-openshift-openshift-ansible-master-e2e-aws-workers-rhel8/1410260856570122240

Test failure summary:
alert TargetDown fired for 829 seconds with labels: {job="crio", namespace="kube-system", service="kubelet", severity="warning"}
alert TargetDown fired for 829 seconds with labels: {job="kubelet", namespace="kube-system", service="kubelet", severity="warning"}
alert TargetDown fired for 829 seconds with labels: {job="machine-config-daemon", namespace="openshift-machine-config-operator", service="machine-config-daemon", severity="warning"}
alert TargetDown fired for 859 seconds with labels: {job="node-exporter", namespace="openshift-monitoring", service="node-exporter", severity="warning"}
alert TargetDown fired for 859 seconds with labels: {job="sdn", namespace="openshift-sdn", service="sdn", severity="warning"}


This issue can be reproduced in a development cluster.

Comment 1 Russell Teague 2021-07-01 15:12:48 UTC
Observed that the alerts do not start firing until after the RHCOS nodes are drained/removed from the cluster.  Just having RHEL8 nodes in the cluster does not cause the condition.

Comment 2 Russell Teague 2021-07-02 17:59:12 UTC
Observed that when prometheus-k8s pods are running on the RHEL8 nodes, alerts are raised as noted in the 'Test failure summary' in the bug description.  When the prometheus-k8s pods are moved back to RHCOS nodes, the alerts are cleared.  The prometheus-k8s pods are not reporting any issues when running on the RHEL8 nodes.

Comment 3 Russell Teague 2021-07-02 18:02:04 UTC
Moving to the monitoring component based on the above observations.

Comment 4 Arunprasad Rajkumar 2021-07-05 09:54:52 UTC
The following analysis was made by @spasquie:

>I've had a quick look at this one and from what I can tell, Prometheus considers the ip-10-0-159-119.ec2.internal node down because it can't scrape the metrics. It's visible from the targets API (see https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_rele[…]extra/artifacts/metrics/prometheus-targets.json), which says Get \"http://10.0.159.119:9537/metrics\": context deadline exceeded for the endpoint (it smells like a network connectivity issue, as Prometheus fails to receive metrics within 10s).
I'd suggest having a look at the node logs: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_rele[…]a/artifacts/nodes/ip-10-0-159-119.ec2.internal/


I've also looked at the node journal, I see the following log at the end

> Jun 30 17:57:31.418776 ip-10-0-159-119.ec2.internal hyperkube[1637]: W0630 17:57:31.418619    1637 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/csi-hostpath-e2e-provisioning-3358/csi.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/csi-hostpath-e2e-provisioning-3358/csi.sock: connect: connection refused". Reconnecting...
Jun 30 17:57:31.418776 ip-10-0-159-119.ec2.internal hyperkube[1637]: I0630 17:57:31.418653    1637 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc006e6dee0, {TRANSIENT_FAILURE connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/csi-hostpath-e2e-provisioning-3358/csi.sock: connect: connection refused"}
Jun 30 17:57:31.418776 ip-10-0-159-119.ec2.internal hyperkube[1637]: E0630 17:57:31.418735    1637 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi-hostpath-e2e-provisioning-3358^9779380d-d9ca-11eb-b269-0a580a8005fa podName:04333bd3-fab0-4e74-9b0f-585dbdc35236 nodeName:}" failed. No retries permitted until 2021-06-30 17:59:33.418710717 +0000 UTC m=+2240.882179472 (durationBeforeRetry 2m2s). Error: "UnmountVolume.TearDown failed for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/csi-hostpath-e2e-provisioning-3358^9779380d-d9ca-11eb-b269-0a580a8005fa\") pod \"04333bd3-fab0-4e74-9b0f-585dbdc35236\" (UID: \"04333bd3-fab0-4e74-9b0f-585dbdc35236\") : kubernetes.io/csi: mounter.TearDownAt failed: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/lib/kubelet/plugins/csi-hostpath-e2e-provisioning-3358/csi.sock: connect: connection refused\""


It indicates a network connectivity issue with the problematic node instance. Let me assign this to the network team for their analysis.

Comment 5 Russell Teague 2021-07-13 18:13:30 UTC
Bumping the severity because this is impacting feature delivery.

Comment 6 Casey Callendrello 2021-07-28 13:23:03 UTC
This is probably not a networking issue:

1. More than just openshift-sdn targets are failing
2. The connection failure you see is a unix domain socket, i.e. a file. That's not actually networking; it's an attempt to talk to the CSI driver.

My first guess is that some alerts need to be tweaked to take losing nodes into account.

One question: when the RHCOS nodes are removed, are the Node objects also removed?

Comment 7 Russell Teague 2021-07-28 15:24:35 UTC
In my tests to troubleshoot this issue I have cordoned and drained the RHCOS nodes.  As stated in comment 2, the alerts started firing.  The RHCOS nodes were not removed from the cluster.  When the RHCOS nodes were uncordoned and the RHEL nodes cordoned and drained, the alerts were cleared.  I do not think this is related to losing nodes.  My guess was that either the wrong version of a necessary package was being installed on RHEL8 or that a file or resource is not in the right place on RHEL8 as expected on either RHCOS or RHEL7.  From a networking perspective, I wanted to make sure the correct packages were being installed for RHEL8.
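For reference, the cordon/drain cycle used in these tests is along these lines (a sketch; <node> is a placeholder and the exact drain flags can vary by oc version):

   $ oc adm cordon <node>
   $ oc adm drain <node> --ignore-daemonsets --delete-emptydir-data
   $ oc adm uncordon <node>    (to reverse)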

Who owns these alerts?

Comment 8 Simon Pasquier 2021-07-28 16:23:09 UTC
The monitoring team owns the TargetDown alert but the reason why the alert fires might be outside of our competency. Having said that, we can look into https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/19190/rehearse-19190-periodic-ci-openshift-release-master-nightly-4.9-e2e-aws-workers-rhel8/1420025520803811328 which is an example of the failure with gathered data.

What I can already say from this run is that the node triggering the alert is ip-10-0-212-204.us-east-2.compute.internal. The node is reported as Ready but for some reason, Prometheus is failing to connect with "Get \"https://10.0.212.204:10250/metrics\": context deadline exceeded". If you have access to a cluster in the same configuration/situation, it'd be worth checking that you can curl the /metrics endpoint from within the prometheus-k8s-0 pod.

Comment 10 Russell Teague 2021-07-28 20:28:26 UTC
I can build a cluster that reproduces this behavior.  I have one built now and will tear it down soon but will build another tomorrow morning.

In trying to diagnose the issue as you mention in comment 8 I have attempted to curl the /metrics endpoint but I may be doing it wrong.

1. Find the node where the prometheus-k8s-0 pod is running.
   $ oc get pod prometheus-k8s-0 --namespace openshift-monitoring -o yaml | grep nodeName

2. Connect to that node and find the prometheus container.
   $ oc debug node/ip-10-0-172-36.us-east-2.compute.internal
   sh-4.4# chroot /host
   sh-4.4# crictl ps | grep -E 'prometheus\s' | awk '{print $1}'

3. Exec into that container and run curl to another worker node.
   # crictl exec -it 363ec8ed31264 /bin/bash
   bash-4.4$ curl https://10.0.197.254:10250/metrics

I'm getting certificate issues:

bash-4.4$ curl https://10.0.197.254:10250/metrics
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

bash-4.4$ curl -k https://10.0.197.254:10250/metrics
Unauthorized
bash-4.4$ 


Let me know if I'm doing something wrong.  Also, how do I tell which node is causing the alert to fire?

Comment 11 Simon Pasquier 2021-07-29 05:56:17 UTC
Thanks Russell. Can you try the following commands?

oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -t -- curl -k https://x.x.x.x:10250/metrics
oc exec -n openshift-monitoring prometheus-k8s-1 -c prometheus -t -- curl -k https://x.x.x.x:10250/metrics

To find which IP address to connect to, you can open the Prometheus UI, go to the Targets page and find out which targets are down.
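The same list can also be pulled from the Prometheus HTTP API instead of the UI; a minimal sketch, assuming the in-pod API listens on the default localhost:9090 and jq is available on the workstation running oc:

$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- \
    curl -s http://localhost:9090/api/v1/targets \
  | jq -r '.data.activeTargets[] | select(.health != "up") | "\(.scrapeUrl)  \(.lastError)"'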

Comment 12 Russell Teague 2021-07-29 15:55:20 UTC
I built a cluster and tested out the commands:

$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -t -- curl -k https://10.0.129.155:10250/metrics
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:02:11 --:--:--     0curl: (7) Failed to connect to 10.0.129.155 port 10250: Connection timed out
command terminated with exit code 7


$ oc exec -n openshift-monitoring prometheus-k8s-1 -c prometheus -t -- curl -k https://10.0.129.155:10250/metrics
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    12  100    12    0     0    923      0 --:--:-- --:--:-- --:--:--   923Unauthorized


I can provide access to the cluster for troubleshooting today, but it will be gone tomorrow.

Comment 13 Simon Pasquier 2021-07-29 16:07:28 UTC
So we need to understand why prometheus-k8s-0 can't connect to 10.0.129.155. Is the node being seen as ready?

Comment 14 Russell Teague 2021-07-29 16:58:35 UTC
Yes, node is Ready.

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE    VERSION           INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
ip-10-0-129-155.us-east-2.compute.internal   Ready    worker   98m    v1.21.1+051ac4f   10.0.129.155   <none>        Red Hat Enterprise Linux 8.4 (Ootpa)                           4.18.0-305.10.2.el8_4.x86_64   cri-o://1.21.2-7.rhaos4.8.gitdd89bfb.el8
ip-10-0-131-227.us-east-2.compute.internal   Ready    master   136m   v1.21.1+051ac4f   10.0.131.227   <none>        Red Hat Enterprise Linux CoreOS 48.84.202107282126-0 (Ootpa)   4.18.0-305.10.2.el8_4.x86_64   cri-o://1.21.2-7.rhaos4.8.gitdd89bfb.el8
ip-10-0-173-217.us-east-2.compute.internal   Ready    master   137m   v1.21.1+051ac4f   10.0.173.217   <none>        Red Hat Enterprise Linux CoreOS 48.84.202107282126-0 (Ootpa)   4.18.0-305.10.2.el8_4.x86_64   cri-o://1.21.2-7.rhaos4.8.gitdd89bfb.el8
ip-10-0-187-192.us-east-2.compute.internal   Ready    worker   98m    v1.21.1+051ac4f   10.0.187.192   <none>        Red Hat Enterprise Linux 8.4 (Ootpa)                           4.18.0-305.10.2.el8_4.x86_64   cri-o://1.21.2-7.rhaos4.8.gitdd89bfb.el8
ip-10-0-194-140.us-east-2.compute.internal   Ready    worker   98m    v1.21.1+051ac4f   10.0.194.140   <none>        Red Hat Enterprise Linux 8.4 (Ootpa)                           4.18.0-305.10.2.el8_4.x86_64   cri-o://1.21.2-7.rhaos4.8.gitdd89bfb.el8
ip-10-0-221-183.us-east-2.compute.internal   Ready    master   137m   v1.21.1+051ac4f   10.0.221.183   <none>        Red Hat Enterprise Linux CoreOS 48.84.202107282126-0 (Ootpa)   4.18.0-305.10.2.el8_4.x86_64   cri-o://1.21.2-7.rhaos4.8.gitdd89bfb.el8


And as you mentioned in slack, the pod is running on the same node it is unable to reach:
$ oc get pod prometheus-k8s-0 --namespace openshift-monitoring -o yaml | grep nodeName
  nodeName: ip-10-0-129-155.us-east-2.compute.internal

Comment 15 Russell Teague 2021-08-02 19:13:59 UTC
After further discussion with the networking team it was determined to be a networking issue.
https://coreos.slack.com/archives/CK1AE4ZCK/p1627673448016900

Comment 16 Russell Teague 2021-08-09 16:38:12 UTC
Any update on this bug?

Comment 17 Martin Kennelly 2021-08-16 08:47:12 UTC
I am looking into it. No progress yet. I am trying to add a RHEL8 node to my cluster and it's taking time to set up. I am following this article: https://docs.openshift.com/container-platform/4.8/machine_management/adding-rhel-compute.html
Is that the best way to add a RHEL8 instance on AWS?

Comment 19 Martin Kennelly 2021-08-18 15:15:34 UTC
Packets are being dropped in the iptables PREROUTING chain. No clear reason why.

Here is a normal packet trace on RHCOS from pod -> local node port:
trace id a690161c ip raw PREROUTING verdict continue
trace id a690161c ip raw PREROUTING policy accept
trace id a690161c ip mangle PREROUTING verdict continue
trace id a690161c ip mangle PREROUTING policy accept
trace id a690161c ip nat PREROUTING packet: iif "tun0" ether saddr 0a:58:0a:83:00:17 ether daddr 6a:bf:49:0d:fb:ef ip saddr 10.131.0.23 ip daddr 10.0.216.33 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 59314 ip length 60 tcp sport 59999 tcp
trace id a690161c ip nat PREROUTING rule  counter packets 24318 bytes 2648085 jump KUBE-SERVICES (verdict jump KUBE-SERVICES)
trace id a690161c ip nat KUBE-SERVICES rule  fib daddr type local counter packets 2458 bytes 264399 jump KUBE-NODEPORTS (verdict jump KUBE-NODEPORTS)
trace id a690161c ip nat KUBE-NODEPORTS verdict continue
trace id a690161c ip nat KUBE-SERVICES verdict continue
trace id a690161c ip nat PREROUTING rule  counter packets 18586 bytes 2020937 jump KUBE-PORTALS-CONTAINER (verdict jump KUBE-PORTALS-CONTAINER)
trace id a690161c ip nat KUBE-PORTALS-CONTAINER verdict continue
trace id a690161c ip nat PREROUTING rule fib daddr type local  counter packets 17798 bytes 1960581 jump KUBE-NODEPORT-CONTAINER (verdict jump KUBE-NODEPORT-CONTAINER)
trace id a690161c ip nat KUBE-NODEPORT-CONTAINER verdict continue
trace id a690161c ip nat PREROUTING verdict continue
trace id a690161c ip nat PREROUTING policy accept
trace id a690161c ip mangle INPUT verdict continue
trace id a690161c ip mangle INPUT policy accept
...
...

Here is a packet trace on RHEL 8 from pod -> local node port:
trace id 4078a633 ip raw PREROUTING verdict continue
trace id 4078a633 ip raw PREROUTING policy accept
trace id 4078a633 ip mangle PREROUTING verdict continue
trace id 4078a633 ip mangle PREROUTING policy accept
trace id 4078a633 ip nat PREROUTING packet: iif "tun0" ether saddr 0a:58:0a:82:02:06 ether daddr ee:cd:bc:c2:83:b7 ip saddr 10.130.2.6 ip daddr 10.0.244.172 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 30075 ip length 60 tcp sport 59999 tcp dport 10250 tcp flags == syn tcp window 26733
trace id 4078a633 ip nat PREROUTING rule  counter packets 976 bytes 95117 jump KUBE-SERVICES (verdict jump KUBE-SERVICES)
trace id 4078a633 ip nat KUBE-SERVICES rule  fib daddr type local counter packets 486 bytes 46467 jump KUBE-NODEPORTS (verdict jump KUBE-NODEPORTS)
trace id 4078a633 ip nat KUBE-NODEPORTS verdict continue
trace id 4078a633 ip nat KUBE-SERVICES verdict continue
trace id 4078a633 ip nat PREROUTING rule  counter packets 926 bytes 91163 jump KUBE-PORTALS-CONTAINER (verdict jump KUBE-PORTALS-CONTAINER)
trace id 4078a633 ip nat KUBE-PORTALS-CONTAINER verdict continue
trace id 4078a633 ip nat PREROUTING rule fib daddr type local  counter packets 922 bytes 90659 jump KUBE-NODEPORT-CONTAINER (verdict jump KUBE-NODEPORT-CONTAINER)
trace id 4078a633 ip nat KUBE-NODEPORT-CONTAINER verdict continue
trace id 4078a633 ip nat PREROUTING verdict continue
trace id 4078a633 ip nat PREROUTING policy accept

The packet just gets dropped. It doesn't move on to the INPUT chain from PREROUTING.
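For reference, traces like the above are typically captured with an nftables trace rule plus "nft monitor"; a minimal sketch, assuming the test connection uses source port 59999 as in the traces here:

# nft add table ip trace_debug
# nft 'add chain ip trace_debug pre { type filter hook prerouting priority -350; }'
# nft 'add rule ip trace_debug pre tcp sport 59999 meta nftrace set 1'
# nft monitor trace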

Comment 20 Martin Kennelly 2021-08-21 15:07:52 UTC
This does not look like an Openshift SDN bug because there is no rule explaining why the packet is dropped after PREROUTING; the PREROUTING chain's default policy is accept.
This works on RHCOS but not on RHEL8.
Both distros have the same kernel and iptables (nftables) versions.

I checked the tunable kernel parameters between RHCOS and RHEL8 and there don't seem to be any differences that could cause this issue.
I checked whether rp_filter is enabled in strict mode on the interface, and it is not.
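For reference, that check is along these lines (a sketch, assuming eth0 is the node's primary interface; 0 = off, 1 = strict, 2 = loose):

# sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.eth0.rp_filter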

I am stumped.
I have asked for help and I am waiting for a response.

Comment 21 Martin Kennelly 2021-08-21 17:58:08 UTC
I confirm it's working for Red Hat Enterprise Linux 8.3 and therefore it's a regression for Red Hat Enterprise Linux 8.4.

Comment 23 Martin Kennelly 2021-08-23 17:45:36 UTC
I cannot see any differences for Openshift SDN between RHEL 8.3 and 8.4. OVS flow paths are the same. The iptables rules hit are the same (for the PREROUTING chain).

For reference, I am testing against 8.3 and 8.4 AWS images; however, both instances have the same installed package versions and the same kernel version. I did not expect this: according to the release notes for 8.4, the kernel was updated to .305, but I see that version on 8.3 as well. [1]

Russell, can we move this to the RHEL team? It doesn't look like an SDN bug.

[1] Red Hat Enterprise Linux 8.4 (Ootpa)                           4.18.0-305.12.1.el8_4.x86_64 (AMI image number: ami-0b0af3577fe5e3532)
    Red Hat Enterprise Linux 8.3 (Ootpa)                           4.18.0-305.12.1.el8_4.x86_64 (AMI image number: ami-01d12f05657cd01d3)

Comment 24 Russell Teague 2021-08-23 18:32:23 UTC
I'm fine with moving this to the RHEL team but I don't know what component to move it to.  Reassign as you see fit.

Comment 26 Eric Garver 2021-08-24 11:32:23 UTC
(In reply to Martin Kennelly from comment #19)
[..]
> Here is a packet trace on RHEL 8 from pod -> local node port:
> trace id 4078a633 ip raw PREROUTING verdict continue
> trace id 4078a633 ip raw PREROUTING policy accept
> trace id 4078a633 ip mangle PREROUTING verdict continue
> trace id 4078a633 ip mangle PREROUTING policy accept
> trace id 4078a633 ip nat PREROUTING packet: iif "tun0" ether saddr
> 0a:58:0a:82:02:06 ether daddr ee:cd:bc:c2:83:b7 ip saddr 10.130.2.6 ip daddr
> 10.0.244.172 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 30075 ip length 60
> tcp sport 59999 tcp dport 10250 tcp flags == syn tcp window 26733
> trace id 4078a633 ip nat PREROUTING rule  counter packets 976 bytes 95117
> jump KUBE-SERVICES (verdict jump KUBE-SERVICES)
> trace id 4078a633 ip nat KUBE-SERVICES rule  fib daddr type local counter
> packets 486 bytes 46467 jump KUBE-NODEPORTS (verdict jump KUBE-NODEPORTS)
> trace id 4078a633 ip nat KUBE-NODEPORTS verdict continue
> trace id 4078a633 ip nat KUBE-SERVICES verdict continue
> trace id 4078a633 ip nat PREROUTING rule  counter packets 926 bytes 91163
> jump KUBE-PORTALS-CONTAINER (verdict jump KUBE-PORTALS-CONTAINER)
> trace id 4078a633 ip nat KUBE-PORTALS-CONTAINER verdict continue
> trace id 4078a633 ip nat PREROUTING rule fib daddr type local  counter
> packets 922 bytes 90659 jump KUBE-NODEPORT-CONTAINER (verdict jump
> KUBE-NODEPORT-CONTAINER)
> trace id 4078a633 ip nat KUBE-NODEPORT-CONTAINER verdict continue
> trace id 4078a633 ip nat PREROUTING verdict continue
> trace id 4078a633 ip nat PREROUTING policy accept
> 
> Packet just gets dropped. Doesn't move onto INPUT chain from PREROUTING.

Was the packet meant for the host? Packets only go to the INPUT chain if they're meant for the host, e.g. IP matches the host's.
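One way to confirm that is to ask the kernel how it classifies the destination (a sketch, using the addresses and ingress interface from the RHEL 8 trace above):

# ip route get 10.0.244.172
# ip route get 10.0.244.172 from 10.130.2.6 iif tun0

An answer starting with "local 10.0.244.172 dev lo ..." means the kernel classifies the destination as host-local for that lookup; the second form mimics the incoming packet, which matters once policy-routing rules are in play.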

Comment 27 Martin Kennelly 2021-08-24 12:03:09 UTC
> Was the packet meant for the host? Packets only go to the INPUT chain if they're meant for the host, e.g. IP matches the host's.

yes. Target IP matched the hosts IP.

Comment 28 Eric Garver 2021-08-24 12:04:21 UTC
(In reply to Martin Kennelly from comment #21)
> I confirm it's working for Red Hat Enterprise Linux 8.3 and therefore it's a
> regression for Red Hat Enterprise Linux 8.4.

This statement contradicts comment 23. In comment 23 you said the package/kernel versions were the same. That shouldn't be the case. They should definitely have different kernel versions. RHEL-8.3 is kernel-4.18.0-240.el8.

Is this issue always reproducible? Or only sometimes?

Comment 29 Eric Garver 2021-08-24 12:33:46 UTC
Can someone gather some statistics from a failing node?

 1. cat /proc/net/stat/nf_conntrack

 2. cat /proc/net/dev

I think the above may be available in a must-gather. It should be in a sosreport. This would be better as it will get loads of other useful data.
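For reference, the collection steps mentioned are roughly (a sketch; must-gather is run from a workstation with cluster-admin access, sosreport on the failing node):

$ oc adm must-gather
# sosreport --batch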

Comment 30 Martin Kennelly 2021-08-24 13:17:21 UTC
> This statement contradicts comment 23. In comment 23 you said the package/kernel versions were the same. That shouldn't be the case. They should definitely have different kernel versions. RHEL-8.3 is kernel-4.18.0-240.el8.

When I wrote that comment, I didn't understand that when I add a RHEL 8.3 node to an OCP 4.9 cluster, OCP updates numerous components/packages, including the kernel, to match RHEL 8.4. So it works when OCP updates 8.3 -> 8.4 but doesn't work on RHEL 8.4. Therefore, a change in 8.4 that is not managed by OCP is breaking it.

Comment 31 Martin Kennelly 2021-08-24 13:18:12 UTC
> Is this issue always reproducible? Or only sometimes?
Always reproducible on RHEL 8.4

Comment 32 Eric Garver 2021-08-24 13:31:51 UTC
(In reply to Martin Kennelly from comment #30)
> > This statement contradicts comment 23. In comment 23 you said the package/kernel versions were the same. That shouldn't be the case. They should definitely have different kernel versions. RHEL-8.3 is kernel-4.18.0-240.el8.
> 
> I didn't understand when I wrote that comment that when I added a RHEL 8.3
> node to a OCP 4.9 cluster, OCP updates numerous components/packages,
> including the kernel version to match RHEL 8.4, so it works when OCP updates
> 8.3 -> 8.4 but doesn't work on RHEL 8.4. Therefore, a change in 8.4 that is
> not managed by OCP is breaking it.

Is there any chance you can isolate the component? i.e. only upgrade the kernel? Or selectively downgrade the kernel afterwards.

Comment 33 Russell Teague 2021-08-24 14:24:51 UTC
I'm building a test cluster with RHEL 8.4 and will pull the sosreport from the affected node.

Comment 34 Russell Teague 2021-08-24 14:28:09 UTC
These are the packages installed/upgraded when adding RHEL nodes to a cluster:
https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_node/defaults/main.yml#L12-L98

I can provide the install logs which show the packages and exact versions installed/upgraded.

Comment 35 Eric Garver 2021-08-24 14:44:21 UTC
(In reply to Russell Teague from comment #34)
> These are the packages installed/upgraded when adding RHEL nodes to a
> cluster:
> https://github.com/openshift/openshift-ansible/blob/master/roles/
> openshift_node/defaults/main.yml#L12-L98

Notably missing from that list is "iptables" userspace. It may get pulled in by the other updates though, e.g. "iptables-services". Can you verify?

> I can provide the install logs which show the packages and exact versions installed/upgraded.

Yes please.

Comment 38 Russell Teague 2021-08-24 16:01:35 UTC
sosreport and scaleup log are attached.

To find the packages/versions installed/upgraded search the scaleup log for "TASK [openshift_node : Install openshift packages]".  Each host will be listed in succession.

iptables-services is installed:
Installed Packages
iptables-libs.x86_64                1.8.4-17.el8                @anaconda                     
iptables-services.x86_64            1.8.4-17.el8                @rhel-8-for-x86_64-baseos-rpms

Comment 39 Eric Garver 2021-08-24 17:30:01 UTC
Fixing the component: this is openshift-sdn, which means iptables, not nftables.

Comment 40 Eric Garver 2021-08-25 19:42:19 UTC
(In reply to Eric Garver from comment #32)
> (In reply to Martin Kennelly from comment #30)
> > > This statement contradicts comment 23. In comment 23 you said the package/kernel versions were the same. That shouldn't be the case. They should definitely have different kernel versions. RHEL-8.3 is kernel-4.18.0-240.el8.
> > 
> > I didn't understand when I wrote that comment that when I added a RHEL 8.3
> > node to a OCP 4.9 cluster, OCP updates numerous components/packages,
> > including the kernel version to match RHEL 8.4, so it works when OCP updates
> > 8.3 -> 8.4 but doesn't work on RHEL 8.4. Therefore, a change in 8.4 that is
> > not managed by OCP is breaking it.
> 
> Is there any chance you can isolate the component? i.e. only upgrade the
> kernel? Or selectively downgrade the kernel afterwards.

Martin, can you try this?

Comment 43 Russell Teague 2021-08-26 13:36:56 UTC
I built two identical clusters, one using RHEL 8.3 worker nodes, one using RHEL 8.4 worker nodes.  I pulled the installed package list from one node of each and attached them to the bug.  A diff checker can be used to compare the two files and determine package version differences.  If there are specific packages that should be downgraded to specific versions I can attempt to make those changes.

Comment 44 Martin Kennelly 2021-08-26 13:46:19 UTC
> Martin, can you try this?
I don't have the time currently until next week.

Russell, can you do this to speed this up?

Comment 45 Russell Teague 2021-08-26 14:19:31 UTC
(In reply to Martin Kennelly from comment #44)
> > Martin, can you try this?
> I don't have the time currently until next week.
> 
> Russell, can you do this to speed this up?

Yes, I can attempt any package changes.  With the kernel specifically, both nodes are running the same kernel version so it doesn't make sense to downgrade the kernel on the 8.4 host to the original version on the 8.3 host.  I provided the installed package lists so that I could get better direction on which packages to change instead of working through every package difference.

Comment 46 Eric Garver 2021-08-26 19:54:08 UTC
(In reply to Russell Teague from comment #45)
> (In reply to Martin Kennelly from comment #44)
> > > Martin, can you try this?
> > I don't have the time currently until next week.
> > 
> > Russell, can you do this to speed this up?
> 
> Yes, I can attempt any package changes.  With the kernel specifically, both
> nodes are running the same kernel version so it doesn't make sense to
> downgrade the kernel on the 8.4 host to the original version on the 8.3
> host.  I provided the installed package lists so that I could get better
> direction on which packages to change instead of working through every
> package difference.

kernel and iptables are a good start.

Comment 47 Russell Teague 2021-08-26 20:31:43 UTC
What version of kernel and iptables should be installed on the 8.4 host?

As can be seen in the attached package lists, the kernel and iptables are the same version between both hosts.

8.3
Linux ip-10-0-139-135.us-east-2.compute.internal 4.18.0-305.12.1.el8_4.x86_64 #1 SMP Mon Jul 26 08:06:24 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

Installed Packages
iptables.x86_64                         1.8.4-17.el8          @rhel-8-for-x86_64-baseos-rpms
iptables-libs.x86_64                    1.8.4-17.el8          @rhel-8-for-x86_64-baseos-rpms
iptables-services.x86_64                1.8.4-17.el8          @rhel-8-for-x86_64-baseos-rpms


8.4
Linux ip-10-0-154-44.us-east-2.compute.internal 4.18.0-305.12.1.el8_4.x86_64 #1 SMP Mon Jul 26 08:06:24 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

Installed Packages
iptables.x86_64                         1.8.4-17.el8          @rhel-8-for-x86_64-baseos-rpms
iptables-libs.x86_64                    1.8.4-17.el8          @anaconda
iptables-services.x86_64                1.8.4-17.el8          @rhel-8-for-x86_64-baseos-rpms


The following is a concise list of package differences between the hosts (8.3 on the left, 8.4 on the right):
$ diff -y --suppress-common-lines ../aws-4b/packagelist-8.3-norepo.txt ../aws-4a/packagelist-8.4-norepo.txt
							      >	NetworkManager-cloud-setup.x86_64 1:1.30.0-10.el8_4
bash.x86_64 4.4.19-12.el8				      |	bash.x86_64 4.4.20-1.el8_4
bind-export-libs.x86_64 32:9.11.20-5.el8		      |	bind-export-libs.x86_64 32:9.11.26-4.el8_4
brotli.x86_64 1.0.6-2.el8				      |	brotli.x86_64 1.0.6-3.el8
cloud-init.noarch 19.4-11.el8_3.2			      |	cloud-init.noarch 20.3-10.el8_4.2
cpio.x86_64 2.12-8.el8					      |	cpio.x86_64 2.12-10.el8
crontabs.noarch 1.11-16.20150630git.el8			      |	crontabs.noarch 1.11-17.20190603git.el8
crypto-policies.noarch 20200713-1.git51d1222.el8	      |	crypto-policies.noarch 20210209-1.gitbfb6bed.el8_3
crypto-policies-scripts.noarch 20200713-1.git51d1222.el8      |	crypto-policies-scripts.noarch 20210209-1.gitbfb6bed.el8_3
curl.x86_64 7.61.1-14.el8_3.1				      |	curl.x86_64 7.61.1-18.el8
dbus.x86_64 1:1.12.8-12.el8_3				      |	dbus.x86_64 1:1.12.8-12.el8_4.2
dbus-common.noarch 1:1.12.8-12.el8_3			      |	dbus-common.noarch 1:1.12.8-12.el8_4.2
dbus-daemon.x86_64 1:1.12.8-12.el8_3			      |	dbus-daemon.x86_64 1:1.12.8-12.el8_4.2
dbus-libs.x86_64 1:1.12.8-12.el8_3			      |	dbus-libs.x86_64 1:1.12.8-12.el8_4.2
dbus-tools.x86_64 1:1.12.8-12.el8_3			      |	dbus-tools.x86_64 1:1.12.8-12.el8_4.2
dhcp-client.x86_64 12:4.3.6-41.el8			      |	dhcp-client.x86_64 12:4.3.6-44.el8
dhcp-common.noarch 12:4.3.6-41.el8			      |	dhcp-common.noarch 12:4.3.6-44.el8
dhcp-libs.x86_64 12:4.3.6-41.el8			      |	dhcp-libs.x86_64 12:4.3.6-44.el8
dmidecode.x86_64 1:3.2-6.el8				      |	dmidecode.x86_64 1:3.2-8.el8
dnf.noarch 4.2.23-4.el8					      |	dnf.noarch 4.4.2-11.el8
dnf-data.noarch 4.2.23-4.el8				      |	dnf-data.noarch 4.4.2-11.el8
dnf-plugin-subscription-manager.x86_64 1.27.18-1.el8_3	      |	dnf-plugin-subscription-manager.x86_64 1.28.13-2.el8
dnf-plugins-core.noarch 4.0.17-5.el8			      |	dnf-plugins-core.noarch 4.0.18-4.el8
elfutils-debuginfod-client.x86_64 0.180-1.el8		      |	elfutils-debuginfod-client.x86_64 0.182-3.el8
elfutils-default-yama-scope.noarch 0.180-1.el8		      |	elfutils-default-yama-scope.noarch 0.182-3.el8
elfutils-libelf.x86_64 0.180-1.el8			      |	elfutils-libelf.x86_64 0.182-3.el8
elfutils-libs.x86_64 0.180-1.el8			      |	elfutils-libs.x86_64 0.182-3.el8
ethtool.x86_64 2:5.0-2.el8				      |	ethtool.x86_64 2:5.8-5.el8
file.x86_64 5.33-16.el8					      |	file.x86_64 5.33-16.el8_3.1
file-libs.x86_64 5.33-16.el8				      |	file-libs.x86_64 5.33-16.el8_3.1
gawk.x86_64 4.2.1-1.el8					      |	gawk.x86_64 4.2.1-2.el8
glib2.x86_64 2.56.4-8.el8				      |	glib2.x86_64 2.56.4-9.el8
glibc.x86_64 2.28-127.el8_3.2				      |	glibc.x86_64 2.28-151.el8
glibc-common.x86_64 2.28-127.el8_3.2			      |	glibc-common.x86_64 2.28-151.el8
glibc-langpack-en.x86_64 2.28-127.el8_3.2		      |	glibc-langpack-en.x86_64 2.28-151.el8
gnutls.x86_64 3.6.14-7.el8_3				      |	gnutls.x86_64 3.6.14-8.el8_3
gpgme.x86_64 1.13.1-3.el8				      |	gpgme.x86_64 1.13.1-7.el8
grub2-common.noarch 1:2.02-90.el8			      |	grub2-common.noarch 1:2.02-99.el8
grub2-pc.x86_64 1:2.02-90.el8				      |	grub2-pc.x86_64 1:2.02-99.el8
grub2-pc-modules.noarch 1:2.02-90.el8			      |	grub2-pc-modules.noarch 1:2.02-99.el8
grub2-tools.x86_64 1:2.02-90.el8			      |	grub2-tools.x86_64 1:2.02-99.el8
grub2-tools-extra.x86_64 1:2.02-90.el8			      |	grub2-tools-extra.x86_64 1:2.02-99.el8
grub2-tools-minimal.x86_64 1:2.02-90.el8		      |	grub2-tools-minimal.x86_64 1:2.02-99.el8
hdparm.x86_64 9.54-2.el8				      |	hdparm.x86_64 9.54-3.el8
hwdata.noarch 0.314-8.6.el8				      |	hwdata.noarch 0.314-8.8.el8
ima-evm-utils.x86_64 1.1-5.el8				      |	ima-evm-utils.x86_64 1.3.2-12.el8
initscripts.x86_64 10.00.9-1.el8			      |	initscripts.x86_64 10.00.15-1.el8
insights-client.noarch 3.1.1-1.el8_3			      |	insights-client.noarch 3.1.3-2.el8_4
iproute.x86_64 5.3.0-5.el8				      |	iproute.x86_64 5.9.0-4.el8
iputils.x86_64 20180629-2.el8				      |	iputils.x86_64 20180629-7.el8
json-c.x86_64 0.13.1-0.2.el8				      |	json-c.x86_64 0.13.1-0.4.el8
kexec-tools.x86_64 2.0.20-34.el8_3.2			      |	kexec-tools.x86_64 2.0.20-46.el8
kmod.x86_64 25-16.el8_3.1				      |	kmod.x86_64 25-17.el8
kmod-libs.x86_64 25-16.el8_3.1				      |	kmod-libs.x86_64 25-17.el8
krb5-libs.x86_64 1.18.2-5.el8				      |	krb5-libs.x86_64 1.18.2-8.el8
libarchive.x86_64 3.3.2-9.el8				      |	libarchive.x86_64 3.3.3-1.el8
libblkid.x86_64 2.32.1-24.el8				      |	libblkid.x86_64 2.32.1-27.el8
libcomps.x86_64 0.1.11-4.el8				      |	libcomps.x86_64 0.1.11-5.el8
libcurl.x86_64 7.61.1-14.el8_3.1			      |	libcurl.x86_64 7.61.1-18.el8
libdb.x86_64 5.3.28-39.el8				      |	libdb.x86_64 5.3.28-40.el8
libdb-utils.x86_64 5.3.28-39.el8			      |	libdb-utils.x86_64 5.3.28-40.el8
libdnf.x86_64 0.48.0-5.el8				      |	libdnf.x86_64 0.55.0-7.el8
libfdisk.x86_64 2.32.1-24.el8				      |	libfdisk.x86_64 2.32.1-27.el8
libgcc.x86_64 8.3.1-5.1.el8				      |	libgcc.x86_64 8.4.1-1.el8
libgomp.x86_64 8.3.1-5.1.el8				      |	libgomp.x86_64 8.4.1-1.el8
libldb.x86_64 2.1.3-2.el8				      |	libldb.x86_64 2.2.0-2.el8
libmount.x86_64 2.32.1-24.el8				      |	libmount.x86_64 2.32.1-27.el8
libnfsidmap.x86_64 1:2.3.3-35.el8			      |	libnfsidmap.x86_64 1:2.3.3-41.el8
libpcap.x86_64 14:1.9.1-4.el8				      |	libpcap.x86_64 14:1.9.1-5.el8
libpwquality.x86_64 1.4.0-9.el8				      |	libpwquality.x86_64 1.4.4-3.el8
librepo.x86_64 1.12.0-2.el8				      |	librepo.x86_64 1.12.0-3.el8
librhsm.x86_64 0.0.3-3.el8				      |	librhsm.x86_64 0.0.3-4.el8
libseccomp.x86_64 2.4.3-1.el8				      |	libseccomp.x86_64 2.5.1-1.el8
libselinux.x86_64 2.9-4.el8_3				      |	libselinux.x86_64 2.9-5.el8
libselinux-utils.x86_64 2.9-4.el8_3			      |	libselinux-utils.x86_64 2.9-5.el8
libsemanage.x86_64 2.9-3.el8				      |	libsemanage.x86_64 2.9-6.el8
libsepol.x86_64 2.9-1.el8				      |	libsepol.x86_64 2.9-2.el8
libsmartcols.x86_64 2.32.1-24.el8			      |	libsmartcols.x86_64 2.32.1-27.el8
libsolv.x86_64 0.7.11-1.el8				      |	libsolv.x86_64 0.7.16-2.el8
libsss_autofs.x86_64 2.3.0-9.el8			      |	libsss_autofs.x86_64 2.4.0-9.el8
libsss_sudo.x86_64 2.3.0-9.el8				      |	libsss_sudo.x86_64 2.4.0-9.el8
libstdc++.x86_64 8.3.1-5.1.el8				      |	libstdc++.x86_64 8.4.1-1.el8
libuuid.x86_64 2.32.1-24.el8				      |	libuuid.x86_64 2.32.1-27.el8
libxml2.x86_64 2.9.7-8.el8				      |	libxml2.x86_64 2.9.7-9.el8
							      >	lmdb-libs.x86_64 0.9.24-1.el8
lshw.x86_64 B.02.19.2-2.el8				      |	lshw.x86_64 B.02.19.2-5.el8
lsscsi.x86_64 0.30-1.el8				      |	lsscsi.x86_64 0.32-2.el8
nettle.x86_64 3.4.1-2.el8				      |	nettle.x86_64 3.4.1-4.el8_3
oddjob.x86_64 0.34.5-3.el8				      |	oddjob.x86_64 0.34.7-1.el8
oddjob-mkhomedir.x86_64 0.34.5-3.el8			      |	oddjob-mkhomedir.x86_64 0.34.7-1.el8
openldap.x86_64 2.4.46-15.el8				      |	openldap.x86_64 2.4.46-16.el8
openssl.x86_64 1:1.1.1g-12.el8_3			      |	openssl.x86_64 1:1.1.1g-15.el8_3
openssl-libs.x86_64 1:1.1.1g-12.el8_3			      |	openssl-libs.x86_64 1:1.1.1g-15.el8_3
pam.x86_64 1.3.1-11.el8					      |	pam.x86_64 1.3.1-14.el8
pciutils.x86_64 3.6.4-2.el8				      |	pciutils.x86_64 3.7.0-1.el8
pciutils-libs.x86_64 3.6.4-2.el8			      |	pciutils-libs.x86_64 3.7.0-1.el8
platform-python.x86_64 3.6.8-31.el8			      |	platform-python.x86_64 3.6.8-37.el8
platform-python-pip.noarch 9.0.3-18.el8			      |	platform-python-pip.noarch 9.0.3-19.el8
popt.x86_64 1.16-14.el8					      |	popt.x86_64 1.18-1.el8
procps-ng.x86_64 3.3.15-3.el8				      |	procps-ng.x86_64 3.3.15-6.el8
python3-asn1crypto.noarch 0.24.0-3.el8			      <
python3-cryptography.x86_64 2.3-3.el8			      |	python3-cryptography.x86_64 3.2.1-4.el8
python3-dnf.noarch 4.2.23-4.el8				      |	python3-dnf.noarch 4.4.2-11.el8
python3-dnf-plugins-core.noarch 4.0.17-5.el8		      |	python3-dnf-plugins-core.noarch 4.0.18-4.el8
python3-gpg.x86_64 1.13.1-3.el8				      |	python3-gpg.x86_64 1.13.1-7.el8
python3-hawkey.x86_64 0.48.0-5.el8			      |	python3-hawkey.x86_64 0.55.0-7.el8
python3-libcomps.x86_64 0.1.11-4.el8			      |	python3-libcomps.x86_64 0.1.11-5.el8
python3-libdnf.x86_64 0.48.0-5.el8			      |	python3-libdnf.x86_64 0.55.0-7.el8
python3-librepo.x86_64 1.12.0-2.el8			      |	python3-librepo.x86_64 1.12.0-3.el8
python3-libs.x86_64 3.6.8-31.el8			      |	python3-libs.x86_64 3.6.8-37.el8
python3-libselinux.x86_64 2.9-4.el8_3			      |	python3-libselinux.x86_64 2.9-5.el8
python3-libsemanage.x86_64 2.9-3.el8			      |	python3-libsemanage.x86_64 2.9-6.el8
python3-libxml2.x86_64 2.9.7-8.el8			      |	python3-libxml2.x86_64 2.9.7-9.el8
python3-linux-procfs.noarch 0.6.2-2.el8			      |	python3-linux-procfs.noarch 0.6.3-1.el8
python3-magic.noarch 5.33-16.el8			      |	python3-magic.noarch 5.33-16.el8_3.1
python3-perf.x86_64 4.18.0-240.15.1.el8_3		      |	python3-perf.x86_64 4.18.0-305.el8
python3-pip-wheel.noarch 9.0.3-18.el8			      |	python3-pip-wheel.noarch 9.0.3-19.el8
python3-ply.noarch 3.9-8.el8				      |	python3-ply.noarch 3.9-9.el8
python3-rpm.x86_64 4.14.3-4.el8				      |	python3-rpm.x86_64 4.14.3-13.el8
python3-subscription-manager-rhsm.x86_64 1.27.18-1.el8_3      |	python3-subscription-manager-rhsm.x86_64 1.28.13-2.el8
python3-syspurpose.x86_64 1.27.18-1.el8_3		      |	python3-syspurpose.x86_64 1.28.13-2.el8
python3-unbound.x86_64 1.7.3-14.el8			      |	python3-unbound.x86_64 1.7.3-15.el8
python3-urllib3.noarch 1.24.2-4.el8			      |	python3-urllib3.noarch 1.24.2-5.el8
qemu-guest-agent.x86_64 15:4.2.0-34.module+el8.3.0+9828+7aab3 |	qemu-guest-agent.x86_64 15:4.2.0-48.module+el8.4.0+10368+630e
redhat-release.x86_64 8.3-1.0.el8			      |	redhat-release.x86_64 8.4-0.6.el8
redhat-release-eula.x86_64 8.3-1.0.el8			      |	redhat-release-eula.x86_64 8.4-0.6.el8
rh-amazon-rhui-client.noarch 3.0.39-1.el8		      |	rh-amazon-rhui-client.noarch 3.0.40-1.el8
rng-tools.x86_64 6.8-3.el8				      |	rhc.x86_64 1:0.1.4-1.el8_4
rpm.x86_64 4.14.3-4.el8					      |	rpm.x86_64 4.14.3-13.el8
rpm-build-libs.x86_64 4.14.3-4.el8			      |	rpm-build-libs.x86_64 4.14.3-13.el8
rpm-libs.x86_64 4.14.3-4.el8				      |	rpm-libs.x86_64 4.14.3-13.el8
rpm-plugin-selinux.x86_64 4.14.3-4.el8			      |	rpm-plugin-selinux.x86_64 4.14.3-13.el8
rpm-plugin-systemd-inhibit.x86_64 4.14.3-4.el8		      |	rpm-plugin-systemd-inhibit.x86_64 4.14.3-13.el8
rsyslog.x86_64 8.1911.0-6.el8				      |	rsyslog.x86_64 8.1911.0-7.el8
sqlite-libs.x86_64 3.26.0-11.el8			      |	sqlite-libs.x86_64 3.26.0-13.el8
squashfs-tools.x86_64 4.3-19.el8			      |	squashfs-tools.x86_64 4.3-20.el8
sssd-nfs-idmap.x86_64 2.3.0-9.el8			      |	sssd-nfs-idmap.x86_64 2.4.0-9.el8
subscription-manager.x86_64 1.27.18-1.el8_3		      |	subscription-manager.x86_64 1.28.13-2.el8
subscription-manager-rhsm-certificates.x86_64 1.27.18-1.el8_3 |	subscription-manager-rhsm-certificates.x86_64 1.28.13-2.el8
							      >	tpm2-tss.x86_64 2.3.2-3.el8
trousers.x86_64 0.3.14-4.el8				      |	trousers.x86_64 0.3.15-1.el8
trousers-lib.x86_64 0.3.14-4.el8			      |	trousers-lib.x86_64 0.3.15-1.el8
tuned.noarch 2.14.0-3.el8_3.2				      |	tuned.noarch 2.15.0-2.el8
unbound-libs.x86_64 1.7.3-14.el8			      |	unbound-libs.x86_64 1.7.3-15.el8
util-linux.x86_64 2.32.1-24.el8				      |	util-linux.x86_64 2.32.1-27.el8
yum.noarch 4.2.23-4.el8					      |	yum.noarch 4.4.2-11.el8
yum-utils.noarch 4.0.17-5.el8				      |	yum-utils.noarch 4.0.18-4.el8
zlib.x86_64 1.2.11-16.el8_2				      |	zlib.x86_64 1.2.11-17.el8

Comment 48 Eric Garver 2021-08-26 21:30:20 UTC
(In reply to Russell Teague from comment #47)
> What version of kernel and iptables should be installed on the 8.4 host?

8.3 GA: kernel-4.18.0-240.el8, 1.8.4-15.el8
8.4 GA: kernel-4.18.0-305.el8, 1.8.4-17.el8

Looks like all the servers are using the 8.4 packages. Can you downgrade to the 8.3 kernel and iptables?
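A sketch of that downgrade on an affected 8.4 host (assuming the 8.3 builds named above are still available from the enabled repos):

# dnf install kernel-4.18.0-240.el8
# dnf downgrade iptables-1.8.4-15.el8 iptables-libs-1.8.4-15.el8 iptables-services-1.8.4-15.el8
# grubby --set-default /boot/vmlinuz-4.18.0-240.el8.x86_64
# reboot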

Comment 49 Russell Teague 2021-08-27 17:07:10 UTC
Downgraded kernel/iptables to kernel-4.18.0-240.el8, 1.8.4-15.el8 and the target down alerts are still firing.

Comment 50 Eric Garver 2021-08-27 18:10:57 UTC
(In reply to Russell Teague from comment #49)
> Downgraded kernel/iptables to kernel-4.18.0-240.el8, 1.8.4-15.el8 and the
> target down alerts are still firing.

That's unexpected since comment 21 and comment 22 say this doesn't work on RHEL-8.3. Did you remember to reboot after the kernel downgrade?

Comment 51 Russell Teague 2021-08-27 18:24:28 UTC
Yes, I remembered to reboot.

[ec2-user@ip-10-0-153-162 ~]$ uname -a
Linux ip-10-0-153-162.us-east-2.compute.internal 4.18.0-240.el8.x86_64 #1 SMP Wed Sep 23 05:13:10 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
[ec2-user@ip-10-0-153-162 ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.4 (Ootpa)
[ec2-user@ip-10-0-153-162 ~]$ 


comment 21 and comment 22 say this _does_ work on RHEL-8.3

Comment 52 Eric Garver 2021-08-27 19:18:17 UTC
(In reply to Martin Kennelly from comment #30)
> > This statement contradicts comment 23. In comment 23 you said the package/kernel versions were the same. That shouldn't be the case. They should definitely have different kernel versions. RHEL-8.3 is kernel-4.18.0-240.el8.
> 
> I didn't understand when I wrote that comment that when I added a RHEL 8.3
> node to a OCP 4.9 cluster, OCP updates numerous components/packages,
> including the kernel version to match RHEL 8.4, so it works when OCP updates
> 8.3 -> 8.4 but doesn't work on RHEL 8.4. Therefore, a change in 8.4 that is
> not managed by OCP is breaking it.

I missed this detail. To make it clear:

RHEL-8.3: PASS
RHEL-8.4 upgraded from RHEL-8.3: PASS
RHEL-8.4: FAIL

Russell, the list you gave in comment 47 - is "8.3" in that comment really 8.4 upgraded from 8.3? If so, then that's the list of packages you can try downgrading.
Some that jump out to me:

  - curl
  - iproute
  - openssl

Alternatively, try doing a full `dnf update` on the nodes that are upgraded from RHEL-8.3 to RHEL-8.4. I find it odd that OCP isn't doing a full update and is only upgrading select packages.

Comment 53 Russell Teague 2021-08-27 20:05:32 UTC
RHEL-8.4 upgraded from RHEL-8.3: PASS <-- This has not been tested to my knowledge. (see below)

The statement "when OCP updates 8.3 -> 8.4" is not fully correct because only select packages are updated.  In comment 34 I provided a link to the list(s) of packages that are updated when installing OCP.  It is not within the scope of RHEL compute scaleup in OCP to update all the packages on the node.

In comment 47, 8.3 refers to an 8.3 host that had OCP installed and select packages updated.

I will test again with RHEL-8.3 hosts fully upgraded.  I will collect the package differences between these hosts.

Comment 54 Eric Garver 2021-08-27 20:22:38 UTC
(In reply to Russell Teague from comment #53)
> RHEL-8.4 upgraded from RHEL-8.3: PASS <-- This has not been tested to my
> knowledge. (see below)
> 
> The statement "when OCP updates 8.3 -> 8.4" is not fully correct because
> only select packages are updated.

That was my point. It's a partial upgrade. It's neither RHEL-8.3 nor RHEL-8.4.

I said "RHEL-8.4 upgraded from RHEL-8.3" but I meant "RHEL-8.3 partially upgraded to RHEL-8.4". Sorry for the confusion.

> In comment 34 I provided a link to the
> list(s) of packages that are updated when installing OCP.

That was useful to see.

> It is not within
> the scope of RHEL compute scaleup in OCP to update all the packages on the
> node.

I fail to see how a partial upgrade is in scope, but a full upgrade is out of scope.

Comment 55 Russell Teague 2021-08-30 17:54:05 UTC
I installed two openshift clusters, one with RHEL 8.3 workers and one with RHEL 8.4 workers.  Both 8.3 and 8.4 hosts were fully upgraded based on current released versions for all packages.

After installing openshift, these are the package differences between the two hosts:
Installed Packages (8.3-upgraded-to-8.4)	     |	Installed Packages (8.4)
						     >	NetworkManager-cloud-setup.x86_64 1:1.30.0-10.el8_4
grub2-tools-efi.x86_64 1:2.02-99.el8		     <
python3-asn1crypto.noarch 0.24.0-3.el8		     <
rh-amazon-rhui-client.noarch 3.0.39-1.el8	     |	rh-amazon-rhui-client.noarch 3.0.40-1.el8
						     >	rhc.x86_64 1:0.1.4-1.el8_4
rng-tools.x86_64 6.8-3.el8			     <


I confirmed the 8.3-upgraded-to-8.4 host was functioning properly with no openshift alerts firing.  The 8.4 host was presenting the same issue as described in this bug.

I will step through the packages above to make them the same as the upgraded 8.3 host to see if that has any effect on the issue, although I don't know why/how any of these would be related.

Comment 56 Russell Teague 2021-08-30 18:44:28 UTC
Found the culprit. After uninstalling NetworkManager-cloud-setup, the problem went away. I confirmed this by installing fresh RHEL 8.4 hosts, uninstalling NetworkManager-cloud-setup, rebooting, and then installing openshift (worker scaleup) as normal. After draining the RHCOS nodes and running all pods on the RHEL nodes, there were no TargetDown alerts.

I need some help in tracking down what this package does, and why it is being included by default in RHEL 8.4 (at least in the public AWS AMI).

Here are a couple of links I came across while doing some research:
https://networkmanager.pages.freedesktop.org/NetworkManager/NetworkManager/nm-cloud-setup.html
https://github.com/coreos/fedora-coreos-tracker/issues/320
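For anyone hitting this before a fixed build lands, a workaround sketch is to disable the tool instead of uninstalling the package (assuming the unit names shipped with NetworkManager-cloud-setup in RHEL 8.4):

# systemctl disable --now nm-cloud-setup.service nm-cloud-setup.timer
# reboot    (or manually delete the routes/rules the tool already added)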

Comment 57 Russell Teague 2021-08-30 19:02:32 UTC
https://brew.engineering.redhat.com/brew/rpminfo?rpmID=9963538
Installs a nm-cloud-setup tool that can automatically configure
NetworkManager in cloud setups. Currently only EC2 is supported.
This tool is still experimental.


Package was built with this build:
NetworkManager-1.30.0-10.el8_4
https://brew.engineering.redhat.com/brew/buildinfo?buildID=1660864
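The same package information can also be checked directly on a node, for example:

# rpm -qi NetworkManager-cloud-setup
# rpm -q --changelog NetworkManager-cloud-setup | head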

Comment 58 Eric Garver 2021-08-30 20:52:53 UTC
Pinging the NM team. thaller, bgalvani, please see comment 56 and above if you need context.

tl;dr The presence of NetworkManager-cloud-setup causes unexpected TargetDown alarms because some monitoring service cannot be reached.

Comment 59 Thomas Haller 2021-08-31 13:03:26 UTC
reassigning to NetworkManager.


RHEL-8.4 images for AWS enable nm-cloud-setup by default. The idea is to automatically configure networking.

Obviously, if that causes problems with containers / Openshift, that's a severe issue.

It is clear what nm-cloud-setup does (it does what is implemented), but it's less clear how that is wrong and what it should do instead.


Russell, as these are "just VMs", would it be easily possible to share access to such a VM that exhibits the problem?
Alternatively, could you please attach:

  ip -4 addr
  ip -6 addr
  ip -4 rule
  ip -6 rule
  ip -4 route show table all
  ip -6 route show table all

Comment 60 Russell Teague 2021-08-31 13:45:48 UTC
I'm building a cluster and will provide access as well as attach the requested command output.

Comment 65 Thomas Haller 2021-09-16 15:44:23 UTC
should be fixed upstream with https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/974

Comment 70 Thomas Haller 2021-09-21 17:04:02 UTC
here is a scratch build with the fix:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=39868720


Any chance to test it with RHCOS?

Comment 71 Russell Teague 2021-09-23 14:38:33 UTC
This important issue is affecting OpenShift clusters on AWS where RHEL 8.4 nodes are deployed.  RHEL 8.4 worker node support will be GA in OpenShift 4.9.

I have tested the scratch build by installing the following packages on RHEL 8.4 nodes.  I observed local node prometheus targets reporting correctly when the prometheus pod is running on the local node.  The original issue reported appears resolved.

    - NetworkManager-1.32.10-3.el8.x86_64.rpm
    - NetworkManager-cloud-setup-1.32.10-3.el8.x86_64.rpm
    - NetworkManager-libnm-1.32.10-3.el8.x86_64.rpm
    - NetworkManager-team-1.32.10-3.el8.x86_64.rpm
    - NetworkManager-tui-1.32.10-3.el8.x86_64.rpm
    - NetworkManager-ovs-1.32.10-3.el8.x86_64.rpm
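For reference, installing the scratch packages locally is roughly (a sketch, assuming the RPMs from the brew task were downloaded to the current directory):

# dnf install ./NetworkManager*1.32.10-3.el8.x86_64.rpm
# systemctl restart NetworkManager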

Comment 77 Thomas Haller 2021-09-30 17:08:01 UTC
*** Bug 1995503 has been marked as a duplicate of this bug. ***

Comment 79 Vladimir Benes 2021-10-11 10:24:01 UTC
from Frank's email:
# mkdir -p /tmp/test
# echo 'testhah123' > /tmp/test/1
# cd /tmp/test
# podman run -dit --name my-apache-app -p 8080:80 -v "$PWD":/usr/local/apache2/htdocs/ httpd:2.4
# curl http://10.116.2.65:8080/1     (with NetworkManager-cloud-setup installed, curl fails; without nm-cloud-setup, curl succeeds)
testhah123

With nm-cloud-setup enabled (NetworkManager-cloud-setup-1.30.0-10.el8_4.x86_64), below is the route output:
[root@ip-10-116-2-65 test]# ip -4 route show table all|sort
10.116.2.0/24 dev eth0 proto kernel scope link src 10.116.2.65 metric 100 
10.116.2.1 dev eth0 table 30400 proto static scope link metric 10 
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 
broadcast 10.116.2.0 dev eth0 table local proto kernel scope link src 10.116.2.65 
broadcast 10.116.2.255 dev eth0 table local proto kernel scope link src 10.116.2.65 
broadcast 10.88.0.0 dev cni-podman0 table local proto kernel scope link src 10.88.0.1 
broadcast 10.88.255.255 dev cni-podman0 table local proto kernel scope link src 10.88.0.1 
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
default via 10.116.2.1 dev eth0 proto dhcp metric 100 
default via 10.116.2.1 dev eth0 table 30400 proto static metric 10 
local 10.116.2.65 dev eth0 table local proto kernel scope host src 10.116.2.65 
local 10.88.0.1 dev cni-podman0 table local proto kernel scope host src 10.88.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1 
[root@ip-10-116-2-65 test]# curl http://10.116.2.65:8080/1
curl: (7) Failed to connect to 10.116.2.65 port 8080: Connection timed out

With nm-cloud-setup disabled:
#  ip -4 route show table all|sort
10.116.2.0/24 dev eth0 proto kernel scope link src 10.116.2.65 metric 100 
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 
broadcast 10.116.2.0 dev eth0 table local proto kernel scope link src 10.116.2.65 
broadcast 10.116.2.255 dev eth0 table local proto kernel scope link src 10.116.2.65 
broadcast 10.88.0.0 dev cni-podman0 table local proto kernel scope link src 10.88.0.1 
broadcast 10.88.255.255 dev cni-podman0 table local proto kernel scope link src 10.88.0.1 
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
default via 10.116.2.1 dev eth0 proto dhcp metric 100 
local 10.116.2.65 dev eth0 table local proto kernel scope host src 10.116.2.65 
local 10.88.0.1 dev cni-podman0 table local proto kernel scope host src 10.88.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
[root@ip-10-116-2-65 test]# curl http://10.116.2.65:8080/1
testhah123

The behavior in RHEL-8.5 is the same as in RHEL-8.4. With nm-cloud-setup enabled in the fixed version (NetworkManager-cloud-setup-1.32.10-4.el8.x86_64), below is the route output:
[root@ip-10-116-2-122 test]# rpm -q NetworkManager-cloud-setup
NetworkManager-cloud-setup-1.32.10-4.el8.x86_64
[root@ip-10-116-2-122 test]# curl http://10.116.2.122:8080/1
testhaha
[root@ip-10-116-2-122 test]# ip -4 route show table all|sort
10.116.2.0/24 dev eth0 proto kernel scope link src 10.116.2.122 metric 100 
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 
broadcast 10.116.2.0 dev eth0 table local proto kernel scope link src 10.116.2.122 
broadcast 10.116.2.255 dev eth0 table local proto kernel scope link src 10.116.2.122
broadcast 10.88.0.0 dev cni-podman0 table local proto kernel scope link src 10.88.0.1 
broadcast 10.88.255.255 dev cni-podman0 table local proto kernel scope link src 10.88.0.1 
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
default via 10.116.2.1 dev eth0 proto dhcp metric 100 
local 10.116.2.122 dev eth0 table local proto kernel scope host src 10.116.2.122 
local 10.88.0.1 dev cni-podman0 table local proto kernel scope host src 10.88.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
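The extra routes in table 30400 are only consulted when a policy-routing rule points at that table; comparing the rule list with nm-cloud-setup enabled and disabled (alongside the route dumps above) shows the source-based rule the tool adds. A sketch:

# ip -4 rule show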

Comment 82 errata-xmlrpc 2021-11-09 19:30:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: NetworkManager security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4361

