Bug 2091167 - IPsec runtime enabling not work in hypershift
Summary: IPsec runtime enabling not work in hypershift
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Mohamed Mahmoud
QA Contact: Anurag Saxena
URL:
Whiteboard:
Duplicates: 2093393
Depends On:
Blocks:
 
Reported: 2022-05-27 17:27 UTC by Weibin Liang
Modified: 2022-08-10 11:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:14:45 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1463 | 0 | None | open | Bug 2091167: incorrectly setting rbac role for certificatesigningrequests | 2022-05-27 19:24:20 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:14:57 UTC

Description Weibin Liang 2022-05-27 17:27:10 UTC
Description of problem:
Enabling IPsec at runtime does not work in HyperShift.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-05-20-213928

How reproducible:
Every time

Steps to Reproduce:
#### Hypershift
[weliang@weliang bin]$ oc patch networks.operator.openshift.io cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{ }}}}}'
network.operator.openshift.io/cluster patched
[weliang@weliang bin]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-20-213928   True        False         122m    Cluster version is 4.11.0-0.nightly-2022-05-20-213928
[weliang@weliang ~]$ oc debug node/ip-10-0-138-246.us-east-2.compute.internal
[root@ip-10-0-138-246 /]# tcpdump  -i br-ex | grep ESP    
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-ex, link-type EN10MB (Ethernet), capture size 262144 bytes


#### OCP dual-stack cluster
[weliang@weliang bin]$ oc patch networks.operator.openshift.io cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{ }}}}}'
network.operator.openshift.io/cluster patched
[weliang@weliang bin]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-25-193227   True        False         142m    Cluster version is 4.11.0-0.nightly-2022-05-25-193227
[weliang@weliang bin]$ oc debug node/worker-00.weliang-5272.qe.devcluster.openshift.com
[root@worker-00 /]# tcpdump  -i br-ex | grep ESP
17:22:49.425809 IP worker-00.weliang-5272.qe.devcluster.openshift.com > master-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0x8966e58c,seq=0x3e4), length 188
17:22:49.425873 IP worker-00.weliang-5272.qe.devcluster.openshift.com > master-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0x8966e58c,seq=0x3e5), length 448
17:22:49.425938 IP master-00.weliang-5272.qe.devcluster.openshift.com > worker-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0xdd8582b6,seq=0x447), length 160
17:22:49.426070 IP master-00.weliang-5272.qe.devcluster.openshift.com > worker-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0xdd8582b6,seq=0x448), length 160
17:22:49.426108 IP worker-00.weliang-5272.qe.devcluster.openshift.com > master-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0x8966e58c,seq=0x3e6), length 124


Actual results:
"tcpdump  -i br-ex | grep ESP" returns no packets in HyperShift.

Expected results:
"tcpdump  -i br-ex | grep ESP" should return ESP packets in HyperShift.

Additional info:
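
The reproduction steps above can be wrapped in two small helper functions. This is only a sketch: the function names are hypothetical, and the `esp` capture filter, the `-c 5` packet limit and the 30-second timeout are assumptions that replace the open-ended `| grep ESP` pipeline so the capture ends on its own. The patch payload and the `br-ex` interface are taken from the steps above.

```shell
#!/bin/sh
# Sketch of the reproduction steps; nothing runs until a function is called.

enable_ipsec() {
  # Same merge patch as in the steps above (adds an ipsecConfig object).
  oc patch networks.operator.openshift.io cluster --type=merge \
    -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{}}}}}'
}

capture_esp() {
  # $1: node name. Capture at most 5 ESP packets on br-ex, or stop after 30 s.
  # The filter, packet count and timeout are assumptions, not from the report.
  oc debug "node/$1" -- timeout 30 tcpdump -nn -i br-ex -c 5 esp
}
```

Calling `enable_ipsec` and then `capture_esp <node-name>` should print ESP packets on a cluster where IPsec came up, and print none before the timeout on a cluster affected by this bug.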

Comment 1 aaleman 2022-05-27 17:51:19 UTC
This is because the `ovn-keys` init container in the `ovn-ipsec` DaemonSet fails due to incorrect RBAC:

```
+ kubectl delete --ignore-not-found=true csr/ip-10-0-133-131
Error from server (Forbidden): certificatesigningrequests.certificates.k8s.io "ip-10-0-133-131" is forbidden: User "system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-node" cannot delete resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
```

This seems to be caused by https://github.com/openshift/cluster-network-operator/pull/1450, which moved the CSR management permissions from a ClusterRole to a Role. CertificateSigningRequests are cluster-scoped, so a namespaced Role cannot grant access to them. I can see in the above output that the OCP version you used for HyperShift is newer than the one for the dual-stack cluster, which explains why the HyperShift cluster has this issue, despite it not being caused by HyperShift itself.

Reassigning this to the networking team.
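
One way to confirm the missing permission without waiting for the init container to fail is to ask the API server directly. This is a minimal sketch: the function name is hypothetical, the service account string is copied from the Forbidden error above, and the check assumes the caller is allowed to impersonate that service account (for example as a cluster administrator).

```shell
#!/bin/sh
# Service account taken verbatim from the Forbidden error in this comment.
SA="system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-node"

check_csr_delete() {
  # CSRs are cluster-scoped, so no namespace flag is passed.
  # "oc auth can-i" prints yes or no and exits non-zero when the answer is no.
  oc auth can-i delete certificatesigningrequests.certificates.k8s.io --as="$SA"
}
```

On an affected build, `check_csr_delete` would be expected to answer `no`; after a fix that restores cluster-scoped permissions it should answer `yes`.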

Comment 3 wang lin 2022-06-01 10:21:22 UTC
Yes, this is not HyperShift-specific. I hit the same issue on an arm cluster with IPsec enabled.

ocp version: 4.11.0-0.nightly-arm64-2022-05-31-155531


          ++ hostname
          + kubectl delete --ignore-not-found=true csr/master-02.lwan-38983.qeclusters.arm.eng.rdu2.redhat.com
          Error from server (Forbidden): certificatesigningrequests.certificates.k8s.io "master-02.lwan-38983.qeclusters.arm.eng.rdu2.redhat.com" is forbidden: User "system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-node" cannot delete resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
        reason: Error
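
To check another cluster for the same failure, the init container log can be read straight from the DaemonSet. This is a sketch only: the function name is hypothetical, the DaemonSet and container names come from comment 1, and the namespace is assumed to match the one in the service account name from the error.

```shell
#!/bin/sh
show_ovn_keys_logs() {
  # Prints the ovn-keys init container log from one pod of the ovn-ipsec DaemonSet.
  # The namespace is an assumption based on the service account in the error message.
  oc -n openshift-ovn-kubernetes logs ds/ovn-ipsec -c ovn-keys
}
```

A `Forbidden` line for `certificatesigningrequests` in that output points to the same RBAC problem.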

Comment 6 Weibin Liang 2022-06-06 15:21:02 UTC
Tested and verified in 4.11.0-0.nightly-2022-06-04-014713

[root@weliang-662-9cw5x-worker-a-49wrg /]# tcpdump  -i br-ex | grep ESP
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-ex, link-type EN10MB (Ethernet), capture size 262144 bytes
15:19:35.772883 IP weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal > weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal: ESP(spi=0xc8b587e4,seq=0x120), length 164
15:19:35.773608 IP weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal > weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal: ESP(spi=0x95fbb5c3,seq=0x132), length 124
15:19:35.776499 IP weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal > weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal: ESP(spi=0x95fbb5c3,seq=0x133), length 184
15:19:35.776585 IP weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal > weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal: ESP(spi=0x95fbb5c3,seq=0x134), length 1432

Comment 7 Mike Fiedler 2022-06-07 13:01:04 UTC
*** Bug 2093393 has been marked as a duplicate of this bug. ***

Comment 9 errata-xmlrpc 2022-08-10 11:14:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

