Description of problem:
Enabling IPsec at runtime does not work in Hypershift.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-05-20-213928

How reproducible:
Every time

Steps to Reproduce:

#### Hypershift
[weliang@weliang bin]$ oc patch networks.operator.openshift.io cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{ }}}}}'
network.operator.openshift.io/cluster patched
[weliang@weliang bin]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-20-213928   True        False         122m    Cluster version is 4.11.0-0.nightly-2022-05-20-213928
[weliang@weliang ~]$ oc debug node/ip-10-0-138-246.us-east-2.compute.internal
[root@ip-10-0-138-246 /]# tcpdump -i br-ex | grep ESP
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-ex, link-type EN10MB (Ethernet), capture size 262144 bytes

#### OCP dual-stack cluster
[weliang@weliang bin]$ oc patch networks.operator.openshift.io cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{ }}}}}'
network.operator.openshift.io/cluster patched
[weliang@weliang bin]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-25-193227   True        False         142m    Cluster version is 4.11.0-0.nightly-2022-05-25-193227
[weliang@weliang bin]$ oc debug node/worker-00.weliang-5272.qe.devcluster.openshift.com
[root@worker-00 /]# tcpdump -i br-ex | grep ESP
17:22:49.425809 IP worker-00.weliang-5272.qe.devcluster.openshift.com > master-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0x8966e58c,seq=0x3e4), length 188
17:22:49.425873 IP worker-00.weliang-5272.qe.devcluster.openshift.com > master-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0x8966e58c,seq=0x3e5), length 448
17:22:49.425938 IP master-00.weliang-5272.qe.devcluster.openshift.com > worker-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0xdd8582b6,seq=0x447), length 160
17:22:49.426070 IP master-00.weliang-5272.qe.devcluster.openshift.com > worker-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0xdd8582b6,seq=0x448), length 160
17:22:49.426108 IP worker-00.weliang-5272.qe.devcluster.openshift.com > master-00.weliang-5272.qe.devcluster.openshift.com: ESP(spi=0x8966e58c,seq=0x3e6), length 124

Actual results:
"tcpdump -i br-ex | grep ESP" returns no packets in Hypershift.

Expected results:
"tcpdump -i br-ex | grep ESP" should return packets in Hypershift.

Additional info:
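The `tcpdump -i br-ex | grep ESP` check in the report runs until interrupted, which makes "no packets" hard to distinguish from "not yet". A minimal sketch of a bounded variant follows. The helper name `count_esp`, the 30-second window, and the use of `timeout` are assumptions added for illustration and were not part of the original reproduction.

```shell
# Count ESP packet lines in tcpdump text output read from stdin.
# count_esp is a hypothetical helper name, not part of tcpdump or oc.
count_esp() {
    # grep -c prints the match count; it exits non-zero on zero matches,
    # so "|| true" keeps the helper's exit status at 0 either way.
    grep -c 'ESP(spi=' || true
}

# Assumed usage from inside `oc debug node/<node>`:
# capture ESP traffic on br-ex for at most 30 seconds, then print the count.
#   timeout 30 tcpdump -l -nn -i br-ex esp 2>/dev/null | count_esp
```

A count of 0 after the window would match the Hypershift result reported here; any positive count would match the dual-stack result.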
This is because the `ovn-keys` init container in the `ovn-ipsec` DaemonSet fails due to incorrect RBAC:

```
+ kubectl delete --ignore-not-found=true csr/ip-10-0-133-131
Error from server (Forbidden): certificatesigningrequests.certificates.k8s.io "ip-10-0-133-131" is forbidden: User "system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-node" cannot delete resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
```

This seems to be caused by https://github.com/openshift/cluster-network-operator/pull/1450, which moved the CSR management permissions from a ClusterRole to a Role. CSRs are cluster-scoped resources, so a namespaced Role cannot grant access to them.

I can see in the above output that the OCP version you used for Hypershift is newer than the one for the dual-stack cluster, which explains why the Hypershift cluster has this issue even though Hypershift itself is not the cause. Reassigning this to the networking team.
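The permission gap can also be checked directly instead of waiting for the init container to fail. The sketch below asks the API server whether the service account named in the error may delete CSRs. `OC`, `SA`, and `can_delete_csr` are hypothetical names introduced here, and the check assumes the caller is allowed to impersonate service accounts (for example, as cluster-admin).

```shell
# Client binary; overridable so kubectl (or a stub) can be used instead.
OC="${OC:-oc}"
# Service account taken from the Forbidden error in the init container log.
SA="system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-node"

# Hypothetical helper: succeeds only if the service account may delete CSRs.
# `auth can-i` prints "yes" or "no" and exits non-zero when access is denied.
can_delete_csr() {
    "$OC" auth can-i delete certificatesigningrequests.certificates.k8s.io --as="$SA"
}

# Assumed usage:
#   can_delete_csr && echo "CSR delete allowed" || echo "CSR delete denied"
```

On an affected build this would be expected to report that the delete is denied, consistent with the log above.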
Yes, this is not Hypershift-specific. I hit the same issue in an ARM cluster with IPsec enabled.

OCP version: 4.11.0-0.nightly-arm64-2022-05-31-155531

++ hostname
+ kubectl delete --ignore-not-found=true csr/master-02.lwan-38983.qeclusters.arm.eng.rdu2.redhat.com
Error from server (Forbidden): certificatesigningrequests.certificates.k8s.io "master-02.lwan-38983.qeclusters.arm.eng.rdu2.redhat.com" is forbidden: User "system:serviceaccount:openshift-ovn-kubernetes:ovn-kubernetes-node" cannot delete resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
reason: Error
Tested and verified in 4.11.0-0.nightly-2022-06-04-014713

[root@weliang-662-9cw5x-worker-a-49wrg /]# tcpdump -i br-ex | grep ESP
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-ex, link-type EN10MB (Ethernet), capture size 262144 bytes
15:19:35.772883 IP weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal > weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal: ESP(spi=0xc8b587e4,seq=0x120), length 164
15:19:35.773608 IP weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal > weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal: ESP(spi=0x95fbb5c3,seq=0x132), length 124
15:19:35.776499 IP weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal > weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal: ESP(spi=0x95fbb5c3,seq=0x133), length 184
15:19:35.776585 IP weliang-662-9cw5x-worker-a-49wrg.c.openshift-qe.internal > weliang-662-9cw5x-worker-c-vtqbv.c.openshift-qe.internal: ESP(spi=0x95fbb5c3,seq=0x134), length 1432
*** Bug 2093393 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069