Bug 2083616

Summary: AWS, GCP, and Azure cilium installs are failing e2e tests
Product: OpenShift Container Platform
Component: Networking
Networking sub component: openshift-sdn
Reporter: Stephen Benjamin <stbenjam>
Assignee: Nate Sweet <nathan.sweet>
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED DEFERRED
Severity: medium
Priority: medium
CC: anbhat, astoycos, errordeveloper, ffernand, mkennell, nathan.sweet, sippy, vrutkovs, zzhao
Version: 4.9
Target Release: 4.13.0
Hardware: Unspecified
OS: Unspecified
Clone Of: 2007580
Environment: job=periodic-ci-openshift-release-master-ci-4.9-e2e-azure-cilium=all job=periodic-ci-openshift-release-master-ci-4.10-e2e-azure-cilium=all
Last Closed: 2023-02-22 09:53:23 UTC
Bug Depends On: 2007580

Comment 2 Nate Sweet 2022-10-10 14:17:49 UTC
To conform to the sig-network tests, the following CiliumConfig needs to be used (it is not the installation default, for reasons stated below):

```
apiVersion: cilium.io/v1alpha1
kind: CiliumConfig
metadata:
  name: cilium
  namespace: cilium
spec:
  debug:
    enabled: true
  k8s:
    requireIPv4PodCIDR: true
  pprof:
    enabled: true
  logSystemLoad: true
  bpf:
    preallocateMaps: true
  etcd:
    leaseTTL: 30s
  ipv4:
    enabled: true
  ipv6:
    enabled: true
  identityChangeGracePeriod: 0s
  ipam:
    mode: "cluster-pool"
    operator:
      clusterPoolIPv4PodCIDR: "10.128.0.0/14"
      clusterPoolIPv4MaskSize: "23"
  nativeRoutingCIDR: "10.128.0.0/14"
  endpointRoutes: {enabled: true}
  kubeProxyReplacement: "probe"
  clusterHealthPort: 9940
  tunnelPort: 4789
  cni:
    binPath: "/var/lib/cni/bin"
    confPath: "/var/run/multus/cni/net.d"
    chainingMode: portMap
  prometheus:
    serviceMonitor: {enabled: false}
  hubble:
    tls: {enabled: false}
```

Note that `identityChangeGracePeriod: 0s` does not scale for production environments.

Finally, the following 2 tests will fail no matter what:
1. NetworkPolicy between server and client should ensure an IP overlapping both IPBlock.CIDR and IPBlock.Except is allowed (Cilium does not allow CIDR blocks to define internal traffic and only supports identity-based policy mapping for internal traffic).
2. NetworkPolicy between server and client should not allow access by TCP when a policy specifies only SCTP (Cilium does not support SCTP).
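
As a rough sanity check on the cluster-pool values in the CiliumConfig above, the following sketch works out how many per-node pod subnets a `10.128.0.0/14` pool yields at mask size 23, and how many addresses each subnet holds. The helper name `pool_capacity` is hypothetical and not part of Cilium; it uses only Python's standard `ipaddress` module and counts raw addresses, without accounting for any that Cilium may reserve.

```python
import ipaddress
from typing import Tuple


def pool_capacity(pod_cidr: str, node_mask_size: int) -> Tuple[int, int]:
    """Return (node_subnets, addresses_per_subnet) for a cluster pool.

    Hypothetical helper for illustration only; it mirrors the
    clusterPoolIPv4PodCIDR / clusterPoolIPv4MaskSize pair from the config.
    """
    network = ipaddress.ip_network(pod_cidr)
    if not network.prefixlen <= node_mask_size <= network.max_prefixlen:
        raise ValueError("node mask size must lie within the pool prefix range")
    # Each extra prefix bit doubles the number of per-node subnets.
    node_subnets = 2 ** (node_mask_size - network.prefixlen)
    # The remaining host bits determine the raw address count per subnet.
    addresses_per_subnet = 2 ** (network.max_prefixlen - node_mask_size)
    return node_subnets, addresses_per_subnet


subnets, addresses = pool_capacity("10.128.0.0/14", 23)
print(f"{subnets} node subnets, {addresses} addresses each")
```

With these values the pool splits into 512 node subnets of 512 addresses each, so the number of nodes is bounded by the pool size unless the CIDR or mask size is changed.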

Comment 3 Martin Kennelly 2023-01-16 16:38:25 UTC
Hi @nathan.sweet,

We are clearing our bug backlog and hope to resolve this issue.
A couple of questions:

Do you plan to fix this? If not, please close this issue.
The test cases mentioned are located in the "origin" repository, right? If so, please set the component to origin.
Since this is not an issue with OpenShift SDN but is being tracked as one, could you track any potential fix in a Jira issue with the component set to Cilium?

Comment 4 Martin Kennelly 2023-01-25 15:26:29 UTC
Nate, I will close this issue in two weeks if there are no responses. Thank you for your time.

Comment 5 Nate Sweet 2023-01-30 17:59:35 UTC
> Do you plan to fix this? If no, please close this issue.

I don't know how to implement the suggested fix. I don't know where this CI exists.

Comment 6 Martin Kennelly 2023-01-31 14:34:33 UTC
@nathan.sweet Can you reach out to me on Kubernetes Slack (mkennell)? I'll show you where the tests are and answer any questions that I can.

Comment 7 Nate Sweet 2023-02-06 09:07:01 UTC
Submitted PR https://github.com/openshift/release/pull/36066 to fix.

Comment 8 Martin Kennelly 2023-02-15 13:06:26 UTC
@nathan.sweet Hey Nate, I see the Cilium CI is failing even with your PR. Do you have a timeline for when you can look at this?

Comment 9 Martin Kennelly 2023-02-22 09:53:23 UTC
Talked to Nate regarding this issue. We agreed this issue should be closed.
We don't have a component for Cilium. This issue was opened against openshift-sdn.
There is no need for a bug anyway because the fix is going into master.

Comment 10 Red Hat Bugzilla 2023-09-18 04:36:53 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days