Description of problem:
The pod's secondary interface cannot get an IP address from the local DHCP server. The same test case has failed since around the 5/12 image.

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-05-20-053050

How reproducible:
Always

Steps to Reproduce:
Follow the steps from https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-24466

Actual results:
[weliang@weliang verification-tests]$ oc get pod
NAME                                          READY   STATUS              RESTARTS   AGE
ip-10-0-60-30us-east-2computeinternal-debug   1/1     Running             0          14m
pod-macvlan-private-ipam-dhcp                 0/1     ContainerCreating   0          2m59s
[weliang@weliang verification-tests]$ oc describe pod pod-macvlan-private-ipam-dhcp
Name:         pod-macvlan-private-ipam-dhcp
Namespace:    test
Priority:     0
Node:         ip-10-0-60-30.us-east-2.compute.internal/10.0.60.30
Start Time:   Wed, 20 May 2020 14:44:03 -0400
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/networks: testmacvlan
              openshift.io/scc: anyuid
Status:       Pending
IP:
IPs:          <none>
Containers:
  pod-macvlan-private-ipam-dhcp:
    Container ID:
    Image:         dougbtv/centos-network
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      sleep 2000000000000
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gxbxv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-gxbxv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-gxbxv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age        From                                                Message
  ----     ------                  ----       ----                                                -------
  Normal   Scheduled               <unknown>  default-scheduler                                   Successfully assigned test/pod-macvlan-private-ipam-dhcp to ip-10-0-60-30.us-east-2.compute.internal
  Normal   AddedInterface          2m59s      multus                                              Add eth0 [10.129.2.174/23]
  Warning  FailedCreatePodSandBox  2m58s      kubelet, ip-10-0-60-30.us-east-2.compute.internal   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_pod-macvlan-private-ipam-dhcp_test_2f65bbd6-e04b-4d6c-a981-ddec90ae3791_0(d3344efecfe06f6441a37a9a98cf16fb45f396e3ceb1937375bbcca5402f24aa): Multus: [test/pod-macvlan-private-ipam-dhcp]: error adding container to network "testmacvlan": delegateAdd: error invoking confAdd - "macvlan": error in getting result from AddNetwork: error calling DHCP.Allocate: failed to Statfs "/host/var/run/netns/791a8a71-25a0-411a-b92f-25495ad58997": no such file or directory

Expected results:
The pod should get an IP address for its secondary interface.

Additional info:
Tomofumi Hayashi identified the root cause: the DHCP CNI daemon was not properly mounting /run/netns. Why the requirement for this mount point changed is unknown at this time. The fix is implemented in https://github.com/openshift/cluster-network-operator/pull/645.
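
For anyone triaging a similar failure, the following is a minimal diagnostic sketch and is not part of the fix in the PR above. It pulls the path out of a "failed to Statfs" message like the one in the description and checks whether that path is visible from wherever the script runs (for example, inside the DHCP daemon container). The function names extract_statfs_path and path_is_visible are hypothetical, and Python's os.statvfs is used here only as a stand-in for the plugin's own Statfs call.

```python
import os
import re

# Matches the quoted path in messages such as:
#   failed to Statfs "/host/var/run/netns/<id>": no such file or directory
STATFS_ERROR = re.compile(r'failed to Statfs "(?P<path>[^"]+)"')


def extract_statfs_path(message):
    """Return the path named in a 'failed to Statfs' error, or None if absent."""
    match = STATFS_ERROR.search(message)
    return match.group("path") if match else None


def path_is_visible(path):
    """Return True if statvfs succeeds on path from this process's mount view."""
    try:
        os.statvfs(path)
    except OSError:
        # Includes FileNotFoundError, which is what a missing mount looks like.
        return False
    return True


if __name__ == "__main__":
    event = (
        'error calling DHCP.Allocate: failed to Statfs '
        '"/host/var/run/netns/791a8a71-25a0-411a-b92f-25495ad58997": '
        'no such file or directory'
    )
    netns_path = extract_statfs_path(event)
    print("path from error:", netns_path)
    # Check the parent directory too: if it is missing as well, the whole
    # netns directory is not mounted into this environment.
    print("netns file visible:", path_is_visible(netns_path))
    print("netns dir visible:", path_is_visible(os.path.dirname(netns_path)))
```

If the parent directory is not visible from inside the daemon container while network namespaces exist on the node, that is consistent with the missing-mount explanation above.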
Tested and verified in 4.5.0-0.nightly-2020-05-28-090106

[weliang@weliang ~]$ oc get pods
NAME                                            READY   STATUS    RESTARTS   AGE
ip-10-0-129-29us-east-2computeinternal-debug    1/1     Running   0          9m4s
ip-10-0-214-213us-east-2computeinternal-debug   1/1     Running   0          8m51s
pod-macvlan-private-ipam-dhcp                   1/1     Running   0          22s
[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-28-090106   True        False         31m     Cluster version is 4.5.0-0.nightly-2020-05-28-090106
[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-28-090106   True        False         31m     Cluster version is 4.5.0-0.nightly-2020-05-28-090106
[weliang@weliang ~]$ oc get project | grep ovn
openshift-ovn-kubernetes   Active
[weliang@weliang ~]$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409