Bug 1838251

Summary: [Multus] error calling DHCP.Allocate
Product: OpenShift Container Platform Reporter: Weibin Liang <weliang>
Component: Networking Assignee: Douglas Smith <dosmith>
Networking sub component: multus QA Contact: Weibin Liang <weliang>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: bbennett, dosmith, tohayash
Version: 4.5   
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-07-13 17:40:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1851489    

Description Weibin Liang 2020-05-20 18:19:16 UTC
Description of problem:
The pod's secondary interface cannot get an IP address from the local DHCP server.
The same test case has been failing since around the 5/12 image.

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-05-20-053050

How reproducible:
Always

Steps to Reproduce:
Follow steps from https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-24466
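
The linked test case is not reproduced in this report. As a rough sketch only, assuming the secondary network is a macvlan attachment with DHCP IPAM (as the pod name and the error message in the results suggest), a definition along these lines would match the testmacvlan annotation and the test namespace. The master interface name, mode, and CNI version are assumptions and are not taken from the test case.

```yaml
# Hypothetical NetworkAttachmentDefinition for the "testmacvlan" network.
# Assumptions: "master" (ens5), "mode" (bridge), and "cniVersion" are
# illustrative values; only the name, namespace, plugin type, and DHCP IPAM
# are implied by the bug report.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: testmacvlan
  namespace: test
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens5",
      "mode": "bridge",
      "ipam": { "type": "dhcp" }
    }
```

A pod then requests the network through the k8s.v1.cni.cncf.io/networks: testmacvlan annotation.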

Actual results:
[weliang@weliang verification-tests]$ oc get pod
NAME                                          READY   STATUS              RESTARTS   AGE
ip-10-0-60-30us-east-2computeinternal-debug   1/1     Running             0          14m
pod-macvlan-private-ipam-dhcp                 0/1     ContainerCreating   0          2m59s
[weliang@weliang verification-tests]$ oc describe pod pod-macvlan-private-ipam-dhcp
Name:         pod-macvlan-private-ipam-dhcp
Namespace:    test
Priority:     0
Node:         ip-10-0-60-30.us-east-2.compute.internal/10.0.60.30
Start Time:   Wed, 20 May 2020 14:44:03 -0400
Labels:       <none>
Annotations:  k8s.v1.cni.cncf.io/networks: testmacvlan
              openshift.io/scc: anyuid
Status:       Pending
IP:           
IPs:          <none>
Containers:
  pod-macvlan-private-ipam-dhcp:
    Container ID:  
    Image:         dougbtv/centos-network
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      sleep 2000000000000
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gxbxv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-gxbxv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-gxbxv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                From                                               Message
  ----     ------                  ----               ----                                               -------
  Normal   Scheduled               <unknown>          default-scheduler                                  Successfully assigned test/pod-macvlan-private-ipam-dhcp to ip-10-0-60-30.us-east-2.compute.internal
  Normal   AddedInterface          2m59s              multus                                             Add eth0 [10.129.2.174/23]
  Warning  FailedCreatePodSandBox  2m58s              kubelet, ip-10-0-60-30.us-east-2.compute.internal  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_pod-macvlan-private-ipam-dhcp_test_2f65bbd6-e04b-4d6c-a981-ddec90ae3791_0(d3344efecfe06f6441a37a9a98cf16fb45f396e3ceb1937375bbcca5402f24aa): Multus: [test/pod-macvlan-private-ipam-dhcp]: error adding container to network "testmacvlan": delegateAdd: error invoking confAdd - "macvlan": error in getting result from AddNetwork: error calling DHCP.Allocate: failed to Statfs "/host/var/run/netns/791a8a71-25a0-411a-b92f-25495ad58997": no such file or directory


Expected results:
The pod should get an IP address for its secondary interface.

Additional info:

Comment 2 Douglas Smith 2020-05-21 20:31:31 UTC
Tomofumi Hayashi identified the root cause: the DHCP CNI daemon was not properly mounting /run/netns.

However, it is not yet known what changed to make this mount point a requirement.

The fix is provided in https://github.com/openshift/cluster-network-operator/pull/645.
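
For illustration, a mount of the kind described in this comment could look like the DaemonSet sketch below. It is not copied from the pull request: the object names, namespace, image, and security settings are placeholders, and the container-side path is an assumption inferred from the /host/var/run/netns/... path in the reported error.

```yaml
# Hypothetical DaemonSet sketch: expose the host's network namespace
# directory to a DHCP CNI daemon container. All names are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dhcp-daemon              # hypothetical name
  namespace: openshift-multus    # assumed namespace
spec:
  selector:
    matchLabels:
      app: dhcp-daemon
  template:
    metadata:
      labels:
        app: dhcp-daemon
    spec:
      hostNetwork: true
      containers:
      - name: dhcp-daemon
        image: example.invalid/dhcp-daemon:latest   # placeholder image
        securityContext:
          privileged: true
        volumeMounts:
        - name: netns
          # Assumed path, inferred from the Statfs error in the description.
          mountPath: /host/var/run/netns
          # Likely needed so that namespaces created after the container
          # starts become visible inside it.
          mountPropagation: HostToContainer
      volumes:
      - name: netns
        hostPath:
          path: /run/netns
```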

Comment 5 Weibin Liang 2020-05-28 20:26:32 UTC
Tested and verified in 4.5.0-0.nightly-2020-05-28-090106

[weliang@weliang ~]$ oc get pods
NAME                                            READY   STATUS    RESTARTS   AGE
ip-10-0-129-29us-east-2computeinternal-debug    1/1     Running   0          9m4s
ip-10-0-214-213us-east-2computeinternal-debug   1/1     Running   0          8m51s
pod-macvlan-private-ipam-dhcp                   1/1     Running   0          22s
[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-28-090106   True        False         31m     Cluster version is 4.5.0-0.nightly-2020-05-28-090106
[weliang@weliang ~]$ oc get project | grep ovn
openshift-ovn-kubernetes                                          Active
[weliang@weliang ~]$

Comment 6 errata-xmlrpc 2020-07-13 17:40:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409