Bug 1612702
| Summary: | Failed to create egress router pod due to add route failed when enabling macvlan | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Meng Bo <bmeng> |
| Component: | Networking | Assignee: | Casey Callendrello <cdc> |
| Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.11.0 | CC: | aos-bugs, danw, dmace, pasik, rpenta, weliang |
| Target Milestone: | --- | ||
| Target Release: | 3.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-10-11 07:23:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
@cdc @dcbw The "file exists" error suggests that the route being added is already present. I think we need to ignore the error in this case, or alternatively add the route only if it doesn't exist.

Rajat found what is probably the root cause: https://github.com/openshift/origin/pull/20115

Either way, the answer is to ignore "already exists" errors. Working on a patch now.

Fix is in https://github.com/openshift/origin/pull/20601

Bo, can you test this without that PR merging? I wasn't able to reproduce the fix locally.

(In reply to Casey Callendrello from comment #8)
> Fix is in https://github.com/openshift/origin/pull/20601
>
> Bo, can you test this without that PR merging? I wasn't able to reproduce
> the fix locally.

I tried this on the latest OCP 3.11 build: after rebuilding the sdn-cni-plugin with the fix, renaming it to openshift-sdn, and replacing the one under /opt/cni/bin/, the egress router can be created.

Fix is merged.

Tested on build v3.11.0-0.17.0; the issue has been fixed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
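The approach discussed in the comments, treating an "already exists" error from the route add as success, can be sketched as follows. This is only an illustration in Python, not the code from the linked pull requests: `add_route_idempotent`, `FakeRouteTable`, and the `add_route` callable are hypothetical names, and the in-memory table stands in for the kernel routing table. The sketch relies on the fact that "file exists" is the standard message for the `EEXIST` errno.

```python
import errno
import os


def add_route_idempotent(add_route, dst, via):
    """Add a route, treating EEXIST ("File exists") as success.

    add_route is a hypothetical callable standing in for whatever actually
    programs the route; it is assumed to raise OSError on failure.
    Returns True if the route was newly added, False if it already existed.
    Any other error is re-raised unchanged.
    """
    try:
        add_route(dst, via)
    except OSError as exc:
        if exc.errno == errno.EEXIST:
            return False  # route already present: nothing to do
        raise
    return True


class FakeRouteTable:
    """In-memory stand-in for a routing table (assumption, for illustration)."""

    def __init__(self):
        self.routes = {}

    def add(self, dst, via):
        # Mimic the kernel: adding a duplicate destination fails with EEXIST.
        if dst in self.routes:
            raise OSError(errno.EEXIST, os.strerror(errno.EEXIST))
        self.routes[dst] = via


if __name__ == "__main__":
    table = FakeRouteTable()
    # Destination and "via SDN" are taken from the error in the node log.
    print(add_route_idempotent(table.add, "10.66.140.77/32", "SDN"))
    print(add_route_idempotent(table.add, "10.66.140.77/32", "SDN"))
```

Only `EEXIST` is swallowed here; other failures (for example a permission error) still propagate, so a genuinely broken route setup is not hidden.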
Description of problem:
When creating an egress router, the pod fails to start and the following errors appear in the node log:

    Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: E0806 14:14:05.272902 23785 cni.go:260] Error adding network: failed to add route to dst: 10.66.140.77/32 via SDN: file exists
    Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: E0806 14:14:05.272939 23785 cni.go:228] Error while adding to cni network: failed to add route to dst: 10.66.140.77/32 via SDN: file exists
    Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: E0806 14:14:05.693932 23785 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69" network for pod "egress-1": NetworkPlugin cni failed to set up pod "egress-1_bmengp1" network: failed to add route to dst: 10.66.140.77/32 via SDN: file exists
    Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: E0806 14:14:05.694035 23785 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)" failed: rpc error: code = Unknown desc = failed to set up sandbox container "5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69" network for pod "egress-1": NetworkPlugin cni failed to set up pod "egress-1_bmengp1" network: failed to add route to dst: 10.66.140.77/32 via SDN: file exists
    Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: E0806 14:14:05.694061 23785 kuberuntime_manager.go:646] createPodSandbox for pod "egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)" failed: rpc error: code = Unknown desc = failed to set up sandbox container "5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69" network for pod "egress-1": NetworkPlugin cni failed to set up pod "egress-1_bmengp1" network: failed to add route to dst: 10.66.140.77/32 via SDN: file exists
    Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: E0806 14:14:05.694105 23785 pod_workers.go:186] Error syncing pod b664fcf3-993e-11e8-836b-5254005ce8d4 ("egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)"), skipping: failed to "CreatePodSandbox" for "egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)" with CreatePodSandboxError: "CreatePodSandbox for pod \"egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)\" failed: rpc error: code = Unknown desc = failed to set up sandbox container \"5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69\" network for pod \"egress-1\": NetworkPlugin cni failed to set up pod \"egress-1_bmengp1\" network: failed to add route to dst: 10.66.140.77/32 via SDN: file exists"
    Aug 06 14:14:06 ocp311-node.bmeng.local atomic-openshift-node[23785]: W0806 14:14:06.182346 23785 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "egress-1_bmengp1": CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69"
    Aug 06 14:14:06 ocp311-node.bmeng.local atomic-openshift-node[23785]: I0806 14:14:06.192920 23785 kubelet.go:1869] SyncLoop (PLEG): "egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)", event: &pleg.PodLifecycleEvent{ID:"b664fcf3-993e-11e8-836b-5254005ce8d4", Type:"ContainerDied", Data:"5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69"}
    Aug 06 14:14:06 ocp311-node.bmeng.local atomic-openshift-node[23785]: W0806 14:14:06.193004 23785 pod_container_deletor.go:75] Container "5ac751d27f6b2376a8ff44c6c3195239af77272a348c194048a1714d06135c69" not found in pod's containers
    Aug 06 14:14:06 ocp311-node.bmeng.local atomic-openshift-node[23785]: I0806 14:14:06.493566 23785 kuberuntime_manager.go:403] No ready sandbox for pod "egress-1_bmengp1(b664fcf3-993e-11e8-836b-5254005ce8d4)" can be found. Need to start a new one

Version-Release number of selected component (if applicable):
v3.11.0-0.11.0

How reproducible:
always

Steps to Reproduce:
1. Create a project as a user and grant the privileged SCC to the user's service account.
2. Create an egress router pod with the template below.

Actual results:
The egress router pod stays in ContainerCreating status and is never created successfully.

Expected results:
The pod is created successfully.

Additional info:

    $ cat egressrouter.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: egress-1
      labels:
        name: egress-1
      annotations:
        pod.network.openshift.io/assign-macvlan: "true"
    spec:
      containers:
      - name: egress-router
        image: $registry/openshift3/ose-egress-router:v3.11
        securityContext:
          privileged: true
        env:
        - name: EGRESS_SOURCE
          value: 10.66.140.200
        - name: EGRESS_GATEWAY
          value: 10.66.141.254
        - name: EGRESS_DESTINATION
          value: 61.135.218.24
        - name: EGRESS_ROUTER_MODE
          value: legacy
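To check whether a node is hitting this failure, the route errors quoted in the description can be picked out of the node log mechanically. The following is a minimal sketch in Python, assuming the message format stays exactly as shown in this report; `ROUTE_EXISTS_RE` and `find_route_conflicts` are hypothetical names introduced for illustration, and the sample line is copied from the log above.

```python
import re

# Matches the CNI error text seen in this report, e.g.
# "failed to add route to dst: 10.66.140.77/32 via SDN: file exists".
# The exact wording is an assumption based on the log lines quoted above.
ROUTE_EXISTS_RE = re.compile(
    r"failed to add route to dst: (?P<dst>\S+) via (?P<via>\S+):\s+file exists"
)


def find_route_conflicts(lines):
    """Return the sorted, de-duplicated (dst, via) pairs that failed with "file exists"."""
    found = set()
    for line in lines:
        for match in ROUTE_EXISTS_RE.finditer(line):
            found.add((match.group("dst"), match.group("via")))
    return sorted(found)


SAMPLE = (
    "Aug 06 14:14:05 ocp311-node.bmeng.local atomic-openshift-node[23785]: "
    "E0806 14:14:05.272902 23785 cni.go:260] Error adding network: "
    "failed to add route to dst: 10.66.140.77/32 via SDN: file exists"
)

if __name__ == "__main__":
    for dst, via in find_route_conflicts([SAMPLE]):
        print(f"route to {dst} via {via} already exists")
```

Because the same error is repeated by several kubelet layers for a single failed sandbox, the function de-duplicates on the destination and "via" fields rather than counting lines.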