Bug 1664953

Summary: router pods can not be started, no /usr/bin/openshift-router
Product: OpenShift Container Platform Reporter: Yadan Pei <yapei>
Component: Networking Assignee: Dan Mace <dmace>
Networking sub component: router QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: high CC: aos-bugs, mifiedle, scheng, xxia, yapei
Version: 4.1.0 Keywords: TestBlocker
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2019-06-04 10:41:42 UTC Type: Bug

Description Yadan Pei 2019-01-10 06:50:44 UTC
Description of problem:
Router pods are not running, and the following error is given:
/usr/bin/openshift-router: no such file or directory

Version-Release number of selected component (if applicable):
$ oc get clusterversion 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.nightly-2019-01-10-005204   True        False         2h        Cluster version is 4.0.0-0.nightly-2019-01-10-005204
$ oc get clusterversion  version -o yaml | grep -i payload
    payload: registry.svc.ci.openshift.org/ocp/release@sha256:de71c95dbce7a6e284e719a86140a7914ce2e82a2db1607810d3300d989f8d4c


How reproducible:
Always

Steps to Reproduce:
1. Check the status of the default router pod:
$ oc get pods -n openshift-ingress
NAME                              READY     STATUS                 RESTARTS   AGE
router-default-6fbbfc5dc9-f5npk   0/1       CreateContainerError   0          15m
$ oc get pods -n openshift-ingress -o yaml | grep image
      image: registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-10-005204@sha256:7b4b2a0441d4f122ea82aad1455bcbb36e936a54d2d5e8db07539078d6b047ba
      imagePullPolicy: IfNotPresent
    imagePullSecrets:
    - image: registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-10-005204@sha256:7b4b2a0441d4f122ea82aad1455bcbb36e936a54d2d5e8db07539078d6b047ba
      imageID: ""
  containerStatuses:
  - image: registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-10-005204@sha256:7b4b2a0441d4f122ea82aad1455bcbb36e936a54d2d5e8db07539078d6b047ba
    imageID: ""
    lastState: {}
    name: router
    ready: false
    restartCount: 0
    state:
      waiting:
        message: |
          container create failed: container_linux.go:336: starting container process caused "exec: \"/usr/bin/openshift-router\": stat /usr/bin/openshift-router: no such file or directory"
        reason: CreateContainerError
  hostIP: 10.0.151.81
  phase: Pending
  podIP: 10.128.2.49
  qosClass: BestEffort
  startTime: 2019-01-10T06:15:54Z


Actual results:
1. The router pod is in CreateContainerError status.

Expected results:
1. The router pod should be running.

Additional info:
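A minimal sketch for spotting this condition quickly, assuming the default `oc get pods` column layout (NAME, READY, STATUS, ...) shown in the steps above. The helper name `not_running_pods` is hypothetical and not part of `oc`; it only filters text read from stdin.

```shell
# Hypothetical helper: read `oc get pods` output on stdin and print
# "<pod name>: <status>" for every pod whose STATUS column is not "Running".
# Assumes the default column order NAME, READY, STATUS, RESTARTS, AGE.
not_running_pods() {
  # NR > 1 skips the header row; $1 is NAME and $3 is STATUS.
  awk 'NR > 1 && $3 != "Running" { print $1 ": " $3 }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get pods -n openshift-ingress | not_running_pods
```

Note that this flags any status other than Running, so pods in states such as Completed would also be listed.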

Comment 1 Mike Fiedler 2019-01-10 14:31:26 UTC
Marking this TestBlocker since it blocks access to all routes, including the console.

Comment 2 Dan Mace 2019-01-10 14:34:03 UTC
How was this cluster created?

Comment 5 Hongan Li 2019-01-11 08:57:40 UTC
Tested with the latest build; the issue has been fixed.

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE     STATUS
version   4.0.0-0.nightly-2019-01-11-000044   True        False         48m       Cluster version is 4.0.0-0.nightly-2019-01-11-000044

$ oc get clusterversion version -o yaml | grep -i payload
    payload: registry.svc.ci.openshift.org/ocp/release@sha256:4d4b4f0c64c5a27aeaeb43a5c27ba0b0a2b4c13bad7e290d712bdfd5793da7f2

$ oc get pods -n openshift-ingress
NAME                            READY     STATUS    RESTARTS   AGE
router-default-bb6dfcb9-779x2   1/1       Running   0          52m

$ oc get pods -n openshift-ingress -o yaml | grep image
      image: registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-000044@sha256:9f7426e0e6a8b20d5b4af94f498e832985de36e2f06eadd2d92d652d198446ea


$ curl http://service-unsecure-hongli.apps.qe-jialiu3.qe.devcluster.openshift.com
Hello-OpenShift-1 http-8080

Comment 9 errata-xmlrpc 2019-06-04 10:41:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758