Description of problem:
Router is broken after the installation.

Version-Release number of selected component (if applicable):
$ oc adm release info --pullspecs registry.svc.ci.openshift.org/ocp/release:4.0.0-0.nightly-2019-01-11-205323 | grep installer
  installer  registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-205323@sha256:58b5bc0f10caa359d520b7ee2cf695b60c1971d3c141abe99d33b8e024ef114f

How reproducible:
1/1

Steps to Reproduce:
1. check the pod after installation
2.
3.

Actual results:

Expected results:

Additional info:
$ oc logs -n openshift-ingress router-default-86f48b66c4-5fsrn
I0111 21:31:04.467607       1 template.go:299] Starting template router (v4.0.0-0.136.0)
error: open /var/lib/haproxy/conf/haproxy-config.template: no such file or directory

$ oc describe pod -n openshift-ingress router-default-86f48b66c4-5fsrn
Name:               router-default-86f48b66c4-5fsrn
Namespace:          openshift-ingress
Priority:           2000000000
PriorityClassName:  system-cluster-critical
Node:               ip-10-0-162-93.us-east-2.compute.internal/10.0.162.93
Start Time:         Fri, 11 Jan 2019 21:25:00 +0000
Labels:             app=router
                    pod-template-hash=4290462270
                    router=router-default
Annotations:        <none>
Status:             Running
IP:                 10.128.2.4
Controlled By:      ReplicaSet/router-default-86f48b66c4
Containers:
  router:
    Container ID:   cri-o://c7813f3f1c56b48809547c884908b55c7f589287502559046e2849e537ce1e9c
    Image:          registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-205323@sha256:6ede9cb0b73dc9df975b35822667662c21eacb725fba37dc20ab5f2327b12818
    Image ID:       registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-205323@sha256:6ede9cb0b73dc9df975b35822667662c21eacb725fba37dc20ab5f2327b12818
    Ports:          80/TCP, 443/TCP, 1936/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 11 Jan 2019 21:31:04 +0000
      Finished:     Fri, 11 Jan 2019 21:31:04 +0000
    Ready:          False
    Restart Count:  6
    Liveness:       http-get http://:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      STATS_PORT:                 1936
      ROUTER_SERVICE_NAMESPACE:   openshift-ingress
      DEFAULT_CERTIFICATE_DIR:    /etc/pki/tls/private
      ROUTER_SERVICE_NAME:        default
      ROUTER_CANONICAL_HOSTNAME:  apps.hongkliu.qe.devcluster.openshift.com
    Mounts:
      /etc/pki/tls/private from default-certificate (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from router-token-ktp67 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-certificate:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router-certs-default
    Optional:    false
  router-token-ktp67:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router-token-ktp67
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  node-role.kubernetes.io/worker=
Tolerations:     <none>
Events:
  Type     Reason       Age               From                                                 Message
  ----     ------       ----              ----                                                 -------
  Normal   Scheduled    7m                default-scheduler                                    Successfully assigned openshift-ingress/router-default-86f48b66c4-5fsrn to ip-10-0-162-93.us-east-2.compute.internal
  Warning  FailedMount  7m                kubelet, ip-10-0-162-93.us-east-2.compute.internal   MountVolume.SetUp failed for volume "default-certificate" : secrets "router-certs-default" not found
  Normal   Pulling      7m                kubelet, ip-10-0-162-93.us-east-2.compute.internal   pulling image "registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-205323@sha256:6ede9cb0b73dc9df975b35822667662c21eacb725fba37dc20ab5f2327b12818"
  Normal   Pulled       7m                kubelet, ip-10-0-162-93.us-east-2.compute.internal   Successfully pulled image "registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-205323@sha256:6ede9cb0b73dc9df975b35822667662c21eacb725fba37dc20ab5f2327b12818"
  Normal   Created      6m (x4 over 7m)   kubelet, ip-10-0-162-93.us-east-2.compute.internal   Created container
  Normal   Started      6m (x4 over 7m)   kubelet, ip-10-0-162-93.us-east-2.compute.internal   Started container
  Normal   Pulled       5m (x4 over 7m)   kubelet, ip-10-0-162-93.us-east-2.compute.internal   Container image "registry.svc.ci.openshift.org/ocp/4.0-art-latest-2019-01-11-205323@sha256:6ede9cb0b73dc9df975b35822667662c21eacb725fba37dc20ab5f2327b12818" already present on machine
  Warning  BackOff      2m (x27 over 7m)  kubelet, ip-10-0-162-93.us-east-2.compute.internal   Back-off restarting failed container

This report is filed against a broken build:

https://openshift-release.svc.ci.openshift.org
4.0.0-0.nightly-2019-01-11-205323  Rejected (VerificationFailed)  1 hour ago  e2e-aws e2e-aws-serial

I'll leave it open for now. Clayton and ART are working on fixing the build. But it doesn't seem like we should be testing broken builds.

Tested with 4.0.0-0.nightly-2019-01-12-000105 and the issue has been fixed.

# oc get clusterversions.config.openshift.io
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-01-12-000105   True        False         4h      Cluster version is 4.0.0-0.nightly-2019-01-12-000105

# oc get pod -n openshift-ingress
NAME                             READY   STATUS    RESTARTS   AGE
router-default-77994b7b7-2n8g8   1/1     Running   0          2h

# oc -n openshift-ingress logs router-default-77994b7b7-2n8g8
I0114 06:21:24.693186       1 template.go:299] Starting template router (v4.0.0-0.136.0)
I0114 06:21:24.735370       1 router.go:482] Router reloaded:
 - Checking http://localhost:80 ...
 - Health check ok : 0 retry attempt(s).
I0114 06:21:24.735407       1 router.go:255] Router is including routes in all namespaces

Thanks, Hongli. It works for me too.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-01-12-000105   True        False         1m      Cluster version is 4.0.0-0.nightly-2019-01-12-000105

[fedora@ip-172-31-32-37 20190114]$ oc get pod -n openshift-ingress
NAME                              READY   STATUS    RESTARTS   AGE
router-default-6f5b8695d7-g2k54   1/1     Running   0          5m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758