Bug 1659183
| Summary: | Director deployed OCP 3.11: prometheus-k8s-0 pod fails to start due to nonexistent image - Failed to pull image "192.168.24.1:8787/openshift3/prometheus:v3.11.51-2": rpc error: code = Unknown desc = Error: image openshift3/prometheus:v3.11.51-2 not found | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
| Component: | openstack-tripleo-common | Assignee: | Martin André <m.andre> |
| Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 14.0 (Rocky) | CC: | dbecker, lmarsh, ltomasbo, m.andre, mburns, morazi, pgrist, psahoo, slinaber |
| Target Milestone: | z2 | Keywords: | Triaged, ZStream |
| Target Release: | 14.0 (Rocky) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-tripleo-common-9.4.1-0.20190119050434.261de49.el7ost | Doc Type: | Known Issue |
| Doc Text: | (see below, after this table) | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-04-30 17:51:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Doc Text (Known Issue):

Director and openshift-ansible have different expectations regarding image tags. When importing remote container images locally, director converts the generic tag into one that uniquely identifies the image, based on the `version` and `release` labels from the image metadata. Openshift-ansible, however, relies on a single `openshift_image_tag` variable for all OpenShift image tags, which makes it impossible to specify tags for individual images. As a result, deployment of OCP via director fails when the floating v3.11 tag in the remote container image registry points to images with inconsistent `release` or `version` labels in their metadata.

Workaround: from the undercloud, import the mismatched images prior to deploying OpenShift, setting the tag so that it is consistent across all OpenShift images:

    skopeo --tls-verify=false copy docker://registry.access.redhat.com/openshift3/prometheus:v3.11.51-1 docker://192.168.24.1:8787/openshift3/prometheus:v3.11.51-2

Result: deployment of OpenShift from director completes without the missing image.

---

On the undercloud we have the v3.11.51-1 tag for the prometheus image:

    (undercloud) [stack@undercloud-0 ~]$ docker images | grep prometheus
    192.168.24.1:8787/openshift3/ose-prometheus-operator          v3.11.51-2   d24ce5e6f296   9 days ago   582 MB
    192.168.24.1:8787/openshift3/ose-prometheus-config-reloader   v3.11.51-2   e8c3cd83fd6e   9 days ago   510 MB
    192.168.24.1:8787/openshift3/prometheus                       v3.11.51-1   c99a294ec062   9 days ago   283 MB
    192.168.24.1:8787/openshift3/prometheus-node-exporter         v3.11.51-2   52f02f543117   9 days ago   225 MB
    192.168.24.1:8787/openshift3/prometheus-alertmanager          v3.11.51-2   cf43629ba0d3   9 days ago   236 MB

---

There is currently no way in openshift-ansible to use a different tag for the prometheus image; all the images must have the same tag, corresponding to the value of `openshift_image_tag`. We either need to bump the tag for the prometheus image in the registry [1] from v3.11.51-1 to v3.11.51-2 to match the other OpenShift images, or, as a workaround, stop setting "tag_from_label" in ContainerImagePrepare. However, I would be cautious with that workaround, as it is going to interfere with updates for the haproxy and keepalived services managed by tripleo. A better workaround is to re-tag prometheus in the local registry:

    $ docker tag 192.168.24.1:8787/openshift3/prometheus:v3.11.51-1 192.168.24.1:8787/openshift3/prometheus:v3.11.51-2
    $ docker push 192.168.24.1:8787/openshift3/prometheus:v3.11.51-2

[1] https://access.redhat.com/containers/?tab=tags#/registry.access.redhat.com/openshift3/prometheus

---

Adding a workaround that worked for me before triggering the overcloud deploy:

    docker pull registry.access.redhat.com/openshift3/prometheus:v3.11.51-1
    skopeo --tls-verify=false copy docker://registry.access.redhat.com/openshift3/prometheus:v3.11.51-1 docker://192.168.24.1:8787/openshift3/prometheus:v3.11.51-2

---

(In reply to Marius Cornea from comment #3) Small correction: only the skopeo command is needed as the workaround; the `docker pull` is not required:

    skopeo --tls-verify=false copy docker://registry.access.redhat.com/openshift3/prometheus:v3.11.51-1 docker://192.168.24.1:8787/openshift3/prometheus:v3.11.51-2

---

*** Bug 1680523 has been marked as a duplicate of this bug. ***

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0878
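For anyone hitting a similar failure on other image releases, a pre-deployment consistency check along these lines (a sketch, assuming `skopeo` and `jq` are available on the undercloud; the image list is illustrative) would surface such label mismatches before openshift-ansible tries to pull:

    # Print the '{version}-{release}' tag that director's tag_from_label
    # setting would derive for each image behind the floating v3.11 tag;
    # an entry that differs from the rest is the mismatch described above.
    for img in prometheus prometheus-alertmanager prometheus-node-exporter \
               ose-prometheus-operator ose-prometheus-config-reloader; do
      echo -n "openshift3/$img: "
      skopeo inspect docker://registry.access.redhat.com/openshift3/$img:v3.11 \
        | jq -r '.Labels.version + "-" + .Labels.release'
    done

Any image whose printed tag does not match the tag used for the rest would need to be re-imported with a consistent tag, as in the skopeo copy workaround above.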
Description of problem:

Director deployed OCP 3.11: the prometheus-k8s-0 pod fails to start due to a nonexistent image:

    Failed to pull image "192.168.24.1:8787/openshift3/prometheus:v3.11.51-2": rpc error: code = Unknown desc = Error: image openshift3/prometheus:v3.11.51-2 not found

Version-Release number of selected component (if applicable):

    openstack-tripleo-common-9.4.1-0.20181012010884.el7ost.noarch
    openstack-tripleo-heat-templates-9.0.1-0.20181013060904.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy the overcloud OCP with tag_from_label: '{version}-{release}' in containers-prepare-parameter.yaml (see the sketch below).
2. Check the pod status on one of the master nodes.

Actual results:

    [root@openshift-master-0 heat-admin]# oc get pods --all-namespaces | grep -v Running
    NAMESPACE              NAME               READY   STATUS             RESTARTS   AGE
    openshift-monitoring   prometheus-k8s-0   3/4     ImagePullBackOff   0          1h

Expected results:
All pods are running.
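A minimal sketch of the relevant fragment of containers-prepare-parameter.yaml for step 1 above; the namespace and push_destination values are illustrative assumptions, not the exact environment file from this deployment:

    # Illustrative fragment of containers-prepare-parameter.yaml (field values
    # are assumptions for this sketch, not the file used in this report).
    parameter_defaults:
      ContainerImagePrepare:
      - push_destination: true    # mirror images into the undercloud registry (192.168.24.1:8787)
        set:
          namespace: registry.access.redhat.com/openshift3
        # Replace the floating tag (e.g. v3.11) with one derived from each
        # image's own metadata labels, e.g. v3.11.51-2. The bug occurs when
        # one image's version/release labels differ from the rest.
        tag_from_label: '{version}-{release}'

With push_destination enabled, the prepare step mirrors each image into the undercloud registry under the tag derived from its labels, which is why a lone v3.11.51-1 prometheus image leaves a hole at v3.11.51-2.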
Additional info:

    [root@openshift-master-0 heat-admin]# oc describe pods prometheus-k8s-0 --namespace openshift-monitoring
    Name:               prometheus-k8s-0
    Namespace:          openshift-monitoring
    Priority:           0
    PriorityClassName:  <none>
    Node:               openshift-infra-1/172.17.1.15
    Start Time:         Thu, 13 Dec 2018 12:10:35 -0500
    Labels:             app=prometheus
                        controller-revision-hash=prometheus-k8s-85dbf9b49
                        prometheus=k8s
                        statefulset.kubernetes.io/pod-name=prometheus-k8s-0
    Annotations:        openshift.io/scc=restricted
    Status:             Pending
    IP:                 10.128.2.3
    Controlled By:      StatefulSet/prometheus-k8s
    Containers:
      prometheus:
        Container ID:
        Image:          192.168.24.1:8787/openshift3/prometheus:v3.11.51-2
        Image ID:
        Port:           <none>
        Host Port:      <none>
        Args:
          --web.console.templates=/etc/prometheus/consoles
          --web.console.libraries=/etc/prometheus/console_libraries
          --config.file=/etc/prometheus/config_out/prometheus.env.yaml
          --storage.tsdb.path=/prometheus
          --storage.tsdb.retention=15d
          --web.enable-lifecycle
          --storage.tsdb.no-lockfile
          --web.external-url=https://prometheus-k8s-openshift-monitoring.apps.openshift.localdomain/
          --web.route-prefix=/
          --web.listen-address=127.0.0.1:9090
        State:          Waiting
          Reason:       ImagePullBackOff
        Ready:          False
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /etc/prometheus/config_out from config-out (ro)
          /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
          /etc/prometheus/secrets/prometheus-k8s-htpasswd from secret-prometheus-k8s-htpasswd (ro)
          /etc/prometheus/secrets/prometheus-k8s-proxy from secret-prometheus-k8s-proxy (ro)
          /etc/prometheus/secrets/prometheus-k8s-tls from secret-prometheus-k8s-tls (ro)
          /prometheus from prometheus-k8s-db (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-dxh9n (ro)
      prometheus-config-reloader:
        Container ID:   docker://9319f9bf097126539c2824504c72c028990e209f60d1322c487488dfa262ad57
        Image:          192.168.24.1:8787/openshift3/ose-prometheus-config-reloader:v3.11.51-2
        Image ID:       docker-pullable://192.168.24.1:8787/openshift3/ose-prometheus-config-reloader@sha256:f84f7ba5ad7e0a580937a1bec773011b2f15dd5508bfc38eda52732ffadb61a1
        Port:           <none>
        Host Port:      <none>
        Command:
          /bin/prometheus-config-reloader
        Args:
          --log-format=logfmt
          --reload-url=http://localhost:9090/-/reload
          --config-file=/etc/prometheus/config/prometheus.yaml
          --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
        State:          Running
          Started:      Thu, 13 Dec 2018 12:10:47 -0500
        Ready:          True
        Restart Count:  0
        Limits:
          cpu:     10m
          memory:  50Mi
        Requests:
          cpu:     10m
          memory:  50Mi
        Environment:
          POD_NAME:  prometheus-k8s-0 (v1:metadata.name)
        Mounts:
          /etc/prometheus/config from config (rw)
          /etc/prometheus/config_out from config-out (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-dxh9n (ro)
      prometheus-proxy:
        Container ID:   docker://c54446ae210d5cee517c8f1e817b259c0a7c3b44595b3176c660bcefbb8602d7
        Image:          192.168.24.1:8787/openshift3/oauth-proxy:v3.11.51-2
        Image ID:       docker-pullable://192.168.24.1:8787/openshift3/oauth-proxy@sha256:c7da086516ddb13e986af396882f2ce771ab5892eefd16c514ebd0785b0f0370
        Port:           9091/TCP
        Host Port:      0/TCP
        Args:
          -provider=openshift
          -https-address=:9091
          -http-address=
          -email-domain=*
          -upstream=http://localhost:9090
          -htpasswd-file=/etc/proxy/htpasswd/auth
          -openshift-service-account=prometheus-k8s
          -openshift-sar={"resource": "namespaces", "verb": "get"}
          -openshift-delegate-urls={"/": {"resource": "namespaces", "verb": "get"}}
          -tls-cert=/etc/tls/private/tls.crt
          -tls-key=/etc/tls/private/tls.key
          -client-secret-file=/var/run/secrets/kubernetes.io/serviceaccount/token
          -cookie-secret-file=/etc/proxy/secrets/session_secret
          -openshift-ca=/etc/pki/tls/cert.pem
          -openshift-ca=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          -skip-auth-regex=^/metrics
        State:          Running
          Started:      Thu, 13 Dec 2018 12:10:48 -0500
        Ready:          True
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /etc/proxy/htpasswd from secret-prometheus-k8s-htpasswd (rw)
          /etc/proxy/secrets from secret-prometheus-k8s-proxy (rw)
          /etc/tls/private from secret-prometheus-k8s-tls (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-dxh9n (ro)
      rules-configmap-reloader:
        Container ID:   docker://02b2f7dde9557b4db5af6e61d645ba148315048153aabd0767b2a90100697e6f
        Image:          192.168.24.1:8787/openshift3/ose-configmap-reloader:v3.11.51-2
        Image ID:       docker-pullable://192.168.24.1:8787/openshift3/ose-configmap-reloader@sha256:3e2f688074eae0671f71cfb1561307b18c1d638d4657a431872022cc045b0c9d
        Port:           <none>
        Host Port:      <none>
        Args:
          --webhook-url=http://localhost:9090/-/reload
          --volume-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
        State:          Running
          Started:      Thu, 13 Dec 2018 12:10:54 -0500
        Ready:          True
        Restart Count:  0
        Limits:
          cpu:     5m
          memory:  10Mi
        Requests:
          cpu:     5m
          memory:  10Mi
        Environment:    <none>
        Mounts:
          /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-dxh9n (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      config:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s
        Optional:    false
      config-out:
        Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
      prometheus-k8s-rulefiles-0:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      prometheus-k8s-rulefiles-0
        Optional:  false
      secret-prometheus-k8s-tls:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s-tls
        Optional:    false
      secret-prometheus-k8s-proxy:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s-proxy
        Optional:    false
      secret-prometheus-k8s-htpasswd:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s-htpasswd
        Optional:    false
      prometheus-k8s-db:
        Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
      prometheus-k8s-token-dxh9n:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s-token-dxh9n
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  node-role.kubernetes.io/infra=true
    Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
    Events:
      Type     Reason     Age                From                         Message
      ----     ------     ----               ----                         -------
      Normal   Scheduled  1h                 default-scheduler            Successfully assigned openshift-monitoring/prometheus-k8s-0 to openshift-infra-1
      Normal   Pulling    1h                 kubelet, openshift-infra-1   pulling image "192.168.24.1:8787/openshift3/ose-prometheus-config-reloader:v3.11.51-2"
      Normal   Pulled     1h                 kubelet, openshift-infra-1   Successfully pulled image "192.168.24.1:8787/openshift3/ose-prometheus-config-reloader:v3.11.51-2"
      Normal   Created    1h                 kubelet, openshift-infra-1   Created container
      Normal   Started    1h                 kubelet, openshift-infra-1   Started container
      Normal   Pulling    1h                 kubelet, openshift-infra-1   pulling image "192.168.24.1:8787/openshift3/oauth-proxy:v3.11.51-2"
      Normal   Created    1h                 kubelet, openshift-infra-1   Created container
      Normal   Pulling    1h                 kubelet, openshift-infra-1   pulling image "192.168.24.1:8787/openshift3/ose-configmap-reloader:v3.11.51-2"
      Normal   Started    1h                 kubelet, openshift-infra-1   Started container
      Normal   Pulled     1h                 kubelet, openshift-infra-1   Successfully pulled image "192.168.24.1:8787/openshift3/oauth-proxy:v3.11.51-2"
      Normal   Pulled     1h                 kubelet, openshift-infra-1   Successfully pulled image "192.168.24.1:8787/openshift3/ose-configmap-reloader:v3.11.51-2"
      Normal   Created    1h                 kubelet, openshift-infra-1   Created container
      Normal   Started    1h                 kubelet, openshift-infra-1   Started container
      Normal   BackOff    1h (x2 over 1h)    kubelet, openshift-infra-1   Back-off pulling image "192.168.24.1:8787/openshift3/prometheus:v3.11.51-2"
      Warning  Failed     1h (x3 over 1h)    kubelet, openshift-infra-1   Failed to pull image "192.168.24.1:8787/openshift3/prometheus:v3.11.51-2": rpc error: code = Unknown desc = Error: image openshift3/prometheus:v3.11.51-2 not found
      Normal   Pulling    1h (x3 over 1h)    kubelet, openshift-infra-1   pulling image "192.168.24.1:8787/openshift3/prometheus:v3.11.51-2"
      Warning  Failed     1h (x3 over 1h)    kubelet, openshift-infra-1   Error: ErrImagePull
      Warning  Failed     4s (x433 over 1h)  kubelet, openshift-infra-1   Error: ImagePullBackOff
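After applying the skopeo copy workaround, a quick sanity check (a sketch; `jq` on the undercloud is an assumption) confirms that the tag the pod references now resolves in the local registry:

    # A successful inspect here (instead of a not-found error) means the
    # image reference in the prometheus-k8s-0 pod spec can now be pulled.
    skopeo inspect --tls-verify=false \
      docker://192.168.24.1:8787/openshift3/prometheus:v3.11.51-2 \
      | jq '{Digest, version: .Labels.version, release: .Labels.release}'

Once the image resolves, the kubelet's next pull retry (or deleting the prometheus-k8s-0 pod to force one) should bring the pod to Running.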