Bug 1549936
| Summary: | prometheus-node-exporter pods are in ImagePullBackOff status, don't have v0.15.2 image | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Junqi Zhao <juzhao> |
| Component: | Hawkular | Assignee: | Aaron Weitekamp <aweiteka> |
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.9.0 | CC: | acomabon, aos-bugs, aweiteka, hongkliu, jialiu, jupierce, juzhao, mifiedle, pep, pgier, sdodson, wkulhane, xtian |
| Target Milestone: | --- | ||
| Target Release: | 3.9.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-05-29 21:42:51 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Also we need make installer to pull v3.7 image as default image tag. Sorry

(In reply to Johnny Liu from comment #1)
> Also we need make installer to pull v3.7 image as default image tag. Sorry,

s/v3.7/v3.9/

So I'm clear, the priority issue is the "v0.15.2" (not available) vs "0.15.2" (available) tag?

Here's the proper fix for openshift-ansible: https://github.com/openshift/openshift-ansible/pull/7325

Pending internal tagging by pgier.

Remove TestBlocker keyword, issue is fixed. prometheus-node-exporter is also pushed to the reg-aws repo:

image: registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter:v0.15.2

Please change to ON_QA.

Set to VERIFIED as per Comment 14.

Set to VERIFIED, since this issue is about prometheus-node-exporter not having a v0.15.2 tag; whether prometheus-node-exporter should also have a v3.9 tag will be considered in the future, see Comment 17. Will open a defect about the v3.9 tag being missing if we decide prometheus-node-exporter should have a v3.9 tag.

prometheus-node-exporter has a v3.9 tag now.
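The core of the bug is a tag-name mismatch: the installer requests "v0.15.2" while the repository only publishes "0.15.2". A minimal illustrative sketch of that lookup (not part of openshift-ansible; the abbreviated `tags` dict mirrors the pulp tag listing quoted later in this report, and `resolve_tag` is a hypothetical helper):

```python
# Abbreviated tag -> digest map, mirroring the pulp /v1/.../tags response
# quoted in the bug description (digests shortened for readability).
tags = {
    "0.14.0": "44f72f45",
    "0.15.2": "2590ebd5",
    "latest": "2590ebd5",
}

def resolve_tag(requested, tags):
    """Return the tag actually present in the repo: try the requested
    name first, then the same name without the leading 'v' (the exact
    mismatch in this bug). Return None if neither exists."""
    if requested in tags:
        return requested
    stripped = requested.lstrip("v")
    return stripped if stripped in tags else None

print(resolve_tag("v0.15.2", tags))  # -> 0.15.2 (only the un-prefixed tag exists)
print(resolve_tag("v9.9.9", tags))   # -> None
```

The installer-side fix in the linked PR is of course the opposite direction: make the default tag match what the registry actually publishes, rather than guessing at pull time.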
"v3.9": "fe144ca99bbe32e539fe8869293500ff4dabdf7c43b498c0c05586f87c166b37",
"v3.9.10": "d923e17649b8ba65515193b64f8e0fe15b23b5a7b1ea0acede1397509ac0a094",
"v3.9.10-1": "d923e17649b8ba65515193b64f8e0fe15b23b5a7b1ea0acede1397509ac0a094",
"v3.9.10.20180315.142724": "d923e17649b8ba65515193b64f8e0fe15b23b5a7b1ea0acede1397509ac0a094",
"v3.9.11": "54799bb6d52c7852b69d20c9982b78b004c3e7cfe2f37b462dcae645b94b30c6",
"v3.9.11-1": "54799bb6d52c7852b69d20c9982b78b004c3e7cfe2f37b462dcae645b94b30c6",
"v3.9.11.20180315.181300": "54799bb6d52c7852b69d20c9982b78b004c3e7cfe2f37b462dcae645b94b30c6",
"v3.9.12": "fe144ca99bbe32e539fe8869293500ff4dabdf7c43b498c0c05586f87c166b37",
"v3.9.12-1": "fe144ca99bbe32e539fe8869293500ff4dabdf7c43b498c0c05586f87c166b37",
"v3.9.12.20180319.095352": "fe144ca99bbe32e539fe8869293500ff4dabdf7c43b498c0c05586f87c166b37",
"v3.9.8": "7ddf429234a77e029845dc5a6407df33e7980efd3ebb8512d59a73b2fa9358e4",
"v3.9.8-1": "7ddf429234a77e029845dc5a6407df33e7980efd3ebb8512d59a73b2fa9358e4",
"v3.9.8.20180313.172024": "7ddf429234a77e029845dc5a6407df33e7980efd3ebb8512d59a73b2fa9358e4",
"v3.9.9": "ca3c74c435f1b05251ccc6654e7cda0d7cc1acc53d39d18fdc62f6bd22c0108c",
"v3.9.9-1": "ca3c74c435f1b05251ccc6654e7cda0d7cc1acc53d39d18fdc62f6bd22c0108c",
"v3.9.9.20180314.185428": "ca3c74c435f1b05251ccc6654e7cda0d7cc1acc53d39d18fdc62f6bd22c0108c"
Since this is still in the 3.9(.14) GA code, the fix is to add this line to the /etc/ansible/hosts file:

openshift_prometheus_node_exporter_image_version=v3.9

The fix from comment #5 was backported to the 3.9 branch (https://github.com/openshift/openshift-ansible/pull/7673), which means you shouldn't need to work around it by explicitly setting a tag; it picks the release tag by default. The backport didn't make it in time for GA, but I assume it will be part of a future update.

The fix for this is in openshift-ansible-3.9.27-1 and later.
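The workaround above amounts to pinning the node-exporter image version in the Ansible inventory. A minimal sketch of the relevant /etc/ansible/hosts fragment (the `[OSEv3:vars]` group is the standard openshift-ansible variables section; only needed on installer versions without the backported fix, i.e. before openshift-ansible-3.9.27-1):

```ini
# /etc/ansible/hosts (fragment) -- pin the node-exporter tag explicitly
[OSEv3:vars]
openshift_prometheus_state=present
openshift_prometheus_node_exporter_image_version=v3.9
```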
Description of problem:
After deploying prometheus, the prometheus-node-exporter pods are in ImagePullBackOff status:

```
# oc get po
NAME                             READY     STATUS             RESTARTS   AGE
prometheus-0                     6/6       Running            0          1h
prometheus-node-exporter-fqpwf   0/1       ImagePullBackOff   0          1h
prometheus-node-exporter-jzfvb   0/1       ImagePullBackOff   0          1h
```

Described the prometheus-node-exporter pods; the events show:

```
Warning  Failed   1h (x4 over 1h)    kubelet, 172.16.120.101  Failed to pull image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus-node-exporter:v0.15.2": rpc error: code = Unknown desc = error parsing HTTP 404 response body: invalid character '<' looking for beginning of value: "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /pulp/docker/v2/redhat-openshift3-prometheus-node-exporter/manifests/v0.15.2 was not found on this server.</p>\n</body></html>\n"
Warning  Failed   1h (x4 over 1h)    kubelet, 172.16.120.101  Error: ErrImagePull
Warning  Failed   6m (x410 over 1h)  kubelet, 172.16.120.101  Error: ImagePullBackOff
Normal   BackOff  1m (x431 over 1h)  kubelet, 172.16.120.101  Back-off pulling image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus-node-exporter:v0.15.2"
```

The same pull fails from the reg-aws registry:

```
# docker pull registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter:v0.15.2
Trying to pull repository registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter ...
manifest for registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter:v0.15.2 not found
```

The repository doesn't have a v0.15.2 image, and it is not clear whether prometheus-node-exporter should have a v3.9 tag like the other prometheus images:

```
# curl -X GET -k brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/v1/repositories/openshift3/prometheus-node-exporter/tags | python -m json.tool
{
    "0.14.0": "44f72f459d624931baace1ccfc50ff81f69cdba4bcaad18e5f87f92507571d01",
    "0.14.0-1": "44f72f459d624931baace1ccfc50ff81f69cdba4bcaad18e5f87f92507571d01",
    "0.15.2": "2590ebd50e53e243ae57ecaca44a9f4410ba7d468622eec649bebbd953949acf",
    "0.15.2-1": "2590ebd50e53e243ae57ecaca44a9f4410ba7d468622eec649bebbd953949acf",
    "latest": "2590ebd50e53e243ae57ecaca44a9f4410ba7d468622eec649bebbd953949acf",
    "rhaos-3.7-rhel-7-docker-candidate-21542-20170906205309": "44f72f459d624931baace1ccfc50ff81f69cdba4bcaad18e5f87f92507571d01",
    "rhaos-3.9-rhel-7-docker-candidate-76868-20180131230619": "2590ebd50e53e243ae57ecaca44a9f4410ba7d468622eec649bebbd953949acf"
}
```

Version-Release number of selected component (if applicable):

```
# openshift version
openshift v3.9.1
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
```

How reproducible:
Always

Steps to Reproduce:
1. Deploy prometheus

Actual results:
prometheus-node-exporter pods are in ImagePullBackOff status

Expected results:
prometheus-node-exporter pods should be healthy

Additional info:
Inventory variables used to deploy prometheus:

```
# Deploy prometheus
openshift_prometheus_state=present
openshift_prometheus_node_selector={'role': 'node'}
openshift_prometheus_image_prefix=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/
```