Bug 1488954 - oc adm router --expose-metrics fails by default
Summary: oc adm router --expose-metrics fails by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.5.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Ben Bennett
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks: 1495540
Reported: 2017-09-06 13:50 UTC by Steven Walter
Modified: 2022-08-04 22:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The dependent image changed the argument string slightly, without maintaining backwards compatibility.
Consequence: The generated default configuration breaks with the newer image version.
Fix: The argument string was changed to match the one the image expects.
Result: The default configuration now works.
Clone Of:
: 1495540 (view as bug list)
Environment:
Last Closed: 2017-11-28 22:09:56 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Origin (Github) | 16284 | 0 | None | None | None | 2017-09-11 15:39:31 UTC
Red Hat Product Errata | RHSA-2017:3188 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update | 2017-11-29 02:34:54 UTC

Description Steven Walter 2017-09-06 13:50:05 UTC
Description of problem:
Deploying the router with the default command fails:

oc adm router --expose-metrics

The generated dc is not compatible with the new image. This is described here:
https://github.com/openshift/origin/issues/15982


Version-Release number of selected component (if applicable):
image: 'registry.access.redhat.com/openshift3/ose-haproxy-router:v3.5.5.8'
image: 'prom/haproxy-exporter:latest'



How reproducible:
Reproduced easily

Steps to Reproduce:
# oc adm router --expose-metrics

Actual results:
16s        2m          13        router-1-deploy   Pod                   Warning   FailedSync   {kubelet node-0.sisyphus.example.com}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"invalid header field value \\\"oci runtime error: container_linux.go:247: starting container process caused \\\\\\\"exec: \\\\\\\\\\\\\\\"/pod\\\\\\\\\\\\\\\": stat /pod: no such file or directory\\\\\\\"\\\\n\\\"\"}"



Expected results:
Router starts up

Additional info:

One environment seemed to work around the issue by manually adding a '-' character to the exporter argument, changing -haproxy.scrape-uri to --haproxy.scrape-uri:

        - name: metrics-exporter
          image: 'prom/haproxy-exporter:latest'
          args:
            - >-
              --haproxy.scrape-uri=http://$(STATS_USERNAME):$(STATS_PASSWORD)@localhost:$(STATS_PORT)/haproxy?stats;csv

Comment 1 zhaozhanqi 2017-09-07 03:09:54 UTC
This bug happens because haproxy_exporter changed the argument from '-haproxy.scrape-uri' to '--haproxy.scrape-uri', see:
https://github.com/prometheus/haproxy_exporter/commit/cb544ab842c398242fc5e22053da49dda3058aa0#diff-04c6e90faac2675aa89e2176d2eec7d8
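A minimal sketch of how a deployment script could pick the flag spelling from the exporter image tag. The function name scrape_flag is hypothetical. The version boundary is an assumption based on the two tags named in origin issue 15982: tags older than v0.8.0 take the single dash. Anything that is not a vMAJOR.MINOR.PATCH tag, including 'latest', is treated as the new style.

```shell
#!/bin/sh
# Hypothetical helper: print the scrape-uri flag spelling for a given
# prom/haproxy-exporter image tag.
# Assumption: tags older than v0.8.0 expect the single-dash form.
scrape_flag() {
    tag=${1#v}          # strip a leading "v": v0.7.1 -> 0.7.1
    major=${tag%%.*}    # text before the first dot
    rest=${tag#*.}      # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    case "$major.$minor" in
        0.[0-7]) printf '%s\n' "-haproxy.scrape-uri" ;;
        *)       printf '%s\n' "--haproxy.scrape-uri" ;;
    esac
}

scrape_flag v0.7.1   # prints -haproxy.scrape-uri
scrape_flag v0.8.0   # prints --haproxy.scrape-uri
scrape_flag latest   # prints --haproxy.scrape-uri
```

Defaulting unknown tags to the double dash matches what 'latest' resolved to at the time of this report. A script that pins an older tag gets the old spelling.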

Comment 2 openshift-github-bot 2017-09-13 22:05:45 UTC
Commits pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/8573e88f77207f38abd59b70ab0c4fff9a61d632
Handle the changed --haproxy.scrape-uri argument (- to --)

The argument changed from -haproxy.scrape to --haproxy.scrape.  This
makes 'oc adm --expose-metrics' handle the change.

Fixes bug 1488954 (https://bugzilla.redhat.com/show_bug.cgi?id=1488954)

https://github.com/openshift/origin/commit/06c50ed9b3f5cce85ea5d3b3f5f4f474cdae6b17
Merge pull request #16284 from knobunc/bug/bz1488954-fix-scrape-arg

Automatic merge from submit-queue (batch tested with PRs 16150, 16284, 16296, 16071)

Handle the changed --haproxy.scrape-uri argument (- to --)

The argument changed from -haproxy.scrape to --haproxy.scrape.  This
makes 'oc adm --expose-metrics' handle the change.

Fixes bug 1488954 (https://bugzilla.redhat.com/show_bug.cgi?id=1488954)

Comment 3 zhaozhanqi 2017-09-15 01:39:32 UTC
Hi Ben Bennett,

This is fixed only for 3.7. Do we need to backport it to previous versions?

Comment 4 Ben Bennett 2017-09-15 13:36:27 UTC
@zhaozhanqi: Good question.  I'm not sure... because the image we depend on changed, it's hard to say what images people will have.

Anyway, I don't think this feature is widely used, and now that we have added better stats handling to the router itself, it should not be used on haproxy-backed routers anyway.

Comment 7 zhaozhanqi 2017-09-26 10:01:32 UTC
Verified this bug on v3.7.0-0.127.0.
Deploying the router with oc adm router --expose-metrics works well.

Comment 8 Phil Cameron 2017-10-04 17:43:40 UTC
The problem is:
image: 'prom/haproxy-exporter:latest'
The new version of 'prom/haproxy-exporter', which will run on all existing routers, requires --haproxy.scrape-uri. Previously created routers break when using the new version.  So when they made the change, they broke all existing routers that use the sidecar.

The sidecar is deprecated in favor of the integrated Prometheus scraper and will be deleted in a future release. Until then it is safest to specify a specific version in the image: parameter instead of 'latest'.

From: https://github.com/openshift/origin/issues/15982

prom/haproxy-exporter:v0.7.1 is for the -haproxy.scrape-uri

and

prom/haproxy-exporter:v0.8.0 is for the --haproxy.scrape-uri

For existing routers that have this problem, it's best to change 'image:' to
image: prom/haproxy-exporter:v0.7.1

You can edit the dc as needed.
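
As a rough illustration of that edit, the sketch below rewrites the image line in a dc that has been exported to a file, for example with oc get dc router -o yaml. The function name pin_exporter and the file names are hypothetical. The sed pattern assumes the image reference looks like the one quoted in this report. Review the result before loading it back.

```shell
#!/bin/sh
# Hypothetical helper: replace the floating 'latest' tag of the exporter
# sidecar with a fixed tag in an exported dc file. Reads the file named
# by $1 and writes the result to stdout; the input file is not modified.
pin_exporter() {
    file=$1
    tag=${2:-v0.7.1}   # v0.7.1 is the tag that still takes -haproxy.scrape-uri
    sed "s|\(prom/haproxy-exporter\):latest|\1:${tag}|" "$file"
}

# Example (file names are assumptions):
#   oc get dc router -o yaml > router-dc.yaml
#   pin_exporter router-dc.yaml > router-dc-pinned.yaml
```

Only the image tag is changed. The args entry is left as it is, so a dc pinned to v0.7.1 must still carry the single-dash -haproxy.scrape-uri argument.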

Comment 12 errata-xmlrpc 2017-11-28 22:09:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

