Bug 1638658

Summary: [3.9] endpoint for alertmanager and alert-buffer gave HTTP response to HTTPS client
Product: OpenShift Container Platform Reporter: Junqi Zhao <juzhao>
Component: Monitoring Assignee: Paul Gier <pgier>
Status: CLOSED ERRATA QA Contact: Junqi Zhao <juzhao>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.9.0 CC: minden
Target Milestone: ---   
Target Release: 3.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1639082 Environment:
Last Closed: 2018-12-13 19:27:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1639082    
Attachments:
Description Flags
endpoints for alertmanager and alert-buffer are down none

Description Junqi Zhao 2018-10-12 07:59:03 UTC
Created attachment 1493168
endpoints for alertmanager and alert-buffer are down

Description of problem:
Deployed Prometheus v3.9.45-1.

# oc -n openshift-metrics get pod -o wide
NAME                             READY     STATUS    RESTARTS   AGE       IP               NODE
prometheus-0                     6/6       Running   0          3h        10.2.2.4         share3-wmengr76o39-master-etcd-2
prometheus-node-exporter-25v67   1/1       Running   0          3h        192.168.100.14   share3-wmengr76o39-nrri-1
prometheus-node-exporter-9v6gs   1/1       Running   0          3h        192.168.100.12   share3-wmengr76o39-master-etcd-3
prometheus-node-exporter-bkn67   1/1       Running   0          3h        192.168.100.20   share3-wmengr76o39-node-primary-3
prometheus-node-exporter-d9wfc   1/1       Running   0          3h        192.168.100.8    share3-wmengr76o39-node-primary-1
prometheus-node-exporter-fnngw   1/1       Running   0          3h        192.168.100.9    share3-wmengr76o39-nrri-2
prometheus-node-exporter-g7km9   1/1       Running   0          3h        192.168.100.4    share3-wmengr76o39-master-etcd-1
prometheus-node-exporter-jlf2v   1/1       Running   0          3h        192.168.100.16   share3-wmengr76o39-node-primary-2
prometheus-node-exporter-k986p   1/1       Running   0          3h        192.168.100.7    share3-wmengr76o39-master-etcd-2


Checked the /targets page: the endpoints for alertmanager and alert-buffer are down.
Both targets fail with the error "gave HTTP response to HTTPS client".

# oc -n openshift-metrics rsh prometheus-0
sh-4.2$ curl -k https://10.2.2.4:9093/metrics 
curl: (35) SSL received a record that exceeded the maximum permissible length.

Testing with http instead returns metrics output:
sh-4.2$ curl -k http://10.2.2.4:9093/metrics
# HELP alertmanager_alerts How many alerts by state.
# TYPE alertmanager_alerts gauge
alertmanager_alerts{state="active"} 0
alertmanager_alerts{state="suppressed"} 0
# HELP alertmanager_alerts_invalid_total The total number of received alerts that were invalid.
# TYPE alertmanager_alerts_invalid_total counter
alertmanager_alerts_invalid_total 0
# HELP alertmanager_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which alertmanager was built.
# TYPE alertmanager_build_info gauge
................................................................................


Version-Release number of selected component (if applicable):
prometheus v3.9.45-1

How reproducible:
Always

Steps to Reproduce:
1. Deploy prometheus v3.9.45-1 and check /targets page

Actual results:
The endpoints for alertmanager and alert-buffer gave HTTP response to HTTPS client.

Expected results:
The endpoints should be in the UP state.

Additional info:

Comment 1 Junqi Zhao 2018-10-12 08:00:59 UTC
This issue only happens with Prometheus 3.9; versions above 3.10 do not scrape alertmanager and alert-buffer.

Comment 2 Paul Gier 2018-10-16 20:05:47 UTC
This is due to prometheus automatically discovering the container ports listed in the stateful set config.

https://github.com/openshift/openshift-ansible/pull/10424
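
For illustration only, here is a minimal sketch of how auto-discovered container ports can be kept out of a scrape job with a relabel rule. It is not taken from the linked pull request and may differ from the actual fix. The job name `example-pods` and the container names `alertmanager` and `alert-buffer` are assumptions; `__meta_kubernetes_pod_container_name` is a standard label from Prometheus Kubernetes service discovery.

```yaml
# Hypothetical scrape job, not the shipped OpenShift configuration.
scrape_configs:
  - job_name: example-pods            # assumed name, for illustration
    scheme: https                     # every discovered target is scraped over HTTPS
    tls_config:
      insecure_skip_verify: true
    kubernetes_sd_configs:
      - role: pod                     # one target per declared container port
    relabel_configs:
      # Drop targets that belong to containers serving plain HTTP.
      # Container names are assumptions; adjust to the real pod spec.
      - source_labels: [__meta_kubernetes_pod_container_name]
        regex: alertmanager|alert-buffer
        action: drop
```

Removing the containerPort entries from the stateful set would have a similar effect, since discovery only generates targets for declared ports.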

Comment 3 Junqi Zhao 2018-10-31 05:19:46 UTC
The endpoints for alertmanager and alert-buffer have been removed.

openshift-ansible: openshift-ansible-3.9.49-1

Comment 6 errata-xmlrpc 2018-12-13 19:27:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3748