1749714 – [IPI] [OSP] sg-worker lack 1936/tcp(router) 9537/tcp(crio metrics) 9101/tcp(sdn metrics) for prometheus pods

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Installer
Sub Component:
Version:	4.2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.2.0
Assignee:	Martin André
QA Contact:	David Sanz
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-09-06 10:04 UTC by weiwei jiang
Modified:	2019-10-16 06:40 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-10-16 06:40:35 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift installer pull 2347	0	None	closed	Bug 1749714: OpenStack: Open port 1936 on compute and tighten security groups	2021-02-16 07:22:27 UTC
Red Hat Product Errata	RHBA-2019:2922	0	None	None	None	2019-10-16 06:40:43 UTC

Description weiwei jiang 2019-09-06 10:04:05 UTC

Description of problem:
After launching an IPI cluster on OSP, the Prometheus targets for the following endpoints are DOWN:
1936/tcp (router), 9537/tcp (crio metrics), 9101/tcp (sdn metrics)

Version-Release number of the following components:
4.2.0-0.nightly-2019-09-05-234433

How reproducible:
Always

Steps to Reproduce:
1. Launch an IPI cluster on OSP.
2. Check the Prometheus targets dashboard.

Actual results:
The targets on 1936/tcp (router), 9537/tcp (crio metrics), and 9101/tcp (sdn metrics) are down.
Expected results:

Additional info:

Comment 3 Martin André 2019-09-11 18:18:28 UTC

We recently aligned the OpenStack security group rules with the AWS ones.
The 9537/tcp (crio metrics) and 9101/tcp (sdn metrics) rules should have been added as part of https://github.com/openshift/installer/pull/2304, which merged a couple of days ago.

For 1936/tcp (router), there is no such port open in AWS or GCP security groups. Does it need to be open?

Comment 4 Martin André 2019-09-12 08:54:25 UTC

I've opened port 1936 for the compute nodes in https://github.com/openshift/installer/pull/2347 and also tightened the security group rules to better match the AWS ones.
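
A quick way to see whether the three ports from this report are reachable on a node is a plain TCP connect test. The following is a minimal sketch, not part of the installer or of either pull request. The function names are hypothetical, and the check only means something when run from a machine that sits where the Prometheus pods do (for example another cluster node). A False result cannot tell a blocked port apart from a port with nothing listening on it.

```python
import socket

# Ports named in this report; the labels come from the bug title.
METRICS_PORTS = {1936: "router", 9537: "crio metrics", 9101: "sdn metrics"}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host: str, ports=METRICS_PORTS) -> dict:
    """Map each port to True (reachable) or False (blocked or not listening)."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_node with a node address such as 192.168.0.17 (one of the worker addresses in the verification output of this report) returns a dictionary keyed by port number.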

Comment 6 weiwei jiang 2019-09-16 05:54:58 UTC

Verified on 4.2.0-0.nightly-2019-09-15-052022

➜  ~ oc -n openshift-monitoring exec prometheus-k8s-0 -c prometheus -- curl -s http://localhost:9090/api/v1/query\?query\=up%7Bjob%3D%22crio%22%7D%20or%20up%7Bjob%3D%22sdn%22%7D%20or%20up%7Bjob%3D%22router-internal-default%22%7D | json_reformat 
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "crio",
                    "instance": "192.168.0.15:9537",
                    "job": "crio",
                    "namespace": "kube-system",
                    "node": "share-0916c-8vp8z-master-2",
                    "service": "kubelet"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "crio",
                    "instance": "192.168.0.17:9537",
                    "job": "crio",
                    "namespace": "kube-system",
                    "node": "share-0916c-8vp8z-worker-sc4nc",
                    "service": "kubelet"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "crio",
                    "instance": "192.168.0.18:9537",
                    "job": "crio",
                    "namespace": "kube-system",
                    "node": "share-0916c-8vp8z-worker-ntvnv",
                    "service": "kubelet"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "crio",
                    "instance": "192.168.0.25:9537",
                    "job": "crio",
                    "namespace": "kube-system",
                    "node": "share-0916c-8vp8z-master-1",
                    "service": "kubelet"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "crio",
                    "instance": "192.168.0.33:9537",
                    "job": "crio",
                    "namespace": "kube-system",
                    "node": "share-0916c-8vp8z-worker-sv7x8",
                    "service": "kubelet"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "crio",
                    "instance": "192.168.0.39:9537",
                    "job": "crio",
                    "namespace": "kube-system",
                    "node": "share-0916c-8vp8z-master-0",
                    "service": "kubelet"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.15:9101",
                    "job": "sdn",
                    "namespace": "openshift-sdn",
                    "pod": "sdn-pnbxm",
                    "service": "sdn"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.17:9101",
                    "job": "sdn",
                    "namespace": "openshift-sdn",
                    "pod": "sdn-75l56",
                    "service": "sdn"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.18:9101",
                    "job": "sdn",
                    "namespace": "openshift-sdn",
                    "pod": "sdn-rkm9w",
                    "service": "sdn"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.25:9101",
                    "job": "sdn",
                    "namespace": "openshift-sdn",
                    "pod": "sdn-c4vbz",
                    "service": "sdn"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.33:9101",
                    "job": "sdn",
                    "namespace": "openshift-sdn",
                    "pod": "sdn-ngvhb",
                    "service": "sdn"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.39:9101",
                    "job": "sdn",
                    "namespace": "openshift-sdn",
                    "pod": "sdn-zm8r9",
                    "service": "sdn"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.17:1936",
                    "job": "router-internal-default",
                    "namespace": "openshift-ingress",
                    "pod": "router-default-594bb9c7cc-lb2v7",
                    "service": "router-internal-default"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            },
            {
                "metric": {
                    "__name__": "up",
                    "endpoint": "metrics",
                    "instance": "192.168.0.18:1936",
                    "job": "router-internal-default",
                    "namespace": "openshift-ingress",
                    "pod": "router-default-594bb9c7cc-qkhlv",
                    "service": "router-internal-default"
                },
                "value": [
                    1568613183.383,
                    "1"
                ]
            }
        ]
    }
}
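
The verification above reads the result by eye. As a rough sketch, the same check can be scripted: build the percent-encoded query used in the curl command, then list any series whose `up` value is not "1". The helper names below are hypothetical. The base URL is the one from the command above, and the response is assumed to have the shape of the JSON shown in this comment.

```python
from collections import defaultdict
from urllib.parse import quote

# Job labels taken from the query in the verification command.
JOBS = ["crio", "sdn", "router-internal-default"]


def build_query(jobs):
    """Build the PromQL expression: up{job="a"} or up{job="b"} ..."""
    return " or ".join(f'up{{job="{job}"}}' for job in jobs)


def query_url(jobs, base="http://localhost:9090"):
    """Return the instant-query URL with the expression percent-encoded."""
    return f"{base}/api/v1/query?query={quote(build_query(jobs), safe='')}"


def down_targets(response):
    """Return {job: [instance, ...]} for every series whose value is not "1"."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('status')}")
    down = defaultdict(list)
    for series in response["data"]["result"]:
        _timestamp, value = series["value"]
        if value != "1":
            metric = series["metric"]
            down[metric["job"]].append(metric["instance"])
    return dict(down)
```

Passing the parsed JSON from this comment to down_targets should yield an empty dictionary, since every series reports "1".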

Comment 7 errata-xmlrpc 2019-10-16 06:40:35 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922
