Bug 1820580 - [4.4] some targets are down with Kuryr network
Summary: [4.4] some targets are down with Kuryr network
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.4.0
Assignee: Michał Dulko
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1822861
Blocks:
 
Reported: 2020-04-03 11:48 UTC by Junqi Zhao
Modified: 2020-05-04 11:48 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1822861
Environment:
Last Closed: 2020-05-04 11:48:02 UTC
Target Upstream Version:
Embargoed:


Attachments
504 error for prometheus API in console (187.15 KB, image/png), 2020-04-03 11:48 UTC, Junqi Zhao
all targets are down on Prometheus UI (168.28 KB, image/png), 2020-04-03 11:49 UTC, Junqi Zhao


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 585 | 0 | None | closed | Bug 1820580: [release-4.4] Kuryr: Open metric endpoint ports from pod subnets | 2020-05-29 08:52:45 UTC
Red Hat Product Errata | RHBA-2020:0581 | 0 | None | None | None | 2020-05-04 11:48:21 UTC

Description Junqi Zhao 2020-04-03 11:48:04 UTC
Created attachment 1675995
504 error for prometheus API in console

Description of problem:
The console shows a 504 error for the Prometheus API, and all targets are down when the cluster uses the Kuryr network; see the attached picture. The error does not occur with other network types.
# oc get network/cluster -oyaml
apiVersion: config.openshift.io/v1
kind: Network
metadata:
  creationTimestamp: "2020-04-03T02:32:18Z"
  generation: 2
  name: cluster
  resourceVersion: "2607"
  selfLink: /apis/config.openshift.io/v1/networks/cluster
  uid: c12d517e-585d-45bc-bef1-2a566aec0acd
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  externalIP:
    policy: {}
  networkType: Kuryr
  serviceNetwork:
  - 172.30.0.0/16
status:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: Kuryr
  serviceNetwork:
  - 172.30.0.0/16

All endpoints are down; see the picture, or check from the CLI, for example:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://192.168.0.26:9100/metrics'
curl: (7) Failed connect to 192.168.0.26:9100; Connection timed out
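
The manual check above can be repeated for several targets with a small wrapper. The following is a minimal sketch, not part of the original report: the function names classify_curl_exit and probe_target are hypothetical, the 5-second --connect-timeout is an arbitrary choice, and it assumes that `oc exec` passes the remote curl exit code through. Without --connect-timeout, as in the report, curl waits for the kernel connect timeout and exits with 7; with it, an unanswered connection is expected to exit with 28 instead.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the oc and curl invocations mirror the report.

# Map a curl exit code to a short status word.
classify_curl_exit() {
  case "$1" in
    0)  echo "up" ;;
    7)  echo "connect-failed" ;;  # refused, unreachable, or kernel-level connect timeout
    28) echo "timed-out" ;;       # --connect-timeout or --max-time expired
    *)  echo "error-$1" ;;
  esac
}

# Probe one host:port metrics endpoint from inside the prometheus container.
# Assumption: oc exec returns the exit code of the remote curl command.
probe_target() {
  local target="$1" token rc=0
  token="$(oc sa get-token prometheus-k8s -n openshift-monitoring)"
  oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
    curl -sk -o /dev/null --connect-timeout 5 \
    -H "Authorization: Bearer ${token}" "https://${target}/metrics" || rc=$?
  echo "${target} $(classify_curl_exit "${rc}")"
}
```

For example, `probe_target 192.168.0.26:9100` prints the target followed by its status word. A timeout rather than an immediate refusal usually suggests that traffic is being dropped before it reaches the endpoint, which would be consistent with the linked fix that opens metric endpoint ports from pod subnets.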

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Log in to the admin console as cluster admin and check the "Home -> Overview" page
2. Check the targets in the Prometheus UI

Actual results:
The console shows a 504 error for the Prometheus API, and all targets are down with the Kuryr network

Expected results:
No error

Additional info:

Comment 1 Junqi Zhao 2020-04-03 11:49:36 UTC
Created attachment 1675996
all targets are down on Prometheus UI

Comment 14 Junqi Zhao 2020-04-20 01:17:16 UTC
Will verify this after Bug 1825215 is fixed.

Comment 15 Junqi Zhao 2020-04-23 07:26:29 UTC
Tested with 4.4.0-0.nightly-2020-04-21-210658; all targets are UP.

Comment 17 errata-xmlrpc 2020-05-04 11:48:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

