Bug 2084079 - prometheus route is not updated to "path: /api" after upgrade from 4.10 to 4.11
Summary: prometheus route is not updated to "path: /api" after upgrade from 4.10 to 4.11
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: 4.11.0
Assignee: Joao Marcal
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks: 2089806
Reported: 2022-05-11 10:46 UTC by Junqi Zhao
Modified: 2022-08-10 11:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2089806
Environment:
Last Closed: 2022-08-10 11:11:18 UTC
Target Upstream Version:
Embargoed:


Attachments
access prometheus route now is 404 after upgrade to 4.11 (29.00 KB, image/png)
2022-05-11 10:46 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 1671 0 None open Bug 2084079: Refactors CreateRouteIfNotExists to CreateOrUpdateRoute 2022-05-17 16:05:38 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 11:11:45 UTC

Description Junqi Zhao 2022-05-11 10:46:34 UTC
Created attachment 1878629
access prometheus route now is 404 after upgrade to 4.11

Description of problem:
Upgraded a single-node OpenShift (SNO) cluster from 4.10.13 to 4.11.0-0.nightly-2022-05-10-045003.

# oc image info registry.ci.openshift.org/ocp/release:4.11.0-0.nightly-2022-05-10-045003
Name:        registry.ci.openshift.org/ocp/release:4.11.0-0.nightly-2022-05-10-045003
Digest:      sha256:0f0789cbecc2598d71fc8ff6e17a0e61c4e2067414388e82cbb3aaab5e6b535f
...

# oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:0f0789cbecc2598d71fc8ff6e17a0e61c4e2067414388e82cbb3aaab5e6b535f --force

# oc get clusterversion -oyaml
...
    desired:
      image: registry.ci.openshift.org/ocp/release@sha256:0f0789cbecc2598d71fc8ff6e17a0e61c4e2067414388e82cbb3aaab5e6b535f
      version: 4.11.0-0.nightly-2022-05-10-045003
    history:
    - completionTime: "2022-05-11T08:38:18Z"
      image: registry.ci.openshift.org/ocp/release@sha256:0f0789cbecc2598d71fc8ff6e17a0e61c4e2067414388e82cbb3aaab5e6b535f
      startedTime: "2022-05-11T07:29:25Z"
      state: Completed
      verified: true
      version: 4.11.0-0.nightly-2022-05-10-045003
    - completionTime: "2022-05-11T04:31:56Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:4f516616baed3cf84585e753359f7ef2153ae139c2e80e0191902fbd073c4143
      startedTime: "2022-05-11T03:53:47Z"
      state: Completed
      verified: false
      version: 4.10.13
    observedGeneration: 10

Since 4.11, the prometheus-k8s route is defined with "path: /api". After the upgrade, however, the route remains the same as in 4.10 (no path is set). Accessing the Prometheus route navigates to
https://${prometheus-k8s-route}/${console-route}/monitoring/graph and fails with:
404 page not found

In 4.11, the Prometheus route should instead respond with "Application is not available".

# oc -n openshift-monitoring get route  | grep prometheus-k8s
prometheus-k8s            prometheus-k8s-openshift-monitoring.apps.qe-upg511.qe.devcluster.openshift.com                        prometheus-k8s      web    reencrypt/Redirect   None
prometheus-k8s-federate   prometheus-k8s-federate-openshift-monitoring.apps.qe-upg511.qe.devcluster.openshift.com   /federate   prometheus-k8s      web    reencrypt/Redirect   None


# oc -n openshift-monitoring get route prometheus-k8s -oyaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    openshift.io/host.generated: "true"
  creationTimestamp: "2022-05-11T04:30:30Z"
  name: prometheus-k8s
  namespace: openshift-monitoring
  resourceVersion: "23931"
  uid: aa65b977-35e9-4e8f-9078-ab34397b0d89
spec:
  host: prometheus-k8s-openshift-monitoring.apps.qe-upg511.qe.devcluster.openshift.com
  port:
    targetPort: web
  tls:
    insecureEdgeTerminationPolicy: Redirect
    termination: reencrypt
  to:
    kind: Service
    name: prometheus-k8s
    weight: 100
  wildcardPolicy: None
status:
  ingress:
  - conditions:
    - lastTransitionTime: "2022-05-11T04:30:30Z"
      status: "True"
      type: Admitted
    host: prometheus-k8s-openshift-monitoring.apps.qe-upg511.qe.devcluster.openshift.com
    routerCanonicalHostname: router-default.apps.qe-upg511.qe.devcluster.openshift.com
    routerName: default
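
The missing field can also be checked directly instead of reading the full YAML. The following is a minimal sketch, assuming `oc` is logged in to the cluster. The helper names `route_path` and `check_route_path` are hypothetical, and the expected value `/api` is taken from the description above.

```shell
# Hypothetical helpers, not part of oc: compare a route's spec.path
# with an expected value.

# Print the spec.path of a route (prints nothing when no path is set).
route_path() {
  # $1 = namespace, $2 = route name
  oc -n "$1" get route "$2" -o jsonpath='{.spec.path}'
}

# Report whether an observed path matches the expected one.
check_route_path() {
  # $1 = expected path, $2 = observed path (may be empty)
  if [ "$1" = "$2" ]; then
    echo "ok: path is '$2'"
  else
    echo "mismatch: expected '$1', got '$2'"
  fi
}

# Usage against a live cluster (commented out so the helpers can be sourced offline):
# check_route_path /api "$(route_path openshift-monitoring prometheus-k8s)"
```

On a cluster in the state shown above, where the route has no path, the check would be expected to report a mismatch.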

This is the only issue; the API is still accessible through the Prometheus route, for example:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.qe-upg511.qe.devcluster.openshift.com/api/v1/query?query=cluster_infrastructure_provider' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "cluster_infrastructure_provider",
          "container": "kube-apiserver-operator",
          "endpoint": "https",
          "instance": "10.128.1.6:8443",
          "job": "metrics",
          "namespace": "openshift-kube-apiserver-operator",
          "pod": "kube-apiserver-operator-698f86d976-wxx44",
          "service": "metrics",
          "type": "None"
        },
        "value": [
          1652265840.76,
          "0"
        ]
      }
    ]
  }
}


Version-Release number of selected component (if applicable):
Cluster upgraded from 4.10.13 to 4.11.0-0.nightly-2022-05-10-045003

How reproducible:
always

Steps to Reproduce:
1. See the description above.

Actual results:
The prometheus-k8s route is not updated to "path: /api" after upgrading from 4.10 to 4.11.

Expected results:
The route should be updated to include "path: /api".

Additional info:
Other monitoring routes do not have this issue.
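
The linked pull request is titled "Refactors CreateRouteIfNotExists to CreateOrUpdateRoute", which points to create-if-absent semantics as the likely reason an existing route kept its 4.10 spec. The toy model below illustrates that difference under stated assumptions: it is not the operator's code, the "cluster" is just a temporary directory with one file per route, and the function names only mirror the pull request title.

```shell
# Toy model (assumption): one file per route, the file content is the route's path.
store=$(mktemp -d)

# Create the route only when it is absent; an existing route is left untouched.
create_route_if_not_exists() {
  # $1 = route name, $2 = desired path
  [ -e "$store/$1" ] || printf '%s' "$2" > "$store/$1"
}

# Create the route, or overwrite it so that it matches the desired path.
create_or_update_route() {
  # $1 = route name, $2 = desired path
  printf '%s' "$2" > "$store/$1"
}

# "4.10" creates the route without a path.
create_route_if_not_exists prometheus-k8s ""

# "4.11" wants path /api, but the route already exists, so nothing changes.
create_route_if_not_exists prometheus-k8s /api
stale=$(cat "$store/prometheus-k8s")

# With create-or-update semantics the existing route picks up the new path.
create_or_update_route prometheus-k8s /api
fixed=$(cat "$store/prometheus-k8s")

echo "if-not-exists: '$stale'  create-or-update: '$fixed'"
rm -rf "$store"
```

In this model a fresh install gets the new path either way; only a route that already exists before the upgrade stays stale, which matches the upgrade-only symptom reported here.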

Comment 2 Junqi Zhao 2022-05-12 12:33:43 UTC
In another IPI vSphere cluster (3 masters, 2 workers), upgraded along the path 4.8.29 -> 4.9.0-0.nightly-2022-05-11-100812 -> 4.10.0-0.nightly-2022-05-11-183751 -> 4.11.0-0.nightly-2022-05-11-054135, the same issue occurs with the alertmanager-main, thanos-querier, and thanos-ruler routes.
# oc get node
NAME                           STATUS   ROLES    AGE   VERSION
juzhao-48-8msj7-master-0       Ready    master   9h    v1.23.3+69213f8
juzhao-48-8msj7-master-1       Ready    master   9h    v1.23.3+69213f8
juzhao-48-8msj7-master-2       Ready    master   9h    v1.23.3+69213f8
juzhao-48-8msj7-worker-66zr5   Ready    worker   9h    v1.23.3+69213f8
juzhao-48-8msj7-worker-mnqjb   Ready    worker   9h    v1.23.3+69213f8


# oc -n openshift-monitoring get route
NAME                      HOST/PORT                                                                                 PATH        SERVICES            PORT   TERMINATION          WILDCARD
alertmanager-main         alertmanager-main-openshift-monitoring.apps.juzhao-48.qe.devcluster.openshift.com                     alertmanager-main   web    reencrypt/Redirect   None
prometheus-k8s            prometheus-k8s-openshift-monitoring.apps.juzhao-48.qe.devcluster.openshift.com                        prometheus-k8s      web    reencrypt/Redirect   None
prometheus-k8s-federate   prometheus-k8s-federate-openshift-monitoring.apps.juzhao-48.qe.devcluster.openshift.com   /federate   prometheus-k8s      web    reencrypt/Redirect   None
thanos-querier            thanos-querier-openshift-monitoring.apps.juzhao-48.qe.devcluster.openshift.com                        thanos-querier      web    reencrypt/Redirect   None

# oc -n openshift-user-workload-monitoring get route
NAME           HOST/PORT                                                                                    PATH        SERVICES                   PORT       TERMINATION          WILDCARD
federate       federate-openshift-user-workload-monitoring.apps.juzhao-48.qe.devcluster.openshift.com       /federate   prometheus-user-workload   federate   reencrypt/Redirect   None
thanos-ruler   thanos-ruler-openshift-user-workload-monitoring.apps.juzhao-48.qe.devcluster.openshift.com               thanos-ruler               web        reencrypt/Redirect   None
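
To spot affected routes in both monitoring namespaces without scanning the PATH column by eye, the listings above can be reduced to the routes that have no path set. This is a sketch only: the helper names are hypothetical, it assumes a logged-in `oc`, and it merely lists routes with an empty path without deciding which of them should have one.

```shell
# Hypothetical helpers, not part of oc.

# Print "name<TAB>path" for every route in a namespace.
list_route_paths() {
  # $1 = namespace
  oc -n "$1" get route \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.path}{"\n"}{end}'
}

# Read "name<TAB>path" lines on stdin and print the names whose path is empty.
routes_without_path() {
  awk -F '\t' '$1 != "" && $2 == "" { print $1 }'
}

# Usage against a live cluster (commented out so the helpers can be sourced offline):
# for ns in openshift-monitoring openshift-user-workload-monitoring; do
#   list_route_paths "$ns" | routes_without_path
# done
```

Routes that already carry a path, such as prometheus-k8s-federate and federate with /federate, are filtered out by the `awk` condition.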

Comment 12 errata-xmlrpc 2022-08-10 11:11:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

