Description of problem:
This bug is a regression of https://bugzilla.redhat.com/show_bug.cgi?id=1927364, found during an upgrade from 4.6.59-x86_64 -> 4.6.0-0.nightly-2022-07-13-184746.

OpenShift release version:
4.6.0-0.nightly-2022-07-13-184746

Cluster Platform:

How reproducible:

Steps to Reproduce (in detail):

melvinjoseph@mjoseph-mac Downloads % oc new-project test
Now using project "test" on server "https://api.mjoseph-459551.qe.devcluster.openshift.com:6443".
melvinjoseph@mjoseph-mac Downloads % oc create -f https://raw.githubusercontent.com/openshift/verification-tests/master/testdata/routing/list_for_caddy.json
replicationcontroller/caddy-rc created
service/service-secure created
service/service-unsecure created
melvinjoseph@mjoseph-mac Downloads % oc expose svc service-unsecure
route.route.openshift.io/service-unsecure exposed
melvinjoseph@mjoseph-mac Downloads % oc get all
curl
NAME                 READY   STATUS    RESTARTS   AGE
pod/caddy-rc-k9zz7   1/1     Running   0          13s
pod/caddy-rc-wxqk5   1/1     Running   0          13s

NAME                             DESIRED   CURRENT   READY   AGE
replicationcontroller/caddy-rc   2         2         2       13s

NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/service-secure     ClusterIP   172.30.228.11   <none>        27443/TCP   13s
service/service-unsecure   ClusterIP   172.30.229.16   <none>        27017/TCP   13s

NAME                                        HOST/PORT                                                               PATH   SERVICES           PORT   TERMINATION   WILDCARD
route.route.openshift.io/service-unsecure   service-unsecure-test.apps.mjoseph-459551.qe.devcluster.openshift.com          service-unsecure   http                 None
melvinjoseph@mjoseph-mac Downloads % curl service-unsecure-test.apps.mjoseph-459551.qe.devcluster.openshift.com
Hello-OpenShift-1 http-8080
melvinjoseph@mjoseph-mac Downloads % oc idle service-unsecure
The service "test/service-unsecure" has been marked as idled
The service will unidle ReplicationController "test/caddy-rc" to 2 replicas once it receives traffic
ReplicationController "test/caddy-rc" has been idled
melvinjoseph@mjoseph-mac Downloads % 6. Check the service service-unsecure
oc get svc service-unsecure -o yaml
zsh: command not found: 6.
apiVersion: v1
kind: Service
metadata:
  annotations:
    idling.alpha.openshift.io/idled-at: "2022-07-15T12:10:03Z"
    idling.alpha.openshift.io/unidle-targets: '[{"kind":"ReplicationController","name":"caddy-rc","replicas":2}]'
  creationTimestamp: "2022-07-15T12:09:35Z"
  labels:
    name: service-unsecure
  name: service-unsecure
  namespace: test
  resourceVersion: "44813"
  selfLink: /api/v1/namespaces/test/services/service-unsecure
  uid: 574481cb-e9de-4c97-8d1c-80a2a64009b8
spec:
  clusterIP: 172.30.229.16
  ports:
  - name: http
    port: 27017
    protocol: TCP
    targetPort: 8080
  selector:
    name: caddy-pods
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
melvinjoseph@mjoseph-mac Downloads %
melvinjoseph@mjoseph-mac Downloads % oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:3864bf2f74ce66cb596753d2ddd3cb7b8d8977e4e3e70ae2bd9660c92328378d --allow-explicit-upgrade=true --force
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.ci.openshift.org/ocp/release@sha256:3864bf2f74ce66cb596753d2ddd3cb7b8d8977e4e3e70ae2bd9660c92328378d
melvinjoseph@mjoseph-mac Downloads % oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.0-0.nightly-2022-07-13-184746   True        False         False      30m
cloud-credential                           4.6.0-0.nightly-2022-07-13-184746   True        False         False      176m
cluster-autoscaler                         4.6.0-0.nightly-2022-07-13-184746   True        False         False      172m
config-operator                            4.6.0-0.nightly-2022-07-13-184746   True        False         False      174m
console                                    4.6.0-0.nightly-2022-07-13-184746   True        False         False      50m
csi-snapshot-controller                    4.6.0-0.nightly-2022-07-13-184746   True        False         False      50m
dns                                        4.6.0-0.nightly-2022-07-13-184746   True        False         False      173m
etcd                                       4.6.0-0.nightly-2022-07-13-184746   True        False         False      172m
image-registry                             4.6.0-0.nightly-2022-07-13-184746   True        False         False      61m
ingress                                    4.6.0-0.nightly-2022-07-13-184746   True        False         False      166m
insights                                   4.6.0-0.nightly-2022-07-13-184746   True        False         False      174m
kube-apiserver                             4.6.0-0.nightly-2022-07-13-184746   True        False         False      172m
kube-controller-manager                    4.6.0-0.nightly-2022-07-13-184746   True        False         False      172m
kube-scheduler                             4.6.0-0.nightly-2022-07-13-184746   True        False         False      171m
kube-storage-version-migrator              4.6.0-0.nightly-2022-07-13-184746   True        False         False      61m
machine-api                                4.6.0-0.nightly-2022-07-13-184746   True        False         False      168m
machine-approver                           4.6.0-0.nightly-2022-07-13-184746   True        False         False      173m
machine-config                             4.6.0-0.nightly-2022-07-13-184746   True        False         False      30m
marketplace                                4.6.0-0.nightly-2022-07-13-184746   True        False         False      50m
monitoring                                 4.6.0-0.nightly-2022-07-13-184746   True        False         False      29m
network                                    4.6.0-0.nightly-2022-07-13-184746   True        False         False      174m
node-tuning                                4.6.0-0.nightly-2022-07-13-184746   True        False         False      94m
openshift-apiserver                        4.6.0-0.nightly-2022-07-13-184746   True        False         False      30m
openshift-controller-manager               4.6.0-0.nightly-2022-07-13-184746   True        False         False      93m
openshift-samples                          4.6.0-0.nightly-2022-07-13-184746   True        False         False      84m
operator-lifecycle-manager                 4.6.0-0.nightly-2022-07-13-184746   True        False         False      173m
operator-lifecycle-manager-catalog         4.6.0-0.nightly-2022-07-13-184746   True        False         False      173m
operator-lifecycle-manager-packageserver   4.6.0-0.nightly-2022-07-13-184746   True        False         False      46m
service-ca                                 4.6.0-0.nightly-2022-07-13-184746   True        False         False      174m
storage                                    4.6.0-0.nightly-2022-07-13-184746   True        False         False      174m
melvinjoseph@mjoseph-mac Downloads %
melvinjoseph@mjoseph-mac Downloads %
melvinjoseph@mjoseph-mac Downloads % oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2022-07-13-184746   True        False         29m     Cluster version is 4.6.0-0.nightly-2022-07-13-184746
melvinjoseph@mjoseph-mac Downloads % oc get svc service-unsecure -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    idling.alpha.openshift.io/idled-at: "2022-07-15T12:10:03Z"
    idling.alpha.openshift.io/unidle-targets: '[{"kind":"ReplicationController","name":"caddy-rc","replicas":2}]'
  creationTimestamp: "2022-07-15T12:09:35Z"
  labels:
    name: service-unsecure
  name: service-unsecure
  namespace: test
  resourceVersion: "44813"
  selfLink: /api/v1/namespaces/test/services/service-unsecure
  uid: 574481cb-e9de-4c97-8d1c-80a2a64009b8
spec:
  clusterIP: 172.30.229.16
  ports:
  - name: http
    port: 27017
    protocol: TCP
    targetPort: 8080
  selector:
    name: caddy-pods
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
melvinjoseph@mjoseph-mac Downloads % oc get po
No resources found in test namespace.
melvinjoseph@mjoseph-mac Downloads % oc get route
NAME               HOST/PORT                                                               PATH   SERVICES           PORT   TERMINATION   WILDCARD
service-unsecure   service-unsecure-test.apps.mjoseph-459551.qe.devcluster.openshift.com          service-unsecure   http                 None
melvinjoseph@mjoseph-mac Downloads % curl -Ik http://service-unsecure-test.apps.mjoseph-459551.qe.devcluster.openshift.com
HTTP/1.1 503 Service Unavailable
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache, no-store
Content-Type: text/html
Date: Fri, 15 Jul 2022 14:19:32 GMT
X-Cache: MISS from f4a5b3556007
X-Cache-Lookup: MISS from f4a5b3556007:3128
Via: 1.1 f4a5b3556007 (squid/4.13)
Connection: keep-alive

melvinjoseph@mjoseph-mac Downloads % oc get po
No resources found in test namespace.
melvinjoseph@mjoseph-mac Downloads % oc get all
NAME                             DESIRED   CURRENT   READY   AGE
replicationcontroller/caddy-rc   0         0         0       131m

NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/service-secure     ClusterIP   172.30.228.11   <none>        27443/TCP   131m
service/service-unsecure   ClusterIP   172.30.229.16   <none>        27017/TCP   131m

NAME                                        HOST/PORT                                                               PATH   SERVICES           PORT   TERMINATION   WILDCARD
route.route.openshift.io/service-unsecure   service-unsecure-test.apps.mjoseph-459551.qe.devcluster.openshift.com          service-unsecure   http                 None
melvinjoseph@mjoseph-mac Downloads % oc get route
NAME               HOST/PORT                                                               PATH   SERVICES           PORT   TERMINATION   WILDCARD
service-unsecure   service-unsecure-test.apps.mjoseph-459551.qe.devcluster.openshift.com          service-unsecure   http                 None

Actual results:

Expected results:

curl service-unsecure-test.apps.mjoseph-rout14.qe.devcluster.openshift.com
Hello-OpenShift-1 http-8080

oc get svc service-unsecure -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-11-15T08:40:38Z"
  labels:
    name: service-unsecure
  name: service-unsecure
  namespace: test
  resourceVersion: "157534"
  uid: 15468a60-b2e5-4972-a7d3-a10f44e87ecf

Impact of the problem:

Additional info:

** Please do not disregard the report template; filling the template out as much as possible will allow us to help you. Please consider attaching a must-gather archive (via `oc adm must-gather`). Please review must-gather contents for sensitive information before attaching any must-gathers to a bugzilla report. You may also mark the bug private if you wish.
Profile: upi-on-vsphere/versioned-installer-vmc7-ovn-static_network-hw14-ci
This is potentially a blocker. I'll raise this with my team to investigate it as soon as we can. What version of oc are you using to idle the route? Do the endpoints and endpointslice objects have the idling annotations set before/after upgrade?
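One possible way to answer the annotation question is sketched below. The helper names `idling_annotations` and `idling_state` are hypothetical; the two annotation keys are the ones visible in the Service YAML in the description. The sketch only inspects YAML passed on stdin, and it assumes the objects to check live in the `test` namespace used in the reproduction steps.

```shell
#!/bin/sh
# Hypothetical helper: print any idling annotation lines found in YAML on stdin.
# The keys matched here are the ones shown in the Service dump in the description.
idling_annotations() {
    grep -E 'idling\.alpha\.openshift\.io/(idled-at|unidle-targets)' || true
}

# Hypothetical helper: report whether the manifest on stdin carries an idling annotation.
idling_state() {
    if [ -n "$(idling_annotations)" ]; then
        echo "idled"
    else
        echo "not-idled"
    fi
}

# Usage against a live cluster (needs oc and a logged-in session), run once
# before and once after the upgrade:
#   oc get endpoints service-unsecure -n test -o yaml | idling_state
#   oc get endpointslices -n test -l kubernetes.io/service-name=service-unsecure -o yaml | idling_state
```

Comparing the two results before and after the upgrade would show whether the annotations survive on each object type.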
https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4.6.0-0.nightly/release/4.6.0-0.nightly-2022-07-13-184746?from=4.6.59 shows only one change, in ironic-machine-os-downloader; no changes in oc, router, kube-proxy, ovn-kubernetes, or any other component that could conceivably cause this issue.
(In reply to Miciah Dashiel Butler Masters from comment #2)
> This is potentially a blocker. I'll raise this with my team to investigate
> it as soon as we can.
>
> What version of oc are you using to idle the route?

Initially I tested with the 4.10 oc client, but today I tested the same with the 4.6.59 oc client.

> Do the endpoints and endpointslice objects have the idling annotations set
> before/after upgrade?

The idling annotations are set before the upgrade.
@Miciah, could this issue be linked to https://access.redhat.com/solutions/6671241?
(In reply to Melvin Joseph from comment #6)
> @Miciah, can be the issue linked to this
> https://access.redhat.com/solutions/6671241?

This Access article does seem to describe the same issue as this BZ. If I understand the article correctly, this is a long-standing regression in OVN-Kubernetes, not a new regression in 4.6.z. I'll set blocker-.
Team, I was trying to find out whether the regression hits on all profiles, and I want to share one finding: the issue does not seem to occur on all profiles of 4.6.z. Today I tested on IPI GCP, and the bug does not occur there.

melvinjoseph@mjoseph-mac Downloads % oc get clusterversion
oc version
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.59    True        False         12m     Cluster version is 4.6.59
melvinjoseph@mjoseph-mac Downloads % oc version
Client Version: 4.6.59
Server Version: 4.6.59
Kubernetes Version: v1.19.16+8203b20
<----snip--->
melvinjoseph@mjoseph-mac Downloads % oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2022-07-13-184746   True        False         23m     Cluster version is 4.6.0-0.nightly-2022-07-13-184746
melvinjoseph@mjoseph-mac Downloads % curl service-unsecure-test.apps.mjoseph-bug1.qe.gcp.devcluster.openshift.com
Hello-OpenShift-1 http-8080
melvinjoseph@mjoseph-mac Downloads % oc get infrastructure cluster -o=jsonpath={.spec.platformSpec.type}
GCP

The idling annotation is also removed on this cluster. However, the bug was hit twice on the `upi-on-vsphere/versioned-installer-vmc7-ovn-static_network-hw14-ci` profile.
Based on <https://access.redhat.com/solutions/6671241> and bug 2041307, comment 12, idling is known not to work on OpenShift 4.7 and earlier when using OVN-Kubernetes. Users who are affected by this issue should upgrade.

*** This bug has been marked as a duplicate of bug 2041307 ***
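As a rough sketch of how a cluster could be checked against the rule stated above (OVN-Kubernetes on 4.7 and earlier), the hypothetical helper `idling_affected` below classifies a version string and network type. It assumes the network type is reported as `OVNKubernetes`, and it encodes only the rule from this comment, not the full detail of the linked article.

```shell
#!/bin/sh
# Hypothetical helper: classify a cluster from its version and network type.
# Rule taken from the comment above: OVN-Kubernetes on 4.7 and earlier is affected.
idling_affected() {
    version=$1
    network_type=$2
    major=${version%%.*}   # text before the first dot, e.g. "4"
    rest=${version#*.}     # text after the first dot, e.g. "6.59"
    minor=${rest%%.*}      # text before the next dot, e.g. "6"
    if [ "$network_type" = "OVNKubernetes" ] && [ "$major" -eq 4 ] && [ "$minor" -le 7 ]; then
        echo "affected"
    else
        echo "not-affected"
    fi
}

# Usage against a live cluster (needs oc and a logged-in session):
#   idling_affected \
#     "$(oc get clusterversion version -o jsonpath='{.status.desired.version}')" \
#     "$(oc get network.config/cluster -o jsonpath='{.status.networkType}')"
```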