Bug 1900991

Summary: accessing the route cannot wake up the idled resources
Product: OpenShift Container Platform Reporter: Hongan Li <hongli>
Component: Networking Assignee: Andrew McDermott <amcdermo>
Networking sub component: router QA Contact: Arvind iyengar <aiyengar>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: abraj, aiyengar, amcdermo, aos-bugs, aos-network-edge-staff, mjoseph, pducai, rabdulra
Version: 4.6 Keywords: Reopened
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1900989 Environment:
Last Closed: 2021-04-16 16:13:14 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1900989    
Bug Blocks:    

Comment 2 Andrew McDermott 2020-12-04 17:06:26 UTC
Tagging with UpcomingSprint while investigation is either ongoing or
pending. Will be considered for earlier release versions when
diagnosed and resolved.

Comment 4 Andrew McDermott 2021-02-04 08:30:36 UTC
A note for QA testing: you'll need to use an oc binary that comes with the installer as the oc command has been updated to annotate services with the idle annotations.
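
One way to check whether a given oc binary actually annotated the service is to look for an idle annotation in the service manifest after running "oc idle". The sketch below is only illustrative: the helper name has_idle_annotation is hypothetical, and the annotation key idling.alpha.openshift.io/idled-at is an assumption that is not stated in this report, so confirm it against the cluster before relying on it.

```shell
# has_idle_annotation (hypothetical helper): reads a Service manifest
# (YAML or JSON) on stdin and succeeds if the assumed idle annotation
# key appears anywhere in it.
has_idle_annotation() {
    # -F matches the key as a fixed string, -q suppresses output.
    grep -qF 'idling.alpha.openshift.io/idled-at'
}

# Possible usage against a live cluster (not run here):
#   oc get svc service-unsecure -o yaml | has_idle_annotation \
#       && echo "service is annotated as idled" \
#       || echo "no idle annotation found on the service"
```

If the annotation is missing after idling, the oc client in use probably predates the change described above.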

Comment 6 Arvind iyengar 2021-02-12 06:56:21 UTC
Verified in "4.6.0-0.nightly-2021-02-11-040306" with the same version of the oc client. With this payload, the idled service wakes up and its backend becomes active when traffic is sent to the route:
-----
$ oc version
Client Version: 4.6.0-0.nightly-2021-02-11-040306
Server Version: 4.6.0-0.nightly-2021-02-11-040306
Kubernetes Version: v1.19.0+6e846d7

$ oc expose svc service-unsecure
route.route.openshift.io/service-unsecure exposed


$ oc get all                    
NAME                 READY   STATUS    RESTARTS   AGE
pod/caddy-rc-6cmm4   1/1     Running   0          47s
pod/caddy-rc-8lfl4   1/1     Running   0          47s

NAME                             DESIRED   CURRENT   READY   AGE
replicationcontroller/caddy-rc   2         2         2       47s

NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/service-secure     ClusterIP   172.30.130.117   <none>        27443/TCP   47s
service/service-unsecure   ClusterIP   172.30.45.156    <none>        27017/TCP   46s

NAME                                        HOST/PORT                                                                           PATH   SERVICES           PORT   TERMINATION   WILDCARD
route.route.openshift.io/service-unsecure   service-unsecure-test1.apps.ci-ln-3nr7xwk-d5d6b.origin-ci-int-aws.dev.rhcloud.com          service-unsecure   http                 None


$ curl service-unsecure-test1.apps.ci-ln-3nr7xwk-d5d6b.origin-ci-int-aws.dev.rhcloud.com -I
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 28
Content-Type: text/html; charset=utf-8
Last-Modified: Tue, 27 Feb 2018 02:43:29 GMT
Server: Caddy
Date: Fri, 12 Feb 2021 06:48:33 GMT
Set-Cookie: e96c07fa08f2609cadf847f019750244=b71de34503fbaacdd109926c9e8c5af9; path=/; HttpOnly
Cache-control: private

$ oc  idle service-unsecure                                                  
WARNING: idling when network policies are in place may cause connections to bypass network policy entirely
The service "test1/service-unsecure" has been marked as idled 
The service will unidle ReplicationController "test1/caddy-rc" to 2 replicas once it receives traffic 
ReplicationController "test1/caddy-rc" has been idled 

$ oc get all
NAME                             DESIRED   CURRENT   READY   AGE
replicationcontroller/caddy-rc   0         0         0       5m50s

NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/service-secure     ClusterIP   172.30.130.117   <none>        27443/TCP   5m50s
service/service-unsecure   ClusterIP   172.30.45.156    <none>        27017/TCP   5m49s

NAME                                        HOST/PORT                                                                           PATH   SERVICES           PORT   TERMINATION   WILDCARD
route.route.openshift.io/service-unsecure   service-unsecure-test1.apps.ci-ln-3nr7xwk-d5d6b.origin-ci-int-aws.dev.rhcloud.com          service-unsecure   http                 None


$ curl service-unsecure-test1.apps.ci-ln-3nr7xwk-d5d6b.origin-ci-int-aws.dev.rhcloud.com -I
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 28
Content-Type: text/html; charset=utf-8
Last-Modified: Tue, 27 Feb 2018 02:43:29 GMT
Server: Caddy
Date: Fri, 12 Feb 2021 06:49:26 GMT
Set-Cookie: e96c07fa08f2609cadf847f019750244=a5ce95ebd48d1fc747c37c00fe549a6b; path=/; HttpOnly
Cache-control: private

$ oc get all
NAME                 READY   STATUS    RESTARTS   AGE
pod/caddy-rc-cdqnl   1/1     Running   0          12s
pod/caddy-rc-gg85h   1/1     Running   0          12s

NAME                             DESIRED   CURRENT   READY   AGE
replicationcontroller/caddy-rc   2         2         2       3m5s

NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/service-secure     ClusterIP   172.30.130.117   <none>        27443/TCP   3m5s
service/service-unsecure   ClusterIP   172.30.45.156    <none>        27017/TCP   3m4s

NAME                                        HOST/PORT                                                                           PATH   SERVICES           PORT   TERMINATION   WILDCARD
route.route.openshift.io/service-unsecure   service-unsecure-test1.apps.ci-ln-3nr7xwk-d5d6b.origin-ci-int-aws.dev.rhcloud.com          service-unsecure   http                 None
-----

Comment 9 errata-xmlrpc 2021-02-22 13:54:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.18 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0510

Comment 10 peter ducai 2021-03-25 09:01:56 UTC
Customer upgraded to 4.6.18 (using 4.6.20 client) but the issue still persists on some routes.

Here's a route that worked with the idle feature:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    openshift.io/host.generated: "true"
  name: nginx
  namespace: jpecora-test
spec:
  host: nginx-test.apps.osesbx.mtb.com
  path: /test
  port:
    targetPort: 8080
  tls:
    insecureEdgeTerminationPolicy: Redirect
    termination: edge
  to:
    kind: Service
    name: nginx
    weight: 100
  wildcardPolicy: None


And one that didn't:

oc neat get -- route login-ui -n development
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations:
    meta.helm.sh/release-name: login-ui
    meta.helm.sh/release-namespace: development
  labels:
    app.kubernetes.io/managed-by: Helm
  name: login-ui
  namespace: development
spec:
  host: digitalbanking-sbx.apps.osesbx.mtb.com
  path: /login
  port:
    targetPort: 8080
  tls:
    insecureEdgeTerminationPolicy: Redirect
    termination: edge
  to:
    kind: Service
    name: login-ui
    weight: 100
  wildcardPolicy: None

FYI: adding label openshift.io/host.generated: "true" didn't change anything.

Comment 11 Andrew McDermott 2021-03-25 10:59:13 UTC
(In reply to peter ducai from comment #10)
> Customer upgraded to 4.6.18 (using 4.6.20 client) but the issue still
> persists on some routes.
> 
> Here's a route that worked with the idle feature:
> 
> apiVersion: route.openshift.io/v1
> kind: Route
> metadata:
>   annotations:
>     openshift.io/host.generated: "true"
>   name: nginx
>   namespace: jpecora-test
> spec:
>   host: nginx-test.apps.osesbx.mtb.com
>   path: /test
>   port:
>     targetPort: 8080
>   tls:
>     insecureEdgeTerminationPolicy: Redirect
>     termination: edge
>   to:
>     kind: Service
>     name: nginx
>     weight: 100
>   wildcardPolicy: None
> 
> 
> And one that didn't:
> 
> oc neat get -- route login-ui -n development
> apiVersion: route.openshift.io/v1
> kind: Route
> metadata:
>   annotations:
>     meta.helm.sh/release-name: login-ui
>     meta.helm.sh/release-namespace: development
>   labels:
>     app.kubernetes.io/managed-by: Helm
>   name: login-ui
>   namespace: development
> spec:
>   host: digitalbanking-sbx.apps.osesbx.mtb.com
>   path: /login
>   port:
>     targetPort: 8080
>   tls:
>     insecureEdgeTerminationPolicy: Redirect
>     termination: edge
>   to:
>     kind: Service
>     name: login-ui
>     weight: 100
>   wildcardPolicy: None
> 
> FYI: adding label openshift.io/host.generated: "true" didn't change anything.

Are we saying that both routes/services were idled and only one was automatically unidled when it received traffic?

There was a separate fix which ensures that, on upgrade, existing idled services get the annotation added when the cluster-ingress-operator is upgraded. That fix was first available in 4.6.22.

BZ#1927364 - oc idle: Clusters upgrading with an idled workload do not have annotations on the workload's service
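
To tell quickly whether a cluster is at or past the release that carries that second fix, a version comparison along these lines may help. This is a minimal sketch: the helper name version_at_least is hypothetical, and it relies on GNU sort's -V (version sort) option being available.

```shell
# version_at_least CURRENT MINIMUM (hypothetical helper)
# Succeeds when CURRENT is the same as or newer than MINIMUM under
# GNU sort's version ordering.
version_at_least() {
    current=$1
    minimum=$2
    # Sort both versions; if MINIMUM comes first (or they are equal),
    # then CURRENT is at least MINIMUM.
    lowest=$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

# Possible usage against a live cluster (not run here):
#   cluster_version=$(oc get clusterversion version -o jsonpath='{.status.desired.version}')
#   version_at_least "$cluster_version" 4.6.22 \
#       || echo "cluster is older than 4.6.22; upgrade-time annotation fix not present"
```

With the versions mentioned in comment 10, 4.6.18 would fail this check against 4.6.22.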

Comment 12 Andrew McDermott 2021-04-16 16:13:14 UTC
If you still see this issue in 4.6.22 then please create a new bug.

Comment 13 Red Hat Bugzilla 2023-09-18 00:23:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days