+++ This bug was initially created as a clone of Bug #1567043 +++

Description of problem:
Cannot wake up the resources after idling a service.

Version-Release number of selected component (if applicable):
openshift v3.4.1.44.52
kubernetes v1.4.0+776c994

How reproducible:
always

Steps to Reproduce:
1. Create the rc (pod, svc)
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/list_for_caddy.json

2. Idle the service
# oc idle service-unsecure -n lha
Marked service lha/service-unsecure to unidle resource ReplicationController lha/caddy-rc (unidle to 2 replicas)
Idled ReplicationController lha/caddy-rc (dry run)

Note: tried both "--dry-run=false" and "--dry-run=true", but the output above always shows "(dry run)".

3. Generate some traffic to un-idle the service
# curl 172.30.202.54:27017
curl: (7) Failed connect to 172.30.202.54:27017; No route to host

Actual results:
Cannot wake up the resource, and the iptables rules are not correct after idling (no random port is opened for the idled service).

[root@host-8-242-109 ~]# iptables-save | grep lha
-A KUBE-SERVICES -d 172.30.214.106/32 -p tcp -m comment --comment "lha/service-secure:https cluster IP" -m tcp --dport 27443 -j KUBE-SVC-P6NT6I2XSZW2EWVD
-A KUBE-SERVICES -d 172.30.214.106/32 -p tcp -m comment --comment "lha/service-secure:https has no endpoints" -m tcp --dport 27443 -j REJECT --reject-with icmp-port-unreachable

Expected results:
The resource should wake up when the service receives traffic.

Additional info:
The iptables rules look good before idling, see below.

# oc get svc -n lha
NAME               CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service-secure     172.30.214.106   <none>        27443/TCP   20s
service-unsecure   172.30.202.54    <none>        27017/TCP   20s

[root@host-8-242-109 ~]# curl 172.30.202.54:27017
Hello-OpenShift-1 http-8080

[root@host-8-242-109 ~]# iptables-save | grep lha
-A KUBE-SEP-GFQRI2E3EIJELQBB -s 10.130.0.17/32 -m comment --comment "lha/service-secure:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-GFQRI2E3EIJELQBB -p tcp -m comment --comment "lha/service-secure:https" -m tcp -j DNAT --to-destination 10.130.0.17:8443
-A KUBE-SEP-VFVCTHYVGKJKI5D5 -s 10.130.0.17/32 -m comment --comment "lha/service-unsecure:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-VFVCTHYVGKJKI5D5 -p tcp -m comment --comment "lha/service-unsecure:http" -m tcp -j DNAT --to-destination 10.130.0.17:8080
-A KUBE-SEP-VGWZGBHRIB24XXKZ -s 10.128.0.20/32 -m comment --comment "lha/service-unsecure:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-VGWZGBHRIB24XXKZ -p tcp -m comment --comment "lha/service-unsecure:http" -m tcp -j DNAT --to-destination 10.128.0.20:8080
-A KUBE-SEP-XKULLKEY4RDFTDQL -s 10.128.0.20/32 -m comment --comment "lha/service-secure:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-XKULLKEY4RDFTDQL -p tcp -m comment --comment "lha/service-secure:https" -m tcp -j DNAT --to-destination 10.128.0.20:8443
-A KUBE-SERVICES -d 172.30.202.54/32 -p tcp -m comment --comment "lha/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-SVC-CQEG2R4O4IX66RKH
-A KUBE-SERVICES -d 172.30.214.106/32 -p tcp -m comment --comment "lha/service-secure:https cluster IP" -m tcp --dport 27443 -j KUBE-SVC-P6NT6I2XSZW2EWVD
-A KUBE-SVC-CQEG2R4O4IX66RKH -m comment --comment "lha/service-unsecure:http" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-VGWZGBHRIB24XXKZ
-A KUBE-SVC-CQEG2R4O4IX66RKH -m comment --comment "lha/service-unsecure:http" -j KUBE-SEP-VFVCTHYVGKJKI5D5
-A KUBE-SVC-P6NT6I2XSZW2EWVD -m comment --comment "lha/service-secure:https" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-XKULLKEY4RDFTDQL
-A KUBE-SVC-P6NT6I2XSZW2EWVD -m comment --comment "lha/service-secure:https" -j KUBE-SEP-GFQRI2E3EIJELQBB

--- Additional comment from Ben Bennett on 2018-04-28 02:58:10 CST ---

Idling was tech preview in 3.4. We are tracking a later idling bug with https://bugzilla.redhat.com/show_bug.cgi?id=1562184 and it is probably the same root cause, but we aren't going to backport to 3.4 anyway.

--- Additional comment from hongli on 2018-05-14 11:22:19 CST ---

It looks like the issue is related to the cloud provider being disabled; I cannot reproduce the problem in 3.4.1.44.53 on OpenStack with the cloud provider enabled.

[root@host-8-243-47 ~]# oc get svc
NAME               CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service-secure     172.30.250.212   <none>        27443/TCP   20s
service-unsecure   172.30.114.124   <none>        27017/TCP   20s

[root@host-8-243-47 ~]# oc idle service-unsecure
Marked service lha/service-unsecure to unidle resource ReplicationController lha/caddy-rc (unidle to 2 replicas)
Idled ReplicationController lha/caddy-rc (dry run)

[root@host-8-243-47 ~]# iptables-save | grep lha
-A KUBE-PORTALS-CONTAINER -d 172.30.114.124/32 -p tcp -m comment --comment "lha/service-unsecure:http" -m tcp --dport 27017 -j DNAT --to-destination 172.16.120.79:40540
-A KUBE-PORTALS-HOST -d 172.30.114.124/32 -p tcp -m comment --comment "lha/service-unsecure:http" -m tcp --dport 27017 -j DNAT --to-destination 172.16.120.79:40540
-A KUBE-SERVICES -d 172.30.250.212/32 -p tcp -m comment --comment "lha/service-secure:https cluster IP" -m tcp --dport 27443 -j KUBE-SVC-P6NT6I2XSZW2EWVD
-A KUBE-SERVICES -d 172.30.250.212/32 -p tcp -m comment --comment "lha/service-secure:https has no endpoints" -m tcp --dport 27443 -j REJECT --reject-with icmp-port-unreachable

[root@host-8-243-47 ~]# curl 172.30.114.124:27017
Hello-OpenShift-1 http-8080

[root@host-8-243-47 ~]# oc version
oc v3.4.1.44.53
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://host-8-243-47.host.centralci.eng.rdu2.redhat.com:8443
openshift v3.4.1.44.53
kubernetes v1.4.0+776c994

--- Additional comment from hongli on 2018-05-14 16:07:07 CST ---

After more testing, the reproducing condition is narrowed down to OCP on "OpenStack + Cloudprovider disabled".
The same issue can be reproduced in OCP v3.5.5.31.67.
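
For anyone re-checking this on another cluster, the manual comparison of the two "iptables-save | grep lha" outputs can be scripted. The following is only a rough sketch, not part of any OpenShift tooling: the function name idling_rule_summary and the three result keys are made up for illustration, and it assumes that a correctly idled service shows KUBE-PORTALS-* DNAT rules as in the working output above, while a broken one has no rule mentioning it at all.

```python
import re


def idling_rule_summary(iptables_save_output, service):
    """Summarise which kinds of rules mention `service` in iptables-save output.

    `service` is "namespace/name", for example "lha/service-unsecure".
    The function name and result keys are hypothetical, chosen for this sketch.
    """
    # Rule comments look like --comment "lha/service-unsecure:http ..."
    tag = re.compile(r'--comment "' + re.escape(service) + r'[:"]')
    rules = [line.strip() for line in iptables_save_output.splitlines()
             if tag.search(line)]
    return {
        # DNAT to a random local port, as seen in the working case.
        "portal": any(r.startswith("-A KUBE-PORTALS-") and " -j DNAT " in r
                      for r in rules),
        # "has no endpoints" REJECT rule.
        "reject": any("has no endpoints" in r and " -j REJECT" in r
                      for r in rules),
        # Normal cluster IP jump into a KUBE-SVC-* chain.
        "cluster_ip_jump": any(r.startswith("-A KUBE-SERVICES")
                               and " -j KUBE-SVC-" in r for r in rules),
    }
```

On a node this could be fed with subprocess.check_output(["iptables-save"], text=True) (needs root). If all three values are False for an idled service, that matches the failing case in this bug: no port is open to receive the traffic that should trigger unidling.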
Idling was tech preview in 3.4, 3.5, and 3.6. Closing this since it is fixed in the current releases.