Description of problem:
Unidling is not working with OVN.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-12-19-105827

How reproducible:
Always

Steps to Reproduce:
1. Create test pods/svc
   # oc new-project test
   # oc create -f https://raw.githubusercontent.com/anuragthehatter/v3-testfiles/master/networking/list_for_pods.json
   # oc create -f https://raw.githubusercontent.com/anuragthehatter/v3-testfiles/master/networking/pod-for-ping.json
2. Make sure pods and services are running
   # oc get pods
3. Check the service
   # oc get svc
   NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
   test-service   ClusterIP   172.30.33.132   <none>        27017/TCP   165m
4. Idle the test service
   # oc idle test-service
5. Check the pods again to make sure they are terminating due to step 4
   # oc get pods
6. From the test pod "hello-pod", curl the service IP
   # oc rsh hello-pod
   / # curl $SERVICE:$SERVICE_PORT
   Hello OpenShift!
   / # exit
7. Check the pods again and make sure they are being recreated due to unidling of the svc
   # oc get pods

Actual results:
curl: (7) Failed to connect to x.x.x.x port 27017: Operation timed out
/ # command terminated with exit code 7
No test-rc pods are running.

Expected results:
curl outputs "Hello OpenShift!"
Two test-rc pods are running.

Additional info:
Moving this to our active development branch (4.4.0) for Target. Fixes, if any, which require backporting to earlier releases will result in cloned BZs for those releases.
I'm inclined to kick this out to 4.4 - I doubt we'll get to this in time.
reproduced on 4.4.0-0.nightly-2020-02-04-122053
*** Bug 1801780 has been marked as a duplicate of this bug. ***
Hey Ross,

Can you retest this and see if it still happens with latest 4.5? I just tried in my setup and I see that k8s gets the right event when traffic hits the idled service:

[trozet@trozet hack]$ kubectl get events -A
NAMESPACE   LAST SEEN   TYPE     REASON     OBJECT                 MESSAGE
default     1s          Normal   NeedPods   service/test-service   The service test-service needs pods

Thanks.
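The event check above could be scripted rather than read by eye. This is a minimal sketch, assuming the column layout shown in the `kubectl get events -A` output above (REASON in the fourth field, OBJECT in the fifth); `needs_pods_event` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# needs_pods_event SERVICE
# Reads `kubectl get events -A` output on stdin and exits 0 if a NeedPods
# event for service/SERVICE is present, non-zero otherwise.
# Assumption: REASON is field 4 and OBJECT is field 5, as in the output above.
needs_pods_event() {
    svc=$1
    awk -v obj="service/$svc" '
        $4 == "NeedPods" && $5 == obj { found = 1 }
        END { exit !found }
    '
}

# Hypothetical usage against a live cluster:
#   kubectl get events -A | needs_pods_event test-service && echo "unidle was triggered"
```

If the event is present but the pods never come back, the problem is likely after the event is emitted rather than in the traffic detection itself.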
Unable to reproduce on 4.5.0-0.nightly-2020-04-14-084836 when using a timeout of ~10 seconds after connecting to the idled service. From testing, it looks like it takes at least 7 seconds to unidle a service. Updated the automation to use a 30-second timeout after triggering the unidle; hopefully 30 seconds is sufficient.
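For reference, the timeout handling described above can be sketched as a retry loop instead of a single long wait. This is not the actual automation code; `retry_until` is a hypothetical helper, and the 30-second budget and 2-second interval in the usage comment are assumptions based on the timings mentioned in this comment.

```shell
#!/bin/sh
# retry_until TIMEOUT INTERVAL CMD...
# Reruns CMD until it succeeds or roughly TIMEOUT seconds of sleep have
# accumulated. Time spent inside CMD itself is not counted, so the real
# wall-clock limit is somewhat longer than TIMEOUT.
retry_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        elapsed=$((elapsed + interval))
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1    # budget exhausted, CMD never succeeded
        fi
        sleep "$interval"
    done
    return 0
}

# Hypothetical usage: retry the curl from hello-pod for up to ~30 seconds.
# SERVICE_IP is a placeholder for the ClusterIP reported by `oc get svc`.
#   retry_until 30 2 oc rsh hello-pod curl -s --max-time 5 "$SERVICE_IP:27017"
```

Retrying passes as soon as the service answers, so a longer budget does not slow the test down when unidling is fast.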
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409