Description of problem:
The following test gets stuck when run on OSP14 with OCP 3.11:
kuryr_tempest_plugin.tests.scenario.test_cross_ping.TestCrossPingScenario.test_pod_vm_ping
It seems to be stuck in exec_command_in_pod.

Version-Release number of selected component (if applicable):
OSP14
openshift v3.11.39
kubernetes v1.11.0+d4cacc0
There is a bug reported at https://github.com/kubernetes/kubernetes/issues/67457. One of the comments mentions a workaround: adding _request_timeout=10 to the kwargs dict in exec_command_in_pod:

kwargs = dict(command=command, stdin=False, stdout=True, tty=False, stderr=stderr, _request_timeout=10)

With this workaround it works ~30% of the time.
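The workaround above can be sketched roughly as follows. This is a minimal illustration, not the actual kuryr-tempest-plugin code: the helper names, the signature, and the injected exec_func/api_method parameters are assumptions made so the snippet runs without a cluster. With the real client, exec_func would be kubernetes.stream.stream and api_method would be CoreV1Api().connect_get_namespaced_pod_exec.

```python
def build_exec_kwargs(command, stderr=True, request_timeout=10):
    """Build the keyword arguments for the pod exec call.

    _request_timeout is the workaround from the upstream issue: it bounds
    how long the client waits instead of blocking indefinitely.
    """
    return dict(command=command, stdin=False, stdout=True, tty=False,
                stderr=stderr, _request_timeout=request_timeout)


def exec_command_in_pod(exec_func, api_method, pod_name, namespace, command,
                        stderr=True, request_timeout=10):
    """Run `command` in a pod through an injected exec function.

    Hypothetical signature: exec_func and api_method are passed in so the
    sketch can be exercised with stand-ins.
    """
    kwargs = build_exec_kwargs(command, stderr=stderr,
                               request_timeout=request_timeout)
    return exec_func(api_method, pod_name, namespace, **kwargs)


# Assumed wiring against a real cluster (not executed here):
#   from kubernetes import client, config
#   from kubernetes.stream import stream
#   config.load_kube_config()
#   api = client.CoreV1Api()
#   exec_command_in_pod(stream, api.connect_get_namespaced_pod_exec,
#                       "my-pod", "default", ["ping", "-c", "1", "10.0.0.5"])
```

The timeout value of 10 seconds is the one quoted in the comment; whether it is long enough for a given command is something to check per test.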
In Kuryr upstream we use kubernetes v1.9.1 (openshift v3.9.0) with the same K8S python client version (8.0.0), and everything works fine, while in OCP 3.11 (K8S 1.11.0) we are hitting this issue. With the interactive shell approach [1], things seem to work OK. As Itzik mentioned, other people have also reported this issue [2].

Our next steps:
1. Rewrite the relevant tempest test to use the interactive approach, and recheck.
2. Ask for assistance/more info in the kubernetes-client slack channel.

[1] https://github.com/kubernetes-client/python/blob/3459c173cddc9252f7eb803da9e86aaae08ee653/examples/exec.py#L56
[2] https://github.com/kubernetes/kubernetes/issues/67457
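For reference, the interactive approach from [1] opens the exec stream with _preload_content=False and then polls it, rather than waiting for one blocking call to return. The sketch below illustrates that polling loop under some assumptions: the function name and the idle_timeout parameter are hypothetical, and resp is expected to behave like the stream object returned by kubernetes.stream.stream (is_open, update, peek_stdout, read_stdout, peek_stderr, read_stderr, write_stdin, close).

```python
import time


def run_in_interactive_shell(resp, commands, idle_timeout=5.0):
    """Send commands to an open exec stream and collect its output.

    Stops when the stream closes or when no output has arrived for
    idle_timeout seconds, so a stalled stream cannot block forever.
    """
    out, err = [], []
    for cmd in commands:
        resp.write_stdin(cmd + "\n")

    last_activity = time.monotonic()
    while resp.is_open() and time.monotonic() - last_activity < idle_timeout:
        resp.update(timeout=1)  # poll the socket for up to 1 second
        if resp.peek_stdout():
            out.append(resp.read_stdout())
            last_activity = time.monotonic()
        if resp.peek_stderr():
            err.append(resp.read_stderr())
            last_activity = time.monotonic()

    resp.close()
    return "".join(out), "".join(err)


# Assumed wiring against a real cluster (not executed here):
#   from kubernetes import client, config
#   from kubernetes.stream import stream
#   config.load_kube_config()
#   api = client.CoreV1Api()
#   resp = stream(api.connect_get_namespaced_pod_exec, "my-pod", "default",
#                 command=["/bin/sh"], stderr=True, stdin=True, stdout=True,
#                 tty=False, _preload_content=False)
#   stdout, stderr = run_in_interactive_shell(resp, ["ping -c 1 10.0.0.5"])
```

The idle timeout is a design choice of this sketch, not something taken from the upstream example; it trades a short wait after the last output for a guarantee that the loop terminates.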
Sometimes the 'connect_get_namespaced_pod_exec' call hangs for some reason (in the OS select call) even though the command has completed. Setting the '_request_timeout' parameter seems to have solved the problem for the pod2pod test.
Tested with the patch that adds support for request_timeout [1]: ran the pod2pod test 50 times and all runs passed.

[1] https://review.openstack.org/#/c/618635/1
The above-mentioned patch is not in any OSP build. RDO Rocky Trunk [1] is also pinned to a commit from before this patch.

[1] https://github.com/redhat-openstack/rdoinfo/blob/master/rdo.yml#L7025
The fix will be in OSP15 only.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811