Bug 1646986
Summary: | Tempest test test_pod_vm_ping is stuck | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Itzik Brown <itbrown> |
Component: | python-kuryr-tests-tempest | Assignee: | Yossi Boaron <yboaron> |
Status: | CLOSED ERRATA | QA Contact: | Itzik Brown <itbrown> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 14.0 (Rocky) | CC: | asegurap, gcheresh, jschluet, ltomasbo, tsedovic, yboaron |
Target Milestone: | beta | Keywords: | Triaged |
Target Release: | 15.0 (Stein) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | python-kuryr-tests-tempest-0.4.1-0.20190401185124.0d51e99.el8ost | Doc Type: | If docs needed, set a value |
Last Closed: | 2019-09-21 11:19:23 UTC | Type: | Bug |
Description
Itzik Brown
2018-11-06 13:14:18 UTC
There is a related upstream bug at https://github.com/kubernetes/kubernetes/issues/67457. One of its comments mentions a workaround: adding _request_timeout=10 to the dict in exec_command_in_pod:

kwargs = dict(command=command, stdin=False, stdout=True, tty=False, stderr=stderr, _request_timeout=10)

With this workaround the test passes only about 30% of the time.

In Kuryr upstream we use Kubernetes v1.9.1 (OpenShift v3.9.0) with the same K8S Python client version (8.0.0), and everything works fine. In OCP 3.11 (K8S 1.11.0), however, we hit this issue. With the interactive shell approach [1], things seem to work. As Itzik mentioned, other people have also reported this issue [2].

Our next steps:
1. Rewrite the relevant tempest test to use the interactive approach, then recheck.
2. Ask for assistance or more information in the kubernetes-client Slack channel.

[1] https://github.com/kubernetes-client/python/blob/3459c173cddc9252f7eb803da9e86aaae08ee653/examples/exec.py#L56
[2] https://github.com/kubernetes/kubernetes/issues/67457

Sometimes the 'connect_get_namespaced_pod_exec' call hangs for some reason (in the OS select call) even though the command has completed. Setting the '_request_timeout' parameter seems to solve the problem for the pod2pod test. Tested with the patch that adds support for request_timeout [1]: the pod2pod test ran 50 times and all runs passed.

[1] https://review.openstack.org/#/c/618635/1

The patch mentioned above is not in any OSP build. RDO Rocky Trunk [1] is also pinned to a commit that predates this patch.

[1] https://github.com/redhat-openstack/rdoinfo/blob/master/rdo.yml#L7025

The fix will be in OSP15 only.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811
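For readers who want to see the `_request_timeout` workaround from the comments in context, here is a minimal sketch. It is not the code from the Kuryr tempest plugin or from the referenced patch: the helper `build_exec_kwargs`, the `stream_func` parameter, and the function signature are assumptions made for illustration. Only the kwargs dict, `connect_get_namespaced_pod_exec`, and the `kubernetes.stream.stream` helper come from the report or from the Kubernetes Python client.

```python
def build_exec_kwargs(command, stderr=True, request_timeout=10):
    """Build the exec kwargs quoted in the report (hypothetical helper).

    _request_timeout bounds how long the client waits on the exec
    websocket, so a call that would otherwise hang returns or raises.
    """
    return dict(command=command, stdin=False, stdout=True, tty=False,
                stderr=stderr, _request_timeout=request_timeout)


def exec_command_in_pod(api, pod_name, namespace, command,
                        request_timeout=10, stream_func=None):
    """Run `command` in a pod, with a timeout on the exec call.

    `api` is expected to be a kubernetes.client.CoreV1Api instance.
    `stream_func` is an assumption added here so the function can be
    exercised without a cluster; by default the real helper is used.
    """
    if stream_func is None:
        # Imported lazily so the sketch loads without the client installed.
        from kubernetes.stream import stream as stream_func
    kwargs = build_exec_kwargs(command, request_timeout=request_timeout)
    return stream_func(api.connect_get_namespaced_pod_exec,
                       pod_name, namespace, **kwargs)
```

A call might look like `exec_command_in_pod(api, "pod-a", "default", ["ping", "-c", "1", "10.0.0.5"])`, where the pod name, namespace, and address are placeholders. A timeout only bounds the wait and does not explain why the call hangs, which may be why the thread also considers the interactive approach.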