Description of problem:

Upgrading from 3.9 to 3.10. When the upgrade gets to the "Upgrade all storage" task, oc commands either stop working or take a very long time to time out. It looks like this:

oc get pods -n default
E0709 18:59:34.877759 102108 round_trippers.go:169] CancelRequest not implemented
E0709 19:00:38.886405 102108 round_trippers.go:169] CancelRequest not implemented
E0709 19:01:10.889660 102108 round_trippers.go:169] CancelRequest not implemented
E0709 19:01:42.893057 102108 round_trippers.go:169] CancelRequest not implemented
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-dkhpr    1/1       Running   0          47m
registry-console-1-6jglj   1/1       Running   1          46m
router-1-g4d5h             1/1       Running   0          47m

Version-Release number of selected component (if applicable):
3.10.15

How reproducible:
Always during the storage upgrade phase of the upgrade

Steps to Reproduce:
1. Create a small 3.9 cluster and verify that oc commands work
2. Run the upgrade.yml playbook
3. When the upgrade hits "Upgrade all storage", try to run an oc get command

Actual results:
See above

Expected results:
A more human-consumable error message.

Additional info:
Created attachment 1457853: loglevel=8 output
Does the oc binary still give the "CancelRequest not implemented" error if used against a different cluster than the one that was just upgraded? Also, can you confirm that the version of `oc` that you're getting this error on is 3.10?

Based on your attachment, the error message appears to be originating from the UserAgent round-tripper's CancelRequest method [2]. It could be that, due to altered config during the upgrade process, the round-tripper being used here [3] does not implement a CancelRequest method, causing the error message seen (and the delay in executing commands).

Adding David in case he can provide more information as well.

[1] https://github.com/openshift/openshift-ansible/blob/release-3.10/playbooks/openshift-master/private/upgrade.yml#L69
[2] https://github.com/openshift/origin/blob/release-3.10/vendor/k8s.io/kubernetes/staging/src/k8s.io/client-go/transport/round_trippers.go#L169
[3] https://github.com/openshift/origin/blob/release-3.10/vendor/k8s.io/kubernetes/staging/src/k8s.io/client-go/transport/round_trippers.go#L37
1. Using the client against a different cluster is successful. To be clear, the cluster where the error occurs is in the middle of an upgrade; it has not yet been fully upgraded.

2. At the time the error occurs, the client is 3.10 and the API servers are still 3.9:

root@ip-172-31-20-191: ~ # oc get pods
E0717 20:02:00.400949 8386 round_trippers.go:169] CancelRequest not implemented
^C
root@ip-172-31-20-191: ~ # oc version
oc v3.10.18
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-20-191.us-west-2.compute.internal:8443
openshift v3.9.33
kubernetes v1.9.1+a0ce1bc657
Mike, could you provide --loglevel 8 output? This is most likely happening because a wrapped round tripper does not implement the CancelRequest method.
Created attachment 1472760: oc output with loglevel=8

oc v3.10.27
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-61-1.us-west-2.compute.internal:8443
openshift v3.9.40
kubernetes v1.9.1+a0ce1bc657
Created attachment 1473652: patched oc log
Origin PR: https://github.com/openshift/origin/pull/20554
Moving to MODIFIED until a build is ready for QE
Mike, FYI, the PR is merged in new OCP puddles >= v3.11.0-0.12.0, thanks.
Verified on 3.11.0-0.24.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2652