Bug 1577520

Summary: Error message is wrong when disconnected from the internet
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: oc
Assignee: Juan Vallejo <jvallejo>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, ccoleman, jokerman, mmccomas, xxia
Target Milestone: ---
Keywords: Rebase, UpcomingRelease
Target Release: 3.11.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-11 07:19:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Clayton Coleman 2018-05-12 17:23:08 UTC
When disconnected from the server (wifi off):

$ oc get cm/ci-operator-image-registry -n ci
error: the server doesn't have a resource type "cm"

We're incorrectly swallowing an error too early during discovery. This is marked high severity because, from a usability perspective, it's completely the wrong class of error and doesn't point the user anywhere near the real problem (the server is unreachable).

It's potentially deferrable.

Comment 1 Xingxing Xia 2018-05-14 01:51:30 UTC
I cannot reproduce by disconnecting the network:
$ oc version
oc v3.10.0-0.38.0
...
Server https://api.free-int...:443
...

Then, with the network cable unplugged and wifi off, I got the output below.
$ oc get cm/myconfigmap -n xxia-proj
The connection to the server api.free-int...:443 was refused - did you specify the right host or port?

PS: bug 1538488 and bug 1529482 show the same "the server doesn't have a resource type" error. Maybe you hit one of those conditions? :)

Comment 2 Juan Vallejo 2018-05-14 22:19:44 UTC
Can confirm that I receive the same results as comment 1 when disconnected from the internet.

Which version of `oc` are you using?

Comment 3 Juan Vallejo 2018-05-15 22:50:10 UTC
I was finally able to reproduce this locally.
Steps taken:

1. brought up a cluster using `oc cluster up --tag 1.10.0`
2. tested `oc get` successfully with a few resources: `oc get dc`, `oc get deploymentconfigs`, etc.
3. brought cluster down, without disconnecting from the internet: `oc cluster down`
4. tested `oc get` again:

```
$ oc get dc
error: the server doesn't have a resource type "dc"

$ oc get rc
error: the server doesn't have a resource type "rc"

$ oc get po
error: the server doesn't have a resource type "po"
```

The REST mapping error only occurred when getting resources by their alias.
Providing the full resource name returned the correct error:

```
$ oc get deploymentconfigs
The connection to the server 127.0.0.1:8443 was refused - did you specify the right host or port?
```

```
$ oc get pods
The connection to the server 127.0.0.1:8443 was refused - did you specify the right host or port?
```
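One plausible reading of the difference between aliases and full names, sketched in Go. This is a simplified model, not the kubectl code: `fetchAliases`, `staticKinds`, and `get` are hypothetical names, and the assumption is that alias expansion depends on discovery while full names can be resolved from a built-in table before any request is sent.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefused stands in for the transport error returned while the cluster is down.
var errRefused = errors.New("connection refused")

// staticKinds is a hypothetical built-in table of full resource names.
var staticKinds = map[string]bool{"pods": true, "deploymentconfigs": true}

// fetchAliases is a hypothetical discovery call; it fails when the server is unreachable.
func fetchAliases(serverUp bool) (map[string]string, error) {
	if !serverUp {
		return nil, errRefused
	}
	return map[string]string{"po": "pods", "dc": "deploymentconfigs"}, nil
}

// get models the buggy lookup: the discovery error is dropped, so an alias
// is never expanded and the lookup reports an unknown resource type.
func get(name string, serverUp bool) error {
	aliases, _ := fetchAliases(serverUp) // discovery error squashed here
	if full, ok := aliases[name]; ok {
		name = full
	}
	if !staticKinds[name] {
		return fmt.Errorf("the server doesn't have a resource type %q", name)
	}
	if !serverUp {
		return errRefused // the request itself surfaces the real problem
	}
	return nil
}

func main() {
	fmt.Println(get("po", false))   // alias: misleading "resource type" error
	fmt.Println(get("pods", false)) // full name: connection error
}
```

In this model the alias never reaches the request stage, so the only error left to report is the missing resource type.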

The actual error is getting squashed here: https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/kubectl/cmd/util/factory_object_mapping.go#L93

I was able to reproduce this with kubectl 1.9 and 1.10.

This appears to have been fixed upstream in [1]. I will open a PR to pick the fix into Origin.

1. https://github.com/kubernetes/kubernetes/commit/ef0d1ab81927214db80c30d5af491f67546d790b#diff-382db7246643fb238d071e21dfedf623
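For illustration only, a minimal Go sketch of the general remedy: return the discovery error to the caller instead of dropping it. This does not reproduce the upstream commit; `loadShortNames` and `expand` are hypothetical names, and the message wording is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable stands in for the transport error seen when the server is down.
var errUnreachable = errors.New("connection refused")

// loadShortNames is a hypothetical discovery call; it fails when the server is unreachable.
func loadShortNames(serverUp bool) (map[string]string, error) {
	if !serverUp {
		return nil, errUnreachable
	}
	return map[string]string{"dc": "deploymentconfigs", "rc": "replicationcontrollers"}, nil
}

// expand maps a short name to its full name and propagates discovery failures
// so the caller can report the connection problem rather than a bad resource type.
func expand(name string, serverUp bool) (string, error) {
	shortNames, err := loadShortNames(serverUp)
	if err != nil {
		return "", fmt.Errorf("unable to connect to the server: %w", err)
	}
	if full, ok := shortNames[name]; ok {
		return full, nil
	}
	return name, nil // not a known short name; pass it through unchanged
}

func main() {
	_, err := expand("dc", false)
	fmt.Println(err)
	// The original cause stays inspectable through the wrapped error.
	fmt.Println(errors.Is(err, errUnreachable))
}
```

Wrapping with `%w` keeps the underlying cause available to `errors.Is`, so a caller can still distinguish a connection failure from a genuinely unknown resource.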

Comment 4 Juan Vallejo 2018-05-15 23:21:43 UTC
Origin PR: https://github.com/openshift/origin/pull/19728

Comment 5 Xingxing Xia 2018-05-16 03:31:07 UTC
Tried an Ansible-installed environment. When the environment is down or deleted, the same error appears:
$ oc whoami -c # env installed days ago and already pruned today
xxia-proj/ec2-...-compute-1-amazonaws-com:8443/xxia
$ oc get bc
error: the server doesn't have a resource type "bc"

Comment 6 Juan Vallejo 2018-05-23 14:15:48 UTC
*** Bug 1581634 has been marked as a duplicate of this bug. ***

Comment 7 Juan Vallejo 2018-07-12 21:40:05 UTC
I believe the code changes from comment 4 should now be in Origin. Moving to ON_QA.

Comment 9 Xingxing Xia 2018-07-18 09:43:21 UTC
Verified with ./oc version
oc v3.11.0-0.4.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

When the cluster is down or the network cable is unplugged, oc get with an alias now reports "Unable to connect to the server ...". The issue is fixed.

Comment 11 errata-xmlrpc 2018-10-11 07:19:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652