Created attachment 1366639
Snippet from ansible-playbook console output.
Description of problem:
When using the advanced installer for OSE 3.7 I get an error stating that "scc" isn't a resource type.
Version-Release number of selected component (if applicable):
output of "oc version"
features: Basic-Auth GSSAPI Kerberos SPNEGO
How reproducible:
Happens every time I run the playbooks. It also happens when calling the oc command manually on the masters.
Steps to Reproduce:
1. Using the "Red Hat OpenShift Container Platform 3.7 RPMs x86_64" repo on all nodes
2. Run advanced installer with inventory file
1. Using the "Red Hat OpenShift Container Platform 3.7 RPMs x86_64" repo on a master
2. Call "oc get scc"
Actual results:
The playbook returns an error on the task "Gather OpenShift Logging Facts":
"There was an exception trying to run the command '/usr/local/bin/oc get scc privileged --user=system:admin/master-int-ocp-skat-dk:8443 --config=/etc/origin/master/admin.kubeconfig -o json' the server doesn't have a resource type \"scc\""
Running oc directly on a master returns the same error: 'the server doesn't have a resource type "scc"'
Expected results:
oc should behave the same regardless of whether it's called as "oc get scc" or "oc get securitycontextconstraints".
Additional info:
After digging around on my master servers, I found that I can call "oc get securitycontextconstraints" but not "oc get scc", so the oc binary seems to just not have that abbreviation.
Additionally, when typing "oc get" to preview the list of options, securitycontextconstraints isn't on the list, which causes a bit more confusion.
I've attached the playbook console output, but it's more or less the same info.
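The mismatch described above can be checked with a small shell sketch that runs both spellings and reports which one resolves. The `OC` override and the `check_scc_alias` function name are illustrative assumptions, not part of oc; only the two `oc get` invocations come from this report.

```shell
#!/bin/sh
# OC is an illustrative override so the check can point at a specific binary.
OC="${OC:-oc}"

# check_scc_alias: hypothetical helper that reports whether the full resource
# name and the short name both resolve against the current cluster.
check_scc_alias() {
    if "$OC" get securitycontextconstraints >/dev/null 2>&1; then
        full=ok
    else
        full=fail
    fi
    if "$OC" get scc >/dev/null 2>&1; then
        short=ok
    else
        short=fail
    fi
    echo "securitycontextconstraints=$full scc=$short"
}
```

On a master showing the behaviour reported here, calling `check_scc_alias` would be expected to print `securitycontextconstraints=ok scc=fail`.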
This shortcut expander was lost in the 1.7 rebase.
Can you please double-check and report the exact version you're using? I just tried with:
and that seems to be working as expected. Can you please also make sure you're not running a 3.7 client against a 3.6 cluster, which might be causing this problem?
I was using oc locally on one of the masters, so the client and server versions were the same.
I can't replicate the issue anymore as I've upgraded the cluster to 3.7.14 where the scc expander works as intended.
Thanks, Rune, for the information. In that case I'm moving this to QA so they can double-check and close as needed.
Workaround that fixes the issue (bear with me, it appears to deal with something completely different).
1. delete the service catalog API reference
oc delete apiservices.apiregistration.k8s.io/v1beta1.servicecatalog.k8s.io -n kube-service-catalog
2. re-run the service catalog installer
ansible-playbook -i <your inventory> /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.yml
(This may fail, and you may need to delete the kube-service-catalog project too. If the goal is just to get oc get scc working, don't worry about it failing.)
3. delete any of the existing Pods in the service catalog namespace
oc delete pods --all -n kube-service-catalog
4. Check oc get scc - it should now work.
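To verify step 4 together with the state of the API registration removed in step 1, something like the following could be used. It is a sketch under assumptions: the APIService name is the one from step 1, the `Available` status condition is assumed to be populated by the API aggregator in this release, and the function names and the `OC` and `APISERVICE` overrides are hypothetical.

```shell
#!/bin/sh
# OC and APISERVICE are illustrative overrides, not oc features.
OC="${OC:-oc}"
APISERVICE="${APISERVICE:-v1beta1.servicecatalog.k8s.io}"

# Succeeds only if the APIService reports an Available condition of "True".
apiservice_available() {
    status=$("$OC" get apiservices "$APISERVICE" \
        -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' \
        2>/dev/null) || status=""
    [ "$status" = "True" ]
}

# Succeeds if the short name resolves again.
scc_alias_works() {
    "$OC" get scc >/dev/null 2>&1
}
```

For example, `apiservice_available && scc_alias_works && echo "workaround took effect"` prints the message only when both checks pass.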
Ed, am I reading this correctly: is installing the service catalog preventing the scc alias from working? Or does that only happen in combination with the cert error? It might be that the service catalog creates a similar alias, which then confuses oc.
In the environment I was working on (a fresh 3.7.14 deployment), an error in the service catalog deployment caused the problem. Checking the kube-service-catalog project, I noted that the controller pod was failing with a cert error similar to the one reported above.
I found a bug reporting a similar issue here: https://github.com/openshift/origin/issues/17952 with a suggestion that the API registration was incorrect and should be removed and recreated.
After doing this, I discovered that `oc get scc` now worked (in addition to fixing a few other odd issues, such as some projects not deleting).
To answer Maciej's question: the issue was not due to the presence of the service catalog, but rather to a problem with the deployment of the service catalog. Fixing the service catalog fixed the problem.
Having the same issue on a fresh OSE 3.7 installation with "openshift_enable_service_catalog=false" in the inventory file. Is the service catalog mandatory to make it work?
Thanks, Ed, for the thorough explanation.
(In reply to Muhammad Aizuddin Zali from comment #15)
> Having the same issue on a fresh OSE 3.7 installation with
> "openshift_enable_service_catalog=false" in the inventory file. Is the
> service catalog mandatory to make it work?
Ignore this comment. After installing on a new VM, neither the 'false' nor the 'true' setting reproduces the issue mentioned.
(In reply to Muhammad Aizuddin Zali from comment #17)
> (In reply to Muhammad Aizuddin Zali from comment #15)
> > Having the same issue on a fresh OSE 3.7 installation with
> > "openshift_enable_service_catalog=false" in the inventory file. Is the
> > service catalog mandatory to make it work?
> Ignore this comment. After installing on a new VM, neither the 'false' nor
> the 'true' setting reproduces the issue mentioned.
In v3.7 the service catalog is enabled by default.
*** Bug 1564539 has been marked as a duplicate of this bug. ***
After some investigation of the log that Dmitry is pointing to, it looks like the problem is similar to the one described in https://github.com/openshift/origin/issues/17159. I'm currently working on a fix for that.
PR in flight: https://github.com/openshift/origin/pull/19471
Setting status to POST since there's an upstream PR.
*** Bug 1596546 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
If you're hitting the 'error: the server doesn't have a resource type' issue on 3.9.x, I would look at my comment #15 here: https://bugzilla.redhat.com/show_bug.cgi?id=1624493#c15
Even with this fix, a degraded or slow service catalog, or an etcd node that is down, can trigger the same issue.
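When the error reappears, one thing worth checking is whether any registered API service is currently reported as unavailable. The following is a minimal sketch, assuming the `Available` status condition is populated for APIService objects on the cluster in question; the function name and the `OC` override are hypothetical.

```shell
#!/bin/sh
# OC is an illustrative override, not an oc feature.
OC="${OC:-oc}"

# Prints the name of every APIService whose Available condition is not "True"
# (including those that report no Available condition at all).
list_unavailable_apiservices() {
    "$OC" get apiservices \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Available")].status}{"\n"}{end}' |
        awk -F '\t' 'NF && $2 != "True" { print $1 }'
}
```

An empty result only means that no API service is reported as unavailable at that moment. It does not rule out a slow service catalog or an intermittent etcd problem.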