Created attachment 1366639 [details]
Snippet from ansible-playbook console output.

Description of problem:
When using the advanced installer for OSE 3.7, I get an error stating that "scc" isn't a resource type.

Version-Release number of selected component (if applicable):
Output of "oc version":
oc v3.7.9
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
Happens every time I run the playbooks. Also happens when manually calling the oc command on the masters.

Steps to Reproduce:
1. Enable the "Red Hat OpenShift Container Platform 3.7 RPMs x86_64" repo on all nodes
2. Run the advanced installer with an inventory file
OR
1. Enable the "Red Hat OpenShift Container Platform 3.7 RPMs x86_64" repo on a master
2. Run "oc get scc"

Actual results:
The playbook fails on the task "Gather OpenShift Logging Facts" with:
"There was an exception trying to run the command '/usr/local/bin/oc get scc privileged --user=system:admin/master-int-ocp-skat-dk:8443 --config=/etc/origin/master/admin.kubeconfig -o json' the server doesn't have a resource type \"scc\""
Running oc directly on a master returns the same error: 'the server doesn't have a resource type "scc"'

Expected results:
oc should behave the same regardless of whether it's called as "oc get scc" or "oc get securitycontextconstraints".

Additional info:
After digging around on my master servers, I found that I can call "oc get securitycontextconstraints" but not "oc get scc", so the oc binary seems to be missing that abbreviation. Additionally, when typing "oc get" to preview the list of options, securitycontextconstraints isn't on the list, which causes a bit more confusion. I've attached the playbook console output, but it contains more or less the same information.
This shortcut expander was lost in the 1.7 rebase:
https://github.com/openshift/origin/pull/15234
https://github.com/openshift/origin/pull/15234/commits/75bcc760f259d1c2a1e1ef7efdbeff9174fe6345
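For context, the shortcut expander is a client-side lookup that maps short names like "scc" to full resource names before the request is sent; if the table entry is missing, the short name reaches the API server verbatim and is rejected. A minimal sketch of the idea (the mapping below is illustrative only and is not the actual origin code):

```shell
#!/bin/sh
# Hypothetical illustration of client-side short-name expansion.
# The real table lives in the origin/kubectl source and covers many
# more resources than the three examples shown here.
expand_shortname() {
  case "$1" in
    scc) echo "securitycontextconstraints" ;;
    dc)  echo "deploymentconfigs" ;;
    bc)  echo "buildconfigs" ;;
    *)   echo "$1" ;;   # unknown names pass through to the server unchanged
  esac
}

# With the "scc" entry present, the short name resolves before the API call:
expand_shortname scc   # prints: securitycontextconstraints
```

If the "scc" entry is dropped (as in the 1.7 rebase), the fallthrough branch applies and the server sees the literal string "scc", producing the 'the server doesn't have a resource type "scc"' error from the report above.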
Can you please double-check and report the exact version you're using? I just tried with:
oc v3.7.9-1+7c71a2d
kubernetes v1.7.6+a08f5eeb62
and that seems to work as expected. Can you please also ensure you're not running a 3.7 client against a 3.6 cluster, which might be causing this problem?
I was using oc locally on one of the masters, so the client and server versions were the same. I can't replicate the issue anymore, as I've upgraded the cluster to 3.7.14, where the scc expander works as intended.
Thanks, Rune, for the information. In that case I'm moving this to QA so they can double-check and close as needed.
Workaround that fixes the issue (bear with me, it does appear to deal with something completely different):

1. Delete the service catalog API service registration:
   oc delete apiservices.apiregistration.k8s.io/v1beta1.servicecatalog.k8s.io -n kube-service-catalog
2. Re-run the service catalog installer:
   ansible-playbook -i <your inventory> /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.yml
   (This may fail, and you may also need to delete the kube-service-catalog project, but for the purpose of getting "oc get scc" working, don't worry about it failing.)
3. Delete any existing pods in the service catalog namespace:
   oc delete pods --all -n kube-service-catalog
4. Check "oc get scc" - it should now work.
Ed, am I reading this correctly: does installing the service catalog prevent the scc alias from working? Or does that only happen in combination with the cert error? It might be that the service catalog creates a similar alias, which then confuses oc.
In the environment I was working on (a fresh 3.7.14 deployment), an error in the service catalog deployment caused the problem. Checking the kube-service-catalog project, I noted that the controller pod was failing with a cert error similar to the one reported above. I found a bug report describing a similar issue here: https://github.com/openshift/origin/issues/17952, with a suggestion that the API registration was incorrect and should be removed and recreated. After doing this, I discovered that `oc get scc` worked again (in addition to a few other odd issues being fixed, such as some projects not deleting). To answer Maciej's question: the issue was not due to the presence of the service catalog, but rather to a problem with the deployment of the service catalog. Fixing the service catalog fixed the problem.
Having the same issue on a fresh OSE 3.7 installation with "openshift_enable_service_catalog=false" in the inventory file. Is the service catalog mandatory to make this work?
Thanks, Ed, for the thorough explanation.
(In reply to Muhammad Aizuddin Zali from comment #15)
> Having same issue on fresh OSE 3.7 installation with
> "openshift_enable_service_catalog=false" in inventory file. Does service
> catalog is mandatory to make it works?

Ignore this comment; somehow, after installing on a new VM, neither the 'false' nor the 'true' flag reproduces the issue mentioned.
(In reply to Muhammad Aizuddin Zali from comment #17)
> (In reply to Muhammad Aizuddin Zali from comment #15)
> > Having same issue on fresh OSE 3.7 installation with
> > "openshift_enable_service_catalog=false" in inventory file. Does service
> > catalog is mandatory to make it works?
>
> Ignore this comment, somehow after installed on new VM both 'false' and
> 'true' flag does not produce the issue mentioned.

In v3.7 the service catalog is enabled by default.
*** Bug 1564539 has been marked as a duplicate of this bug. ***
After some investigation of the log that Dmitry pointed to, it looks like the problem is similar to the one described in https://github.com/openshift/origin/issues/17159. I'm currently working on a fix for that.
PR in flight: https://github.com/openshift/origin/pull/19471
Setting status to POST since there's an upstream PR.
Verified with: openshift v3.10.0-0.50.0
*** Bug 1596546 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816
If you're hitting the 'error: the server doesn't have a resource type' issue on 3.9.x, see my comment #15 here: https://bugzilla.redhat.com/show_bug.cgi?id=1624493#c15. Even with this fix, a degraded or slow service catalog, or a down etcd node, can trigger the same issue.