Bug 1771358 - Unable to provision the cinder volume in OCP 4.2 on OSP 13
Summary: Unable to provision the cinder volume in OCP 4.2 on OSP 13
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.2.z
Assignee: Mike Fedosin
QA Contact: David Sanz
URL:
Whiteboard:
Depends On: 1772002
Blocks:
 
Reported: 2019-11-12 09:07 UTC by Sudarshan Chaudhari
Modified: 2023-09-07 20:59 UTC
CC: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1772002
Environment:
Last Closed: 2020-01-07 17:55:10 UTC
Target Upstream Version:
Embargoed:


Links:
- GitHub: openshift/installer pull 2660 (closed) - [release-4.2] Bug 1771358: manually prepare cloud provider's config (last updated 2021-02-17 18:30:52 UTC)
- Red Hat Product Errata: RHBA-2020:0014 (last updated 2020-01-07 17:55:23 UTC)

Description Sudarshan Chaudhari 2019-11-12 09:07:13 UTC
Description of problem:

Unable to provision a dynamic volume from Cinder.
No errors appear in the PVC events, while we observe the errors shown in [1] in the kube-controller-manager logs.

- This is an IPI installation.
- Verified the clouds.yaml file and everything looks correct.
- No event is captured on the OpenStack side to confirm whether the request reaches the OpenStack server.
- Attaching the clouds.yaml and must-gather information.

Version-Release number of selected component (if applicable):
4.2

Actual results:
The Cinder volume provisioning fails with error [1].

Expected results:
The volume should be provisioned successfully.

Master Log:

[1] 
~~~
2019-11-10T13:19:13.642273749Z E1110 13:19:13.642213       1 openstack.go:295] Cannot initialize cloud provider using data from the secret.
2019-11-10T13:19:13.642273749Z E1110 13:19:13.642250       1 openstack_instances.go:69] cannot initialize cloud provider, only limited functionality is available : cloud provider is not initialized
2019-11-10T13:19:13.990799329Z E1110 13:19:13.990710       1 openstack.go:295] Cannot initialize cloud provider using data from the secret.
2019-11-10T13:19:13.990799329Z E1110 13:19:13.990755       1 core.go:77] Failed to start service controller: the cloud provider does not support external load balancers
2019-11-10T13:19:14.333330569Z E1110 13:19:14.333298       1 goroutinemap.go:150] Operation for "provision-openshift-logging/test[f746c14b-03b9-11ea-9868-fa163e7e4b6e]" failed. No retries permitted until 2019-11-10 13:19:14.833242519 +0000 UTC m=+54.363015165 (durationBeforeRetry 500ms). Error: "recovered from panic \"runtime error: invalid memory address or nil pointer dereference\". (err=<nil>) Call stack:\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:158\n/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:522\n/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:82\n/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/signal_unix.go:390\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/gophercloud/gophercloud/openstack/client.go:300\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/gophercloud/gophercloud/openstack/client.go:344\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/cloudprovider/providers/openstack/openstack_client.go:72\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/cloudprovider/providers/openstack/openstack.go:919\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/cloudprovider/providers/openstack/openstack_volumes.go:459\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/volume/cinder/cinder_util.go:218\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/volume/cinder/cinder.go:561\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/volume/persistentvolume/pv_controller.go:1483\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/volume/persistentvolume/pv_controller.go:1346\n/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/goroutinemap/goroutinemap.go:115\n/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/asm_amd64.s:1337\n"
~~~

StorageClass Dump (if StorageClass used by PV/PVC):
~~~
cat ./cluster-scoped-resources/storage.k8s.io/storageclasses/gp2.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: 2019-11-10T12:41:34Z
  name: gp2
  resourceVersion: "468334"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/gp2
  uid: 6db49b6b-03b7-11ea-9868-fa163e7e4b6e
parameters:
  availability: nova
  type: standarad
provisioner: kubernetes.io/cinder
reclaimPolicy: Delete
volumeBindingMode: Immediate

~~~
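For reference, dynamic provisioning against this class is triggered by a PVC such as the one below. The name and namespace match the claim referenced in the controller log above (openshift-logging/test); the requested size is illustrative only:
~~~
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test                   # claim name taken from the log above
  namespace: openshift-logging
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi             # illustrative size
  storageClassName: gp2        # the StorageClass dumped above
~~~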


Additional info:
Similar Links:
https://github.com/openshift/installer/issues/2151
https://github.com/openshift/cluster-kube-controller-manager-operator/pull/275
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/cloudprovider/providers/openstack/openstack_instances.go#L69
https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/cloudprovider/providers/openstack/openstack_instances.go

Comment 3 Ryan Howe 2019-11-12 17:19:18 UTC
Taking a guess, it seems as though the cloud credentials are not set correctly:

   openstack.go:295] Cannot initialize cloud provider using data from the secret.


The cloud config can be seen here:
# oc get cm -n openshift-config cloud-provider-config -o yaml
It is copied here for the API server to use:
# oc get cm -n openshift-kube-apiserver cloud-config -o yaml


By default, the OpenStack credentials are set via a secret:

    secret-name = openstack-credentials
    secret-namespace = kube-system


# oc get secret -n  kube-system  openstack-credentials --template='{{index .data "clouds.yaml"}}' | base64 -d
# oc get secret -n  kube-system  openstack-credentials --template='{{index .data "clouds.conf"}}' | base64 -d

If those are not correct, you will likely need to update them with the correct credentials and auth data so that API calls to OpenStack succeed.
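For comparison, the decoded clouds.conf from that secret is an INI file parsed by the legacy OpenStack cloud provider; it typically contains something like the following (all values here are placeholders, not taken from this cluster):
~~~
# oc get secret -n kube-system openstack-credentials --template='{{index .data "clouds.conf"}}' | base64 -d
[Global]
auth-url    = https://openstack.example.com:13000/v3
username    = <service-user>
password    = <password>
tenant-id   = <project-id>
domain-name = Default
region      = regionOne
~~~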

Comment 4 Mike Fedosin 2019-11-12 18:01:08 UTC
In the StorageClass definition I see a typo in the "type" parameter:
parameters:
  availability: nova
  type: standarad

I think it should be standarad -> standard.
Maybe that's not the cause of the bug, but you probably want to fix it anyway (see the sketch below).
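
StorageClass parameters cannot be edited in place, so fixing the typo means deleting and recreating the class. A possible sequence, assuming "standard" is an existing Cinder volume type on this cloud (check with openstack volume type list):
~~~
# openstack volume type list      # confirm the volume type name Cinder expects
# oc delete storageclass gp2
# cat <<EOF | oc create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
parameters:
  availability: nova
  type: standard                  # was "standarad"
provisioner: kubernetes.io/cinder
reclaimPolicy: Delete
volumeBindingMode: Immediate
EOF
~~~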


As Ryan Howe said, this error happens when kube-controller-manager tries to authenticate to OpenStack using the auth data from the secret. It looks like it happens here: https://github.com/openshift/origin/blob/release-4.2/vendor/k8s.io/kubernetes/pkg/cloudprovider/providers/openstack/openstack.go#L420
Please compare the data in clouds.conf with clouds.yaml; there are probably some differences:
# oc get secret -n  kube-system  openstack-credentials --template='{{index .data "clouds.yaml"}}' | base64 -d
# oc get secret -n  kube-system  openstack-credentials --template='{{index .data "clouds.conf"}}' | base64 -d


Also, some thoughts: right now it's not clear what exactly causes the failure, because the original error is hidden behind the message at https://github.com/openshift/origin/blob/release-4.2/vendor/k8s.io/kubernetes/pkg/cloudprovider/providers/openstack/openstack.go#L296
I'm going to create a BZ for this and propose a PR that wraps the original error.
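
For illustration only, the kind of change meant here is to include the underlying error in the log message instead of discarding it, so the root cause is visible. A minimal, self-contained Go sketch (not the actual patch; the helper name is hypothetical):
~~~
package main

import (
	"errors"
	"fmt"
)

// configFromSecret is a hypothetical stand-in for the code that parses the
// cloud config out of the kube-system/openstack-credentials secret.
func configFromSecret() error {
	return errors.New("clouds.conf key not found in secret")
}

func main() {
	if err := configFromSecret(); err != nil {
		// Wrap the original error so the generic message also shows the cause.
		fmt.Printf("Cannot initialize cloud provider using data from the secret: %v\n", err)
	}
}
~~~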

Comment 5 Mike Fedosin 2019-11-12 19:01:19 UTC
As promised, I've proposed a patch that gives a clearer view of the initial error: https://github.com/openshift/origin/pull/24136, along with its 4.2 version: https://github.com/openshift/origin/pull/24137

Comment 11 David Sanz 2019-12-02 15:15:30 UTC
$ oc get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
myclaim   Bound    pvc-3a4d4c56-1516-11ea-996d-fa163eb648be   8Gi        RWO            standard       47s

$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
pvc-3a4d4c56-1516-11ea-996d-fa163eb648be   8Gi        RWO            Delete           Bound    default/myclaim   standard                46s
$ openstack volume list | grep -i morenod-ocp
| 29f2ad31-4e11-4c53-b251-c71f02c55031 | morenod-ocp-79sdx-dynamic-pvc-3a4d4c56-1516-11ea-996d-fa163eb648be     | in-use    |    8 | Attached to morenod-ocp-79sdx-worker-fx4cb on /dev/vdb        |

Comment 14 David Sanz 2019-12-19 16:12:34 UTC
Verified on 4.2.0-0.nightly-2019-12-15-230238

(overcloud) [stack@undercloud-0 4.2]$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
pvc-b3556846-2279-11ea-9e5c-fa163e5e17e3   3Gi        RWO            Delete           Bound    default/task-pv-claim   standard                55s
(overcloud) [stack@undercloud-0 4.2]$ oc get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
task-pv-claim   Bound    pvc-b3556846-2279-11ea-9e5c-fa163e5e17e3   3Gi        RWO            standard       4m13s

Comment 16 errata-xmlrpc 2020-01-07 17:55:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0014

