Bug 1883421 - csi-snapshot-controller pod reports invalid memory address or nil pointer dereference
Summary: csi-snapshot-controller pod reports invalid memory address or nil pointer dereference
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Jan Safranek
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-29 07:11 UTC by Qin Ping
Modified: 2020-10-27 16:46 UTC
CC: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:45:59 UTC
Target Upstream Version:
Embargoed:




Links:
- GitHub openshift/csi-external-snapshotter pull 30 (closed): Bug 1883421: UPSTREAM: 381: Fix panic when source PVC does not exist (last updated 2021-02-13 22:37:14 UTC)
- Red Hat Product Errata RHBA-2020:4196 (last updated 2020-10-27 16:46:14 UTC)

Description Qin Ping 2020-09-29 07:11:54 UTC
Description of Problem:
csi-snapshot-controller pod reports invalid memory address or nil pointer dereference

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-27-075304

How reproducible:
Always


Steps to Reproduce:
1. Deployed the hostpath CSI driver on the OCP cluster (https://github.com/kubernetes-csi/csi-driver-host-path)
$ oc get pod
NAME                         READY   STATUS    RESTARTS   AGE
csi-hostpath-attacher-0      1/1     Running   0          23m
csi-hostpath-provisioner-0   1/1     Running   0          23m
csi-hostpath-resizer-0       1/1     Running   0          23m
csi-hostpath-snapshotter-0   1/1     Running   0          16m
csi-hostpath-socat-0         1/1     Running   0          23m
csi-hostpathplugin-0         3/3     Running   0          23m
$ oc get pod csi-hostpath-snapshotter-0 -oyaml|grep "image:"
            f:image: {}
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fdbe52f0ba8313126ecb356569a149a63b44b010a358263a7f7d48066249b332
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:fdbe52f0ba8313126ecb356569a149a63b44b010a358263a7f7d48066249b332

2. Created storage class for hostpath csi driver
$ oc get sc csi-hostpath-sc -oyaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2020-09-29T06:36:31Z"
  managedFields:
  - apiVersion: storage.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:allowVolumeExpansion: {}
      f:provisioner: {}
      f:reclaimPolicy: {}
      f:volumeBindingMode: {}
    manager: kubectl-create
    operation: Update
    time: "2020-09-29T06:36:31Z"
  name: csi-hostpath-sc
  resourceVersion: "1174947"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/csi-hostpath-sc
  uid: ec4bcad4-e06e-4ca1-91f0-cde1873c1568
provisioner: hostpath.csi.k8s.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

3. Created volumesnapshotclass
$ oc get volumesnapshotclass csi-hostpath-snapclass -oyaml
apiVersion: snapshot.storage.k8s.io/v1beta1
deletionPolicy: Delete
driver: hostpath.csi.k8s.io
kind: VolumeSnapshotClass
metadata:
  creationTimestamp: "2020-09-29T06:20:21Z"
  generation: 1
  managedFields:
  - apiVersion: snapshot.storage.k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:deletionPolicy: {}
      f:driver: {}
    manager: kubectl-create
    operation: Update
    time: "2020-09-29T06:20:21Z"
  name: csi-hostpath-snapclass
  resourceVersion: "1161370"
  selfLink: /apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotclasses/csi-hostpath-snapclass
  uid: 60e0be07-4ce3-417a-a9d2-7ba6965e5963

4. Created a PVC
$ oc get pvc csi-pvc -o json| jq .spec
{
  "accessModes": [
    "ReadWriteOnce"
  ],
  "resources": {
    "requests": {
      "storage": "1Gi"
    }
  },
  "storageClassName": "csi-hostpath-sc",
  "volumeMode": "Filesystem",
  "volumeName": "pvc-e7ea5e09-aad9-4f55-af6b-8b0e5ec45247"
}
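
For reference, a minimal manifest that produces the spec above (reconstructed from the JSON shown; volumeName is filled in by the binder after provisioning and is omitted at creation time):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc
  volumeMode: Filesystem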

5. Created a snapshot for PVC
$ cat csi-snapshot.yaml 
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: new-snapshot-demo
spec:
  snapshotClassName: csi-hostpath-snapclass
  source:
    name: csi-pvc
    kind: PersistentVolumeClaim


Actual Results:
The csi-snapshot-controller pod status is "CrashLoopBackOff".

Expected Results:
The snapshot is created successfully for hostpath CSI storage.

Additional info:
[piqin@preserve-storage-server1 examples]$ oc get pod -n openshift-cluster-storage-operator 
NAME                                                READY   STATUS             RESTARTS   AGE
cluster-storage-operator-6797588974-6njnc           1/1     Running            0          22h
csi-snapshot-controller-74c55d9d5-ftfmm             0/1     CrashLoopBackOff   10         22h
csi-snapshot-controller-operator-689d8b4c98-vmk5w   1/1     Running            0          22h
[piqin@preserve-storage-server1 examples]$ oc -n openshift-cluster-storage-operator logs csi-snapshot-controller-74c55d9d5-ftfmm
I0929 06:42:37.983579       1 main.go:66] Version: v4.6.0-202009221732.p0-0-g9bd988d-dirty
I0929 06:42:37.987017       1 main.go:93] Start NewCSISnapshotController with kubeconfig [] resyncPeriod [1m0s]
I0929 06:42:37.992905       1 leaderelection.go:243] attempting to acquire leader lease  openshift-cluster-storage-operator/snapshot-controller-leader...
I0929 06:42:38.046050       1 leaderelection.go:253] successfully acquired lease openshift-cluster-storage-operator/snapshot-controller-leader
I0929 06:42:38.046694       1 reflector.go:207] Starting reflector *v1beta1.VolumeSnapshotClass (1m0s) from github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117
I0929 06:42:38.046722       1 snapshot_controller_base.go:128] Starting snapshot controller
I0929 06:42:38.046883       1 reflector.go:207] Starting reflector *v1.PersistentVolumeClaim (1m0s) from k8s.io/client-go/informers/factory.go:134
I0929 06:42:38.046932       1 reflector.go:207] Starting reflector *v1beta1.VolumeSnapshotContent (1m0s) from github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117
I0929 06:42:38.048265       1 reflector.go:207] Starting reflector *v1beta1.VolumeSnapshot (1m0s) from github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117
E0929 06:42:38.147021       1 snapshot_controller_base.go:338] checkAndUpdateSnapshotClass failed to setDefaultClass the snapshot source PVC name is not specified
E0929 06:42:38.147118       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 161 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1446fc0, 0x201e670)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x89
panic(0x1446fc0, 0x201e670)
        /usr/lib/golang/src/runtime/panic.go:969 +0x175
github.com/kubernetes-csi/external-snapshotter/v3/pkg/common-controller.(*csiSnapshotCommonController).syncSnapshotByKey(0xc0001f8e00, 0xc0006a6ae0, 0x19, 0x0, 0xbc)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:215 +0x9d7
github.com/kubernetes-csi/external-snapshotter/v3/pkg/common-controller.(*csiSnapshotCommonController).snapshotWorker(0xc0001f8e00)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:188 +0xed
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0006ba8b0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0006ba8b0, 0x1774260, 0xc0001f0030, 0x1, 0xc00002a1e0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xad
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0006ba8b0, 0x0, 0x0, 0x1, 0xc00002a1e0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc0006ba8b0, 0x0, 0xc00002a1e0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/kubernetes-csi/external-snapshotter/v3/pkg/common-controller.(*csiSnapshotCommonController).Run
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:139 +0x2ae
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xa0 pc=0x12b1d97]

goroutine 161 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x10c
panic(0x1446fc0, 0x201e670)
        /usr/lib/golang/src/runtime/panic.go:969 +0x175
github.com/kubernetes-csi/external-snapshotter/v3/pkg/common-controller.(*csiSnapshotCommonController).syncSnapshotByKey(0xc0001f8e00, 0xc0006a6ae0, 0x19, 0x0, 0xbc)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:215 +0x9d7
github.com/kubernetes-csi/external-snapshotter/v3/pkg/common-controller.(*csiSnapshotCommonController).snapshotWorker(0xc0001f8e00)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:188 +0xed
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0006ba8b0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0006ba8b0, 0x1774260, 0xc0001f0030, 0x1, 0xc00002a1e0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xad
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0006ba8b0, 0x0, 0x0, 0x1, 0xc00002a1e0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc0006ba8b0, 0x0, 0xc00002a1e0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/kubernetes-csi/external-snapshotter/v3/pkg/common-controller.(*csiSnapshotCommonController).Run
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:139 +0x2ae

Comment 3 Jan Safranek 2020-09-29 12:56:38 UTC
This snapshot YAML is wrong:


> apiVersion: snapshot.storage.k8s.io/v1beta1
> kind: VolumeSnapshot
> metadata:
>   name: new-snapshot-demo
> spec:
>   snapshotClassName: csi-hostpath-snapclass
>   source:
>     name: csi-pvc
>     kind: PersistentVolumeClaim


Correct one:
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: new-snapshot-jsafrane
spec:
  volumeSnapshotClassName: csi-hostpath-snapclass
  source:
    persistentVolumeClaimName: csi-pvc
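
Applying the corrected manifest then succeeds; a hypothetical transcript (object names from the YAML above, output columns truncated):

$ oc create -f csi-snapshot.yaml
volumesnapshot.snapshot.storage.k8s.io/new-snapshot-jsafrane created
$ oc get volumesnapshot new-snapshot-jsafrane
NAME                    READYTOUSE   SOURCEPVC   SNAPSHOTCLASS            ...
new-snapshot-jsafrane   true         csi-pvc     csi-hostpath-snapclass   ...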

Still, the snapshot-controller should be robust enough to handle this.

Comment 4 Jan Safranek 2020-09-29 13:06:10 UTC
Filed PR upstream: https://github.com/kubernetes-csi/external-snapshotter/pull/381
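
The gist of the fix, sketched in Go (a simplified illustration, not the exact upstream diff; ctrl.updateSnapshot stands in here for the follow-up processing in syncSnapshotByKey):

        // checkAndUpdateSnapshotClass returns an error together with a nil
        // snapshot when the source PVC name is not specified; the panic came
        // from using that nil result.
        newSnapshot, err := ctrl.checkAndUpdateSnapshotClass(snapshot)
        if err != nil || newSnapshot == nil {
                // Skip this snapshot instead of dereferencing a nil pointer;
                // the informer resync (1m0s in the logs above) will surface it again.
                return err
        }
        ctrl.updateSnapshot(newSnapshot)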

Comment 8 Qin Ping 2020-10-09 06:45:55 UTC
Verified with: 4.6.0-0.nightly-2020-10-08-210814

Comment 10 errata-xmlrpc 2020-10-27 16:45:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

