Description of problem:

This is a NetApp Trident setup using a virtual NetApp ONTAP device. We have been seeing random fsck failures on NetApp block volumes that report filesystem corruption:

Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: I0416 12:15:04.470996 17920 mount_linux.go:488] `fsck` error fsck from util-linux 2.23.2
Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: fsck.ext2: Bad magic number in super-block while trying to open /dev/mapper/3600a09805a506576375d4f4e754d5434
Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: /dev/mapper/3600a09805a506576375d4f4e754d5434:
Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: The superblock could not be read or does not describe a correct ext2
Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: filesystem. If the device is valid and it really contains an ext2
Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: filesystem (and not swap or ufs or something else), then the superblock
Apr 16 12:15:04 server.dmz atomic-openshift-node[17920]: is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

Looking at the device, we could see that it already had data and a filesystem on it, which is why the fsck in mount_linux.go was failing.
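For reference, the presence of an existing filesystem on the stale device can be confirmed with standard read-only tools against the WWID from the log above. This is a minimal sketch using blkid / lsblk / dumpe2fs; it is not taken from our collection scripts:

  # Report any existing filesystem signature on the multipath device
  blkid /dev/mapper/3600a09805a506576375d4f4e754d5434

  # Show the filesystem type and device topology as the kernel sees it
  lsblk -f /dev/mapper/3600a09805a506576375d4f4e754d5434

  # For ext2/3/4, read the superblock header without modifying the device
  dumpe2fs -h /dev/mapper/3600a09805a506576375d4f4e754d5434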
We tried to reproduce the issue and collect data by running the following in a loop until it failed:

1) Collect pre logs
2) Delete PVC / PV
3) Collect deleted logs
4) Create PVC / PV
5) Collect created logs
6) Scale up pod
7) Wait for success / error
8) Scale down pod
9) Collect logs

What we found was the following:

1) Before the failing mount, the device already existed:

dmsetup ls (pre creation):
3600a09805a506576375d4f4e754d5434 (253:50)

2) Before the failing mount, dm-50 already existed in multipath:

multipath (pre creation):
3600a09805a506576375d4f4e754d5434 dm-50 NETAPP ,LUN C-Mode
size=5.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 462:0:0:49 sdhe 133:64 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 461:0:0:49 sdbh 67:176 active ready running

3) Before the failing mount, sdhe and sdbh existed:

lsscsi (pre creation):
[462:0:0:49]  disk  NETAPP  LUN C-Mode  9600  /dev/sdhe
[461:0:0:49]  disk  NETAPP  LUN C-Mode  9600  /dev/sdbh

4) After deleting the old PVC / PV, the device was not removed:

dmsetup ls (pvc deleted):
3600a09805a506576375d4f4e754d5434 (253:50)

5) After deleting the old PVC / PV, dm-50 still existed and its paths went into an active/faulty/running state:

multipath (pvc deleted):
3600a09805a506576375d4f4e754d5434 dm-50 NETAPP ,LUN C-Mode
size=5.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 462:0:0:49 sdhe 133:64 active faulty running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 461:0:0:49 sdbh 67:176 active faulty running

6) After deleting the old PVC / PV, sdhe and sdbh still existed:

lsscsi (pvc deleted):
[462:0:0:49]  disk  NETAPP  LUN C-Mode  9600  /dev/sdhe
[461:0:0:49]  disk  NETAPP  LUN C-Mode  9600  /dev/sdbh

7) On the NetApp, LUN 49 was removed.

8) After recreating the PVC / PV, the device mapper entry was still there:

dmsetup ls (pvc create):
3600a09805a506576375d4f4e754d5434 (253:50)

9) After recreating the PVC / PV, the new LUN mapped back onto dm-50 / 3600a09805a506576375d4f4e754d5434 and the paths returned to active/ready/running:

multipath (pvc create):
3600a09805a506576375d4f4e754d5434 dm-50 NETAPP ,LUN C-Mode
size=5.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 462:0:0:49 sdhe 133:64 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 461:0:0:49 sdbh 67:176 active ready running

10) After recreating the PVC / PV, the lsscsi output is unchanged:

lsscsi (pvc created):
[462:0:0:49]  disk  NETAPP  LUN C-Mode  9600  /dev/sdhe
[461:0:0:49]  disk  NETAPP  LUN C-Mode  9600  /dev/sdbh

11) On the NetApp, a new LUN 49 was created.

Another interesting detail: the DM device is 5.0G, but the LUN / PVC are only 1Gi in size.

Version-Release number of selected component (if applicable):
3.11.146

How reproducible:
Random

Steps to Reproduce:
1) Collect pre logs
2) Delete PVC / PV
3) Collect deleted logs
4) Create PVC / PV
5) Collect created logs
6) Scale up pod
7) Wait for success / error
8) Scale down pod
9) Collect logs
10) Repeat

(A scripted sketch of this loop is included after Additional info below.)

Actual results:
Mostly successful, but you will randomly hit this issue.

Expected results:
Devices should not be reused.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rh-test
  annotations:
    volume.beta.kubernetes.io/storage-class: netapp-block-standard
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

StorageClass Dump (if StorageClass used by PV/PVC):
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: 2019-11-25T18:43:15Z
  name: netapp-block-standard
  resourceVersion: "1171291242"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/netapp-block-standard
  uid: 58473621-0fb3-11ea-abeb-1948765234cc
parameters:
  backendType: ontap-san-economy
provisioner: netapp.io/trident
reclaimPolicy: Delete
volumeBindingMode: Immediate

Additional info:
Some related issues / fixes that have happened:
https://github.com/NetApp/trident/issues/101
https://github.com/NetApp/trident/issues/133
https://github.com/kubernetes/kubernetes/issues/59946
https://github.com/kubernetes/kubernetes/issues/60894
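For completeness, here is a scripted sketch of the reproduce-and-collect loop described under Steps to Reproduce. The PVC manifest (rh-test-pvc.yaml), the workload name (deployment/rh-test), the pod label, and the fixed wait are placeholders for illustration, not the exact harness we ran; the device-state collection (dmsetup / multipath / lsscsi) has to run on the node that hosts the pod.

  #!/bin/bash
  # Sketch only: resource names and the wait logic are placeholders.
  set -euo pipefail

  collect() {
      # Capture the device state compared at each stage; must run on the node
      # that hosts the pod (e.g. via ssh or "oc debug node/..." in practice).
      local stage="$1"
      {
          echo "=== dmsetup ls (${stage}) ==="; dmsetup ls
          echo "=== multipath (${stage}) ===";  multipath -ll
          echo "=== lsscsi (${stage}) ===";     lsscsi
      } > "collect-$(date +%s)-${stage}.log" 2>&1
  }

  i=0
  while true; do
      i=$((i + 1)); echo "--- iteration ${i} ---"

      collect pre                                  # 1) Collect pre logs
      oc delete pvc rh-test                        # 2) Delete PVC (PV follows via reclaimPolicy: Delete)
      collect deleted                              # 3) Collect deleted logs
      oc create -f rh-test-pvc.yaml                # 4) Create PVC / PV
      collect created                              # 5) Collect created logs
      oc scale deployment/rh-test --replicas=1     # 6) Scale up pod
      sleep 120                                    # 7) Wait, then check for mount success / error
      oc get pods -l app=rh-test -o wide
      oc scale deployment/rh-test --replicas=0     # 8) Scale down pod
      collect post                                 # 9) Collect logs
  done                                             # 10) Repeat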
Verified with: 4.5.0-0.nightly-2020-05-25-052746
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409