Bug 1872079

Summary: Device state is "Available" in localvolumediscoveryresult when block mode pv is already provisioned
Product: OpenShift Container Platform Reporter: Chao Yang <chaoyang>
Component: Storage Assignee: Santosh Pillai <sapillai>
Storage sub component: Local Storage Operator QA Contact: Chao Yang <chaoyang>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: akhaitov, aos-bugs, hekumar, jsafrane
Version: 4.6 Keywords: Reopened
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-26 15:28:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1882630    
Bug Blocks:    

Description Chao Yang 2020-08-25 01:26:59 UTC
Description of problem:
Device state is "Available" in localvolumediscoveryresult when pv is already provisioned 

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-23-214712 
local-storage-operator.4.6.0-202008210209.p0

How reproducible:
Always

Steps to Reproduce:
1. Deploy LSO
2. Create a localvolumediscovery
3. Create a localvolumeset; some PVs get provisioned
4. Check the localvolumediscoveryresult; the device state is still "Available"

Actual results:
Device state is "Available" in localvolumediscoveryresult when pv is already provisioned and mounted

Expected results:
Device state should be "NotAvailable" in localvolumediscoveryresult when pv is already provisioned and mounted

Additional info:

Comment 1 Hemant Kumar 2020-08-25 14:47:28 UTC
I think the problem is that once a device is added to the discovery result as available, its status does not change when a PV is provisioned or the volume is mounted. We might have to reconcile the result again.
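
A minimal Go sketch of the reconcile idea above (not the actual local-storage-operator code; the DiscoveredDevice type and the in-use check are placeholders): rebuild every device's state on each pass instead of only appending newly discovered devices.

```
package main

import "fmt"

// DiscoveredDevice is a simplified stand-in for one entry in the
// LocalVolumeDiscoveryResult status.
type DiscoveredDevice struct {
	Path  string
	State string // "Available" or "NotAvailable"
}

// currentState is a placeholder for the real check (filesystem present,
// device mounted, backing a bound PV, ...).
func currentState(path string, inUse map[string]bool) string {
	if inUse[path] {
		return "NotAvailable"
	}
	return "Available"
}

// reconcile rebuilds the result from scratch on every pass, so state
// transitions (Available -> NotAvailable and back) are always reflected.
func reconcile(devices []string, inUse map[string]bool) []DiscoveredDevice {
	result := make([]DiscoveredDevice, 0, len(devices))
	for _, d := range devices {
		result = append(result, DiscoveredDevice{Path: d, State: currentState(d, inUse)})
	}
	return result
}

func main() {
	devices := []string{"/dev/sdb", "/dev/sdd"}
	inUse := map[string]bool{"/dev/sdd": true} // /dev/sdd now backs a mounted PV
	for _, d := range reconcile(devices, inUse) {
		fmt.Printf("%s: %s\n", d.Path, d.State)
	}
}
```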

Comment 2 Santosh Pillai 2020-08-26 03:30:14 UTC
@Chao

Can you please check the discovery results 5 minutes after the PV is provisioned and mounted? Continuous discovery runs every 5 minutes, or immediately when a new device is added or removed.
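
A minimal Go sketch of the timing described above, assuming a ticker for the 5-minute interval and a channel standing in for the real device add/remove watcher (names are illustrative, not the actual LSO code):

```
package main

import (
	"fmt"
	"time"
)

// runDiscovery stands in for scanning the block devices and updating
// the LocalVolumeDiscoveryResult.
func runDiscovery() {
	fmt.Println("discovery pass at", time.Now().Format(time.RFC3339))
}

// discoveryLoop re-scans every 5 minutes and also immediately when a
// device add/remove event arrives.
func discoveryLoop(deviceEvents <-chan struct{}, stop <-chan struct{}) {
	ticker := time.NewTicker(5 * time.Minute)
	defer ticker.Stop()

	runDiscovery() // initial pass
	for {
		select {
		case <-ticker.C: // periodic re-scan
			runDiscovery()
		case <-deviceEvents: // device added or removed
			runDiscovery()
		case <-stop:
			return
		}
	}
}

func main() {
	events := make(chan struct{})
	stop := make(chan struct{})
	go discoveryLoop(events, stop)

	events <- struct{}{} // simulate a device add/remove event
	time.Sleep(100 * time.Millisecond)
	close(stop)
}
```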

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0  58.6G  0 disk
`-sda1   8:1    0  58.6G  0 part /mnt/sda1
sdb      8:16   0    10G  0 disk
sdc      8:32   0    20G  0 disk
sdd      8:48   0    40G  0 disk /mnt/sda1/var/lib/kubelet/pods/0d750ab4-425b-4401-98cd-1f13514418bb/volumes/kubernete
sde      8:64   0    50G  0 disk /mnt/sda1/var/lib/kubelet/pods/cc224156-f982-4bd5-b41c-7465c26fb8b0/volumes/kubernete
sr0     11:0    1 173.6M  0 rom


Every 2.0s: kubectl get pv                                                                                               localhost.localdomain: Wed Aug 26 08:55:57 2020

NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                     STORAGECLASS   REASON   AGE
local-pv-324e1380   40Gi       RWO            Delete           Bound       default/task-pv-claim-2   belowsixty              21m
local-pv-363abbe7   50Gi       RWO            Delete           Bound       default/task-pv-claim     belowsixty              21m
local-pv-cdfebf45   10Gi       RWO            Delete           Available                             belowsixty              21m
local-pv-ea0add03   20Gi       RWO            Delete           Available                             belowsixty              17m



After 5 minutes, /dev/sdd and /dev/sde become NotAvailable:

Name:         discovery-result-m01
Namespace:    default
Labels:       discovery-result-node=m01
Annotations:  <none>
API Version:  local.storage.openshift.io/v1alpha1
Kind:         LocalVolumeDiscoveryResult
Metadata:
  Creation Timestamp:  2020-08-26T02:03:12Z
  Generation:          1
  Owner References:
    API Version:     local.storage.openshift.io/v1alpha1
    Kind:            LocalVolumeDiscovery
    Name:            diskmaker-discovery-skgp5
    UID:             5cda82a6-b409-488e-8a19-4ce96ceceeac
  Resource Version:  11946
  Self Link:         /apis/local.storage.openshift.io/v1alpha1/namespaces/default/localvolumediscoveryresults/discovery-result-m01
  UID:               995fc10b-c54b-4e67-8101-37c812d43df2
Spec:
  Node Name:  m01
Status:
  Discovered Devices:
    Device ID:  /dev/disk/by-id/ata-VBOX_HARDDISK_____VB2aa19561-a5b02332
    Fstype:     ext4
    Model:      VBOX HARDDISK   
    Path:       /dev/sdd
    Property:   Rotational
    Serial:     VB2aa19561-a5b02332
    Size:       42949672960
    Status:
      State:    NotAvailable
    Type:       disk
    Vendor:     ATA     
    Device ID:  /dev/disk/by-id/ata-VBOX_HARDDISK_____VB3ec86ebf-ed3a5a93
    Fstype:     
    Model:      VBOX HARDDISK   
    Path:       /dev/sdb
    Property:   Rotational
    Serial:     VB3ec86ebf-ed3a5a93
    Size:       10737418240
    Status:
      State:    Available
    Type:       disk
    Vendor:     ATA     
    Device ID:  /dev/disk/by-id/ata-VBOX_HARDDISK_____VBfe136e82-12c6dc02
    Fstype:     ext4
    Model:      VBOX HARDDISK   
    Path:       /dev/sde
    Property:   Rotational
    Serial:     VBfe136e82-12c6dc02
    Size:       53687091200
    Status:
      State:    NotAvailable
    Type:       disk
    Vendor:     ATA     
    Device ID:  /dev/disk/by-id/ata-VBOX_HARDDISK_____VBfb9577b2-d1e52787
    Fstype:     
    Model:      VBOX HARDDISK   
    Path:       /dev/sdc
    Property:   Rotational
    Serial:     VBfb9577b2-d1e52787
    Size:       21474836480
    Status:
      State:    Available
    Type:       disk
    Vendor:     ATA     
    Device ID:  /dev/disk/by-id/ata-VBOX_HARDDISK_____VBb943cc43-c7b35185-part1
    Fstype:     ext4
    Model:      
    Path:       /dev/sda1
    Property:   Rotational
    Serial:     
    Size:       62913494528
    Status:
      State:              NotAvailable
    Type:                 part
    Vendor:               
  Discovered Time Stamp:  2020-08-26T03:28:07Z
Events:                   <none>

Comment 3 Chao Yang 2020-08-28 06:33:11 UTC
I still have this issue.

pod-with-block-volume                     1/1     Running   0          24m

local-pv-87e02a08   1Gi        RWO            Delete           Bound       openshift-local-storage/block-pvc   lvs                     30m

    Device ID:  /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0980193996dd5900d
    Fstype:     
    Model:      Amazon Elastic Block Store              
    Path:       /dev/nvme2n1
    Property:   NonRotational
    Serial:     vol0980193996dd5900d
    Size:       1073741824
    Status:
      State:    Available
    Type:       disk

Comment 4 Santosh Pillai 2020-08-30 02:37:20 UTC
Is this only happening with block mode PVs, or with both block and filesystem mode?

Comment 5 Chao Yang 2020-08-31 06:54:29 UTC
Only for block devices.
    Device ID:  /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0ceee561c070d6067-part2
    Fstype:     ext4
    Model:      
    Path:       /dev/nvme1n1p2
    Property:   NonRotational
    Serial:     
    Size:       1072693248
    Status:
      State:    NotAvailable
    Type:       part

Comment 7 Chao Yang 2020-09-25 08:20:40 UTC
 
Localvolumediscovery can detect that the volume is NotAvailable, but after I delete the pod/PVC and wait for the PV to become Available, the status in the localvolumediscoveryresult is not updated.

  - deviceID: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol024547723ded32e1d
    fstype: ext4
    model: 'Amazon Elastic Block Store              '
    path: /dev/nvme1n1
    property: NonRotational
    serial: vol024547723ded32e1d
    size: 10737418240
    status:
      state: NotAvailable
    type: disk
    vendor: ""
  discoveredTimeStamp: "2020-09-25T06:03:56Z"

--------------
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  local:
    path: /mnt/local-storage/lvs/nvme-Amazon_Elastic_Block_Store_vol024547723ded32e1d
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ip-10-0-143-32
  persistentVolumeReclaimPolicy: Delete
  storageClassName: lvs
  volumeMode: Filesystem
status:
  phase: Available

  - deviceID: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol024547723ded32e1d
    fstype: ext2
    model: 'Amazon Elastic Block Store              '
    path: /dev/nvme1n1
    property: NonRotational
    serial: vol024547723ded32e1d
    size: 10737418240
    status:
      state: NotAvailable
    type: disk
    vendor: ""
  discoveredTimeStamp: "2020-09-25T06:13:56Z"

Comment 8 Santosh Pillai 2020-09-25 09:18:14 UTC
@Chao The disk `/dev/nvme1n1` mentioned in the example above has an `ext2` filesystem on it. That's why it is showing `NotAvailable`.
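
My reading of the availability rule, as a minimal Go sketch (not the exact LSO implementation): a device with a filesystem signature, a mount point, or child partitions is reported NotAvailable; an empty, unmounted disk is Available.

```
package main

import "fmt"

// BlockDevice is a simplified view of one lsblk entry.
type BlockDevice struct {
	Path       string
	FSType     string
	MountPoint string
	Children   int
}

// deviceState applies the rule: anything with a filesystem signature,
// a mount point, or child partitions is not usable by LSO.
func deviceState(d BlockDevice) string {
	if d.FSType != "" || d.MountPoint != "" || d.Children > 0 {
		return "NotAvailable"
	}
	return "Available"
}

func main() {
	withFS := BlockDevice{Path: "/dev/nvme1n1", FSType: "ext2"} // leftover ext2 signature (Comment 7)
	empty := BlockDevice{Path: "/dev/nvme2n1"}                  // raw, unused disk
	fmt.Println(withFS.Path, deviceState(withFS)) // NotAvailable
	fmt.Println(empty.Path, deviceState(empty))   // Available
}
```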

Comment 9 Santosh Pillai 2020-09-25 09:22:41 UTC
Moving it back to ON_QA.

Comment 10 Chao Yang 2020-09-28 09:18:35 UTC
blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1882630

Comment 11 Santosh Pillai 2020-09-29 07:44:29 UTC
I was able to retest the issue for this BZ.

1. Create PVs using a localvolumeset

2. Create a PVC
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
  #namespace: openshift-local-storage
spec:
  storageClassName: belowsixty
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 50Gi
```

3. Create pod with block volume
```
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block-volume
  #namespace: openshift-local-storage
spec:
  containers:
    - name: some-container
      image: alpine
      command: ["/bin/sh", "-c"]
      args: [ "tail -f /dev/null" ]
      volumeDevices:
        - name: data
          devicePath: /dev/nvme1n1
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc
```

4. Observe result

```
Every 2.0s: oc get pv                                                           localhost.localdomain: Tue Sep 29 12:47:08 2020

NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
local-pv-78a2938c   50Gi       RWO            Delete           Bound    default/block-pvc   belowsixty              3m53s

```

Discovery Result:

  - deviceID: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0d2b8f7ad1e6ec769
    fstype: ""
    model: 'Amazon Elastic Block Store              '
    path: /dev/nvme1n1
    property: NonRotational
    serial: vol0d2b8f7ad1e6ec769
    size: 53687091200
    status:
      state: NotAvailable
    type: disk
    vendor: ""
  discoveredTimeStamp: "2020-09-29T07:15:35Z"

5. Delete the pod

6. Observe the result again.

  - deviceID: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0d2b8f7ad1e6ec769
    fstype: ""
    model: 'Amazon Elastic Block Store              '
    path: /dev/nvme1n1
    property: NonRotational
    serial: vol0d2b8f7ad1e6ec769
    size: 53687091200
    status:
      state: Available
    type: disk
    vendor: ""
  discoveredTimeStamp: "2020-09-29T07:20:35Z"

Comment 13 Chao Yang 2020-10-10 06:08:36 UTC
1.Create pvc/pod
  - deviceID: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0b9b38a379841f8e0
    fstype: ext4
    model: 'Amazon Elastic Block Store              '
    path: /dev/nvme2n1
    property: NonRotational
    serial: vol0b9b38a379841f8e0
    size: 1073741824
    status:
      state: NotAvailable
    type: disk
    vendor: ""
  discoveredTimeStamp: "2020-10-10T05:42:06Z"

2.Delete pod/pvc
  - deviceID: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0b9b38a379841f8e0
    fstype: ""
    model: 'Amazon Elastic Block Store              '
    path: /dev/nvme2n1
    property: NonRotational
    serial: vol0b9b38a379841f8e0
    size: 1073741824
    status:
      state: Available
    type: disk
    vendor: ""
  discoveredTimeStamp: "2020-10-10T05:52:06Z"

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-10-08-210814   True        False         20h     Cluster version is 4.6.0-0.nightly-2020-10-08-210814

Comment 15 errata-xmlrpc 2020-10-27 16:32:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

Comment 17 Anna Khaitovich 2020-11-26 15:28:24 UTC
Please ignore the comment above, sorry