Bug 1744453

Summary: Need a clear event for local storage operator when not finding the device defined in localvolume
Product: OpenShift Container Platform Reporter: Qin Ping <piqin>
Component: Storage Assignee: Christian Huffman <chuffman>
Status: CLOSED ERRATA QA Contact: Liang Xia <lxia>
Severity: low Docs Contact:
Priority: low    
Version: 4.2.0 CC: aos-bugs, aos-storage-staff, jsafrane, lxia
Target Milestone: ---   
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-04 11:13:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1783863    
Bug Blocks:    

Description Qin Ping 2019-08-22 08:26:31 UTC
Description of problem:
The local storage operator needs to emit a clear event when it cannot find a device defined in a localvolume.

Version-Release number of selected component (if applicable):
quay.io/openshift-release-dev/ocp-v4.0-art-dev:v4.2.0-201908181300-ose-local-storage-operator

How reproducible:
100%

Steps to Reproduce:
1. Installed the local storage operator in OCP.
2. Created a localvolume instance to deploy the local storage provider.
3. Updated the localvolume instance with a nonexistent device (/dev/loop10 does not exist):
$ oc get localvolume local-disks -ojson|jq .spec.storageClassDevices
[
  {
    "devicePaths": [
      "/dev/loop1",
      "/dev/loop2",
      "/dev/loop10"
    ],
    "fsType": "xfs",
    "storageClassName": "local-sc",
    "volumeMode": "Filesystem"
  },
  {
    "devicePaths": [
      "/dev/loop3",
      "/dev"
    ],
    "fsType": "ext4",
    "storageClassName": "local-ext4",
    "volumeMode": "Filesystem"
  }
]
4. Check the event
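
Before reading the events, it can help to confirm on a node which of the configured paths are actually usable. The following is a minimal Python sketch, assuming the jq output above is available as a JSON string; `is_block_device` and `missing_device_paths` are hypothetical helper names and are not part of the operator. It treats anything that is not a block device (including a directory such as /dev) as missing.

```python
import json
import os
import stat


def is_block_device(path):
    """Return True only if path exists and is a block device node."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        # Nonexistent or inaccessible paths count as "not a block device".
        return False


def missing_device_paths(storage_class_devices_json, exists=is_block_device):
    """Map each storageClassName to the devicePaths that fail the check.

    storage_class_devices_json is the JSON text of .spec.storageClassDevices.
    exists is injectable so the logic can be exercised without real devices.
    """
    missing = {}
    for entry in json.loads(storage_class_devices_json):
        bad = [p for p in entry.get("devicePaths", []) if not exists(p)]
        if bad:
            missing[entry["storageClassName"]] = bad
    return missing
```

Run on a worker node with the default check, this would be expected to list /dev/loop10 and /dev for the spec above, provided the other loop devices exist there.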

Actual results:
1. The event says only "found empty matching device list", which does not identify the missing device.
2. /dev/loop10 does not exist on any worker node, but there is only one event from one worker node.

$ oc describe localvolume|grep Events -A 12
Events:
  Type     Reason                    Age    From                     Message
  ----     ------                    ----   ----                     -------
  Warning  ErrorFindingMatchingDisk  22m    local-storage-diskmaker  ip-10-0-152-60.us-east-2.compute.internal - found empty matching device list
  Warning  ErrorFindingMatchingDisk  22m    local-storage-diskmaker  ip-10-0-173-82.us-east-2.compute.internal - found empty matching device list
  Normal   FoundMatchingDisk         21m    local-storage-diskmaker  ip-10-0-133-153.us-east-2.compute.internal - found matching disk loop3
  Warning  ErrorFindingMatchingDisk  19m    local-storage-diskmaker  ip-10-0-173-82.us-east-2.compute.internal - found empty matching device list
  Warning  ErrorFindingMatchingDisk  19m    local-storage-diskmaker  ip-10-0-152-60.us-east-2.compute.internal - found empty matching device list
  Normal   FoundMatchingDisk         16m    local-storage-diskmaker  ip-10-0-133-153.us-east-2.compute.internal - found matching disk loop3
  Warning  ErrorFindingMatchingDisk  15m    local-storage-diskmaker  ip-10-0-173-82.us-east-2.compute.internal - found empty matching device list
  Normal   FoundMatchingDisk         15m    local-storage-diskmaker  ip-10-0-152-60.us-east-2.compute.internal - found matching disk loop3
  Warning  ErrorFindingMatchingDisk  12m    local-storage-diskmaker  ip-10-0-173-82.us-east-2.compute.internal - found empty matching device list
  Warning  ErrorFindingMatchingDisk  9m10s  local-storage-diskmaker  ip-10-0-173-82.us-east-2.compute.internal - found empty matching device list

Expected results:
The event should state which node is missing which device.

Comment 1 Christian Huffman 2019-10-08 01:38:50 UTC
I've added an error whenever any disk cannot be found. Previously the node would not report an error if at least one disk was successfully found. For instance:

> $ oc describe localvolume | grep Events -A 12
> Events:
>   Type     Reason                    Age    From                     Message
>   ----     ------                    ----   ----                     -------
>   Warning  ErrorFindingMatchingDisk  10m    local-storage-diskmaker  ip-10-0-137-243.us-east-2.compute.internal - unable to find matching disk /dev/xvdg
>   Warning  ErrorFindingMatchingDisk  2m10s  local-storage-diskmaker  ip-10-0-150-160.us-east-2.compute.internal - unable to find matching disk /dev/xvdf
>   Warning  ErrorFindingMatchingDisk  2m10s  local-storage-diskmaker  ip-10-0-150-160.us-east-2.compute.internal - unable to find matching disk /dev/xvdg
>   Warning  ErrorFindingMatchingDisk  2m10s  local-storage-diskmaker  ip-10-0-150-160.us-east-2.compute.internal - found empty matching device list

This is submitted at https://github.com/openshift/local-storage-operator/pull/51 .
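
The behavior change described in this comment can be sketched as follows. This is an illustration in Python only, not the operator's code (the actual change is in the pull request linked above); `match_disks` and its event tuples are hypothetical. The message wording is copied from the event output quoted in this comment.

```python
def match_disks(node, wanted_paths, present_paths):
    """Return (matched, events) for one node.

    events is a list of (type, reason, message) tuples. A warning is
    produced for every wanted path that is not present, and one more
    warning is added if nothing matched at all.
    """
    matched = []
    events = []
    for path in wanted_paths:
        if path in present_paths:
            matched.append(path)
        else:
            # New behavior: report each missing device by name, even if
            # other devices on this node were found.
            events.append((
                "Warning",
                "ErrorFindingMatchingDisk",
                f"{node} - unable to find matching disk {path}",
            ))
    if not matched:
        # The original, less specific message, kept for the empty case.
        events.append((
            "Warning",
            "ErrorFindingMatchingDisk",
            f"{node} - found empty matching device list",
        ))
    return matched, events
```

Under this sketch, a node that has one of two requested devices reports a single warning naming the missing one, while a node that has neither reports one warning per device plus the empty-list warning, which matches the pattern in the output above.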

Comment 3 Liang Xia 2019-12-19 06:52:47 UTC
Verified with local-storage-operator.4.4.0-201912170523 on 4.4.0-0.nightly-2019-12-18-064553

$ oc describe localvolume | grep Events -A 12
Events:
  Type     Reason                    Age    From                     Message
  ----     ------                    ----   ----                     -------
  Normal   FoundMatchingDisk         33m    local-storage-diskmaker  lxia19-48fdq-worker-dsf5s - found matching disk dm-1
  Warning  ErrorFindingMatchingDisk  33m    local-storage-diskmaker  lxia19-48fdq-worker-6fgxw - unable to add disk-id /dev/myvg01/mylv01 to local disk pool: unable to find device with id /dev/myvg01/mylv01
  Warning  ErrorFindingMatchingDisk  33m    local-storage-diskmaker  lxia19-48fdq-worker-6fgxw - found empty matching device list
  Warning  ErrorFindingMatchingDisk  33m    local-storage-diskmaker  lxia19-48fdq-worker-2pj5w - unable to add disk-id /dev/myvg01/mylv01 to local disk pool: unable to find device with id /dev/myvg01/mylv01
  Warning  ErrorFindingMatchingDisk  33m    local-storage-diskmaker  lxia19-48fdq-worker-2pj5w - found empty matching device list
  Warning  ErrorFindingMatchingDisk  2m17s  local-storage-diskmaker  lxia19-48fdq-worker-6fgxw - unable to find matching disk /dev/i-am-not-exist
  Warning  ErrorFindingMatchingDisk  2m17s  local-storage-diskmaker  lxia19-48fdq-worker-6fgxw - unable to add disk-id /dev/myvg01/mylv01 to local disk pool: unable to find device with id /dev/myvg01/mylv01
  Warning  ErrorFindingMatchingDisk  2m17s  local-storage-diskmaker  lxia19-48fdq-worker-6fgxw - found empty matching device list
  Warning  ErrorFindingMatchingDisk  2m4s   local-storage-diskmaker  lxia19-48fdq-worker-2pj5w - unable to find matching disk /dev/i-am-not-exist
  Warning  ErrorFindingMatchingDisk  2m4s   local-storage-diskmaker  lxia19-48fdq-worker-2pj5w - unable to add disk-id /dev/myvg01/mylv01 to local disk pool: unable to find device with id /dev/myvg01/mylv01
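
To read verification output like the above at a glance, the warnings can be grouped by node. The following is a minimal Python sketch, assuming the column layout shown above (Type, Reason, Age, From, then "node - message"); `group_warnings` is a hypothetical helper, not an oc feature.

```python
from collections import defaultdict


def group_warnings(describe_output):
    """Group distinct Warning messages from the Events table by node name."""
    grouped = defaultdict(set)
    for line in describe_output.splitlines():
        # Split into Type, Reason, Age, From, and the rest of the line.
        fields = line.split(None, 4)
        if len(fields) < 5 or fields[0] != "Warning":
            continue
        # The Message column has the form "<node> - <message>".
        node, sep, message = fields[4].partition(" - ")
        if sep:
            grouped[node].add(message)
    return {node: sorted(messages) for node, messages in grouped.items()}
```

Repeated events for the same node and message collapse into one entry, so the result shows each missing device once per node.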

Comment 5 errata-xmlrpc 2020-05-04 11:13:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581