Bug 1869618 - One of the diskmaker-discovery pods is not deleted after deleting the localvolumediscovery instance
Summary: One of the diskmaker-discovery pods is not deleted after deleting the localvolumedis...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Santosh Pillai
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-18 11:24 UTC by Chao Yang
Modified: 2020-10-27 16:29 UTC (History)
3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:29:05 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:29:38 UTC

Description Chao Yang 2020-08-18 11:24:19 UTC
Description of problem:
One of the diskmaker-discovery pods is not deleted after deleting the localvolumediscovery instance.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-16-072105
local-storage-operator.4.6.0-202008111832.p0 

How reproducible:
1 time

Steps to Reproduce:
1. Deploy LSO.
2. Create a LocalVolumeDiscovery instance.
3. Verify that all required pods are running.
4. Delete the LocalVolumeDiscovery instance.
5. One diskmaker-discovery pod is not deleted.
6. Re-create the LocalVolumeDiscovery instance; 3 new diskmaker-discovery pods are running.
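For reference, a minimal LocalVolumeDiscovery CR for step 2 might look like the following (a sketch, not copied from the cluster; the name `auto-discover-devices` is taken from the operator log in this report, and the empty spec assumed here targets all nodes):

```yaml
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices
  namespace: openshift-local-storage
spec: {}
```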

oc get pods -o wide
NAME                                      READY   STATUS    RESTARTS   AGE     IP             NODE                                         NOMINATED NODE   READINESS GATES
diskmaker-discovery-6b7gq                 1/1     Running   0          11m     10.128.2.156   ip-10-0-137-29.us-east-2.compute.internal    <none>           <none>
diskmaker-discovery-fl4kq                 1/1     Running   0          25h     10.131.0.21    ip-10-0-178-125.us-east-2.compute.internal   <none>           <none>
diskmaker-discovery-k2ztb                 1/1     Running   0          11m     10.131.0.38    ip-10-0-178-125.us-east-2.compute.internal   <none>           <none>
diskmaker-discovery-nzrb7                 1/1     Running   0          11m     10.129.2.33    ip-10-0-206-82.us-east-2.compute.internal    <none>           <none>


$ oc get pod -n openshift-local-storage --show-labels
NAME                                      READY   STATUS    RESTARTS   AGE     LABELS
diskmaker-discovery-6b7gq                 1/1     Running   0          57m     app=diskmaker-discovery,controller-revision-hash=7cd6787664,pod-template-generation=3
diskmaker-discovery-fl4kq                 1/1     Running   0          26h     app=diskmaker-discovery,controller-revision-hash=7587f56bd,pod-template-generation=1
diskmaker-discovery-k2ztb                 1/1     Running   0          58m     app=diskmaker-discovery,controller-revision-hash=7cd6787664,pod-template-generation=3
diskmaker-discovery-nzrb7                 1/1     Running   0          57m     app=diskmaker-discovery,controller-revision-hash=7cd6787664,pod-template-generation=3
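The `pod-template-generation` labels above already point at the culprit: `diskmaker-discovery-fl4kq` is still at generation 1 while the recreated DaemonSet's pods are at generation 3. A small sketch (a hypothetical helper, not part of LSO) for picking such stale pods out of `oc get pods --show-labels --no-headers` output:

```shell
# find_stale_pods: read `oc get pods --show-labels --no-headers` output on
# stdin and print pods whose pod-template-generation label lags the latest
# generation seen in the listing.
find_stale_pods() {
  sed -n 's/^\([^ ]*\) .*pod-template-generation=\([0-9][0-9]*\).*/\2 \1/p' |
    sort -rn |                       # highest (latest) generation first
    awk 'NR == 1 { latest = $1 }     # remember the latest generation
         $1 != latest { print $2 }'  # report pods that are behind it
}

# Usage (on a live cluster):
#   oc get pods -n openshift-local-storage -l app=diskmaker-discovery \
#     --show-labels --no-headers | find_stale_pods
```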

oc logs pod/local-storage-operator-68f4dd987f-wcp5m 
{"level":"error","ts":1597740923.8552113,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"localvolumediscovery-controller","request":"openshift-local-storage/auto-discover-devices","error":"running 2 out of 3 discovery daemons","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/local-storage-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}


Actual results:
Pod diskmaker-discovery-fl4kq is not deleted when the localvolumediscovery instance is deleted.

Expected results:
Pod diskmaker-discovery-fl4kq should be deleted when the localvolumediscovery instance is deleted.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Santosh Pillai 2020-08-20 11:35:33 UTC
@Chao
I was not able to reproduce this issue. In my case, all the daemonset pods and LocalVolumeDiscoveryResults were deleted successfully on deleting the LocalVolumeDiscovery CR.
Can you share the cluster where this issue was reproduced, if you still have it?

Comment 2 Santosh Pillai 2020-08-30 02:35:16 UTC
@Chao - Any updates on this?

Comment 3 Chao Yang 2020-08-31 01:06:09 UTC
I only hit this once.
I could not reproduce it again just now.

Comment 4 Santosh Pillai 2020-09-07 09:58:06 UTC
@Chao. Still not able to reproduce this bug. Moving it back to ON_QA. Let me know if it's still happening.

Comment 5 Chao Yang 2020-09-08 09:11:12 UTC
Updating the bz status since the issue could not be reproduced.

Comment 8 errata-xmlrpc 2020-10-27 16:29:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

