Bug 1869618 - One of the diskmaker-discovery pods is not deleted after deleting the localvolumediscovery instance
Summary: One of the diskmaker-discovery pods is not deleted after deleting the localvolumedis...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Santosh Pillai
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-18 11:24 UTC by Chao Yang
Modified: 2020-10-27 16:29 UTC (History)
3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:29:05 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:29:38 UTC

Description Chao Yang 2020-08-18 11:24:19 UTC
Description of problem:
One of the diskmaker-discovery pods is not deleted after deleting the localvolumediscovery instance.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-16-072105
local-storage-operator.4.6.0-202008111832.p0 

How reproducible:
1 time

Steps to Reproduce:
1. Deploy LSO.
2. Create a LocalVolumeDiscovery instance.
3. Verify that all required pods are running.
4. Delete the LocalVolumeDiscovery instance.
5. One diskmaker-discovery pod is not deleted.
6. Re-create the LocalVolumeDiscovery instance; 3 new diskmaker-discovery pods are running.
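For reference, a minimal LocalVolumeDiscovery CR for step 2 might look like the following (a sketch, not copied from the cluster; the name `auto-discover-devices` is taken from the operator log in this report, and the empty spec assumed here targets all nodes):

```yaml
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices
  namespace: openshift-local-storage
spec: {}
```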

oc get pods -o wide
NAME                                      READY   STATUS    RESTARTS   AGE     IP             NODE                                         NOMINATED NODE   READINESS GATES
diskmaker-discovery-6b7gq                 1/1     Running   0          11m     10.128.2.156   ip-10-0-137-29.us-east-2.compute.internal    <none>           <none>
diskmaker-discovery-fl4kq                 1/1     Running   0          25h     10.131.0.21    ip-10-0-178-125.us-east-2.compute.internal   <none>           <none>
diskmaker-discovery-k2ztb                 1/1     Running   0          11m     10.131.0.38    ip-10-0-178-125.us-east-2.compute.internal   <none>           <none>
diskmaker-discovery-nzrb7                 1/1     Running   0          11m     10.129.2.33    ip-10-0-206-82.us-east-2.compute.internal    <none>           <none>


$ oc get pod -n openshift-local-storage --show-labels
NAME                                      READY   STATUS    RESTARTS   AGE     LABELS
diskmaker-discovery-6b7gq                 1/1     Running   0          57m     app=diskmaker-discovery,controller-revision-hash=7cd6787664,pod-template-generation=3
diskmaker-discovery-fl4kq                 1/1     Running   0          26h     app=diskmaker-discovery,controller-revision-hash=7587f56bd,pod-template-generation=1
diskmaker-discovery-k2ztb                 1/1     Running   0          58m     app=diskmaker-discovery,controller-revision-hash=7cd6787664,pod-template-generation=3
diskmaker-discovery-nzrb7                 1/1     Running   0          57m     app=diskmaker-discovery,controller-revision-hash=7cd6787664,pod-template-generation=3
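The `pod-template-generation` labels above already point at the culprit: `diskmaker-discovery-fl4kq` is still at generation 1 while the recreated DaemonSet's pods are at generation 3. A small sketch (a hypothetical helper, not part of LSO) for picking such stale pods out of `oc get pods --show-labels --no-headers` output:

```shell
# find_stale_pods: read `oc get pods --show-labels --no-headers` output on
# stdin and print pods whose pod-template-generation label lags the latest
# generation seen in the listing.
find_stale_pods() {
  sed -n 's/^\([^ ]*\) .*pod-template-generation=\([0-9][0-9]*\).*/\2 \1/p' |
    sort -rn |                       # highest (latest) generation first
    awk 'NR == 1 { latest = $1 }     # remember the latest generation
         $1 != latest { print $2 }'  # report pods that are behind it
}

# Usage (on a live cluster):
#   oc get pods -n openshift-local-storage -l app=diskmaker-discovery \
#     --show-labels --no-headers | find_stale_pods
```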

oc logs pod/local-storage-operator-68f4dd987f-wcp5m 
{"level":"error","ts":1597740923.8552113,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"localvolumediscovery-controller","request":"openshift-local-storage/auto-discover-devices","error":"running 2 out of 3 discovery daemons","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/local-storage-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}


Actual results:
Pod diskmaker-discovery-fl4kq is not deleted when the localvolumediscovery instance is deleted.

Expected results:
Pod diskmaker-discovery-fl4kq should be deleted when the localvolumediscovery instance is deleted.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Santosh Pillai 2020-08-20 11:35:33 UTC
@Chao
I was not able to reproduce this issue. In my case, all the daemonset pods and LocalVolumeDiscoveryResults were deleted successfully on deleting the LocalVolumeDiscovery CR.
Can you share the cluster where this issue was reproduced, if you still have it?

Comment 2 Santosh Pillai 2020-08-30 02:35:16 UTC
@Chao - Any updates on this?

Comment 3 Chao Yang 2020-08-31 01:06:09 UTC
I only hit this once.
I could not reproduce it again just now.

Comment 4 Santosh Pillai 2020-09-07 09:58:06 UTC
@Chao. Still not able to reproduce this bug. Moving it back to ON_QA. Let me know if it's still happening.

Comment 5 Chao Yang 2020-09-08 09:11:12 UTC
Updating the bz status since the issue could not be reproduced.

Comment 8 errata-xmlrpc 2020-10-27 16:29:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

