Bug 1976744 - "MachineNotYetDeleted" in Pending state , alert not fired
Summary: "MachineNotYetDeleted" in Pending state , alert not fired
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Michael McCune
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks: 1986237
Reported: 2021-06-28 06:53 UTC by Milind Yadav
Modified: 2021-07-27 03:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1986237 (view as bug list)
Environment:
Last Closed: 2021-06-29 14:39:45 UTC
Target Upstream Version:
Embargoed:



Description Milind Yadav 2021-06-28 06:53:52 UTC
[4.9] "MachineNotYetDeleted" alert stays in Pending state and does not fire

Version - Cluster version is 4.9.0-0.nightly-2021-06-25-050351
Requirement - https://issues.redhat.com/browse/OCPCLOUD-921

Steps:
1. Create a PDB as shown below:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb1
spec:
  minAvailable: 7
  selector:
    matchLabels:
      app: nginx

PDB created successfully.
2. Create a deployment with the same number of replicas as minAvailable in the PDB, as shown below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "dep1"
spec:
  replicas: 7
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: "myfrontend"
        image: "quay.io/openshifttest/hello-openshift@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e"
        ports:
        - containerPort: 80
          name: "http-server"
Deployment created successfully.
3. Delete a worker machine that is running the pods.
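
For context on why the machine is expected to get stuck in step 3: with replicas equal to minAvailable, the PDB leaves no room for voluntary evictions, so a node drain (presumably what machine deletion triggers here) cannot evict the nginx pods. A minimal sketch of that arithmetic follows; the function names are hypothetical and only illustrate the calculation, they are not a Kubernetes API.

```python
def allowed_disruptions(current_healthy: int, min_available: int) -> int:
    """Number of pods that may be voluntarily evicted right now.

    Illustrative helper (hypothetical name): a PDB with minAvailable
    permits evictions only while healthy pods exceed that minimum.
    """
    return max(0, current_healthy - min_available)


def can_evict(current_healthy: int, min_available: int) -> bool:
    """True if at least one eviction would still satisfy the PDB."""
    return allowed_disruptions(current_healthy, min_available) > 0


# Values from the report: 7 replicas, minAvailable: 7.
print(allowed_disruptions(7, 7))  # no evictions allowed
print(can_evict(7, 7))            # drain is blocked
```

With one extra replica (8 healthy pods against minAvailable: 7), one eviction at a time would be allowed and the drain could make progress.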
Expected - the machine is stuck in the Deleting phase, and after 6 hrs the "MachineNotYetDeleted" alert fires.
Actual - the machine is stuck in the Deleting phase, but after 6 hrs the "MachineNotYetDeleted" alert is still in the Pending state.
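
For readers unfamiliar with the Pending and Firing states: a Prometheus alerting rule with a `for` clause is Pending while its condition holds but has not yet held for the full duration, and becomes Firing once it has. The sketch below models only that transition. The 6-hour duration is taken from the expectation stated in this report and is an assumption about the actual rule; the function name is hypothetical.

```python
from datetime import timedelta
from typing import Optional

# Assumed from the report's "after 6 hrs" expectation, not read from the real rule.
FOR_DURATION = timedelta(hours=6)


def alert_state(condition_true_for: Optional[timedelta],
                for_duration: timedelta = FOR_DURATION) -> str:
    """Return the alert state given how long the condition has held.

    None means the rule expression is currently false.
    """
    if condition_true_for is None:
        return "inactive"
    if condition_true_for >= for_duration:
        return "firing"
    return "pending"


print(alert_state(None))                               # inactive
print(alert_state(timedelta(hours=5, minutes=58)))     # pending
print(alert_state(timedelta(hours=6, minutes=2)))      # firing
```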

Comment 4 Milind Yadav 2021-06-29 14:08:44 UTC
After reviewing this with the monitoring team and going over the steps again, we found that there were no inhibit_rules that could cause this. I ran the test again without silencing any alerts, and both alerts fired.
The earlier mistake was silencing an alert (PodDisruptionBudgetAlert) at around 6 hrs 2 mins, right when the "MachineNotYetDeleted" alert fired.
Attaching a screenshot.

Comment 6 Michael McCune 2021-06-29 14:39:45 UTC
Thanks for the follow-up, Milind. I am closing this as not a bug.

