Bug 1976744

Summary: "MachineNotYetDeleted" in Pending state, alert not fired
Product: OpenShift Container Platform Reporter: Milind Yadav <miyadav>
Component: Cloud Compute Assignee: Michael McCune <mimccune>
Cloud Compute sub component: Other Providers QA Contact: sunzhaohua <zhsun>
Status: CLOSED NOTABUG Docs Contact:
Severity: high    
Priority: unspecified CC: mimccune
Version: 4.9   
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1986237 (view as bug list) Environment:
Last Closed: 2021-06-29 14:39:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1986237    

Description Milind Yadav 2021-06-28 06:53:52 UTC
[4.9] "MachineNotYetDeleted" in Pending state, alert not fired

Version - Cluster version is 4.9.0-0.nightly-2021-06-25-050351
Requirement - https://issues.redhat.com/browse/OCPCLOUD-921

Steps:
1. Create a PDB as shown below:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb1
spec:
  minAvailable: 7
  selector:
    matchLabels:
      app: nginx

PDB created successfully.
2. Create a deployment with the same number of replicas as minAvailable in the PDB, as shown below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "dep1"
spec:
  replicas: 7
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: "myfrontend"
          image: "quay.io/openshifttest/hello-openshift@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e"
          ports:
            - containerPort: 80
              name: "http-server"
Deployment created successfully.
3. Delete a worker machine that is running the pods.
Expected - The machine is stuck in the Deleting phase, and after 6 hrs the alert “MachineNotYetDeleted” fires.
Actual - The machine is stuck in the Deleting phase, but after 6 hrs the alert “MachineNotYetDeleted” is still in the Pending state.
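
Note: a rough sketch of why the machine is expected to stay in the deleting phase in this setup. With minAvailable equal to the replica count, the PDB allows zero voluntary disruptions, so evicting the pods during the node drain should be refused. The function name below is hypothetical and only mirrors the general PDB arithmetic (healthy pods minus minAvailable, floored at zero); it is not code from Kubernetes or the machine-api.

```python
def allowed_disruptions(current_healthy: int, min_available: int) -> int:
    """Pods that may be evicted voluntarily without violating the PDB (hypothetical helper)."""
    return max(0, current_healthy - min_available)


# Values from the steps above: replicas: 7 and minAvailable: 7.
print(allowed_disruptions(current_healthy=7, min_available=7))  # 0
```

With one extra replica (8 healthy pods) the same calculation would allow one eviction at a time, and the drain would be able to make progress.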

Comment 4 Milind Yadav 2021-06-29 14:08:44 UTC
After reviewing this with the monitoring team and going through the steps again, I found that there were no inhibit_rules that could cause this. I ran the test again without silencing any alerts, and I could see that both alerts fired.
The earlier mistake was silencing an alert (PodDisruptionBudgetAlert) at about 6 hrs 2 mins, right around the time the "MachineNotYetDeleted" alert fired.
Attaching the screenshot.
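
Note: to illustrate the pending versus firing distinction discussed in this bug, here is a minimal sketch of how a Prometheus-style alert with a `for` duration changes state. The 6 hour threshold is assumed from the "after 6 hrs" expectation in the description. The function and state names are hypothetical, and silences (which are handled by Alertmanager) are deliberately not modeled.

```python
from datetime import timedelta

# Assumption: hold duration taken from the "after 6 hrs" expectation in the description.
FOR_DURATION = timedelta(hours=6)


def alert_state(condition_true: bool, active_for: timedelta,
                for_duration: timedelta = FOR_DURATION) -> str:
    """Return the state of an alert whose expression has been true for `active_for`."""
    if not condition_true:
        return "inactive"
    if active_for < for_duration:
        # Expression is true, but the hold duration has not elapsed yet.
        return "pending"
    return "firing"


print(alert_state(True, timedelta(hours=5, minutes=59)))  # pending
print(alert_state(True, timedelta(hours=6, minutes=2)))   # firing
```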

Comment 6 Michael McCune 2021-06-29 14:39:45 UTC
thanks for the followup Milind. i am closing this as not a bug.