1976744 – "MachineNotYetDeleted" in Pending state, alert not fired

Summary: "MachineNotYetDeleted" in Pending state, alert not fired

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Cloud Compute
Sub Component:
Version:	4.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	4.9.0
Assignee:	Michael McCune
QA Contact:	sunzhaohua
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1986237

Reported:	2021-06-28 06:53 UTC by Milind Yadav
Modified:	2021-07-27 03:16 UTC
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1986237
Environment:
Last Closed:	2021-06-29 14:39:45 UTC
Target Upstream Version:
Embargoed:

Description Milind Yadav 2021-06-28 06:53:52 UTC

[4.9] "MachineNotYetDeleted" alert stays in Pending state and does not fire

Version - Cluster version is 4.9.0-0.nightly-2021-06-25-050351
Requirement - https://issues.redhat.com/browse/OCPCLOUD-921

Steps:
1. Create a PDB as shown below:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb1
spec:
  minAvailable: 7
  selector:
    matchLabels:
      app: nginx

PDB created successfully.
2. Create a deployment with the same number of replicas as minAvailable in the PDB, as shown below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "dep1"
spec:
  replicas: 7
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: "myfrontend"
        image: "quay.io/openshifttest/hello-openshift@sha256:aaea76ff622d2f8bcb32e538e7b3cd0ef6d291953f3e7c9f556c1ba5baf47e2e"
        ports:
        - containerPort: 80
          name: "http-server"
Deployment created successfully.
3. Delete a worker machine that is running the pods.
Expected - the machine is stuck in the Deleting phase, and after 6 hrs the alert "MachineNotYetDeleted" is fired.
Actual - the machine is stuck in the Deleting phase, but after 6 hrs the alert "MachineNotYetDeleted" is still in Pending state.
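
For readers wondering why the machine stays in the Deleting phase with this setup: the steps set minAvailable equal to the replica count, so the PDB leaves no room for voluntary evictions and the node drain cannot complete. A minimal Python sketch of that arithmetic follows. The function name is hypothetical and it assumes an integer minAvailable with all replicas healthy; it is not the controller's actual code.

```python
def disruptions_allowed(current_healthy: int, min_available: int) -> int:
    """Voluntary evictions a PDB with an integer minAvailable would permit.

    Hypothetical helper for illustration only: the budget is the number of
    healthy pods above the required minimum, and it never goes below zero.
    """
    return max(0, current_healthy - min_available)


# Values from the reproduction steps: 7 replicas, minAvailable 7.
print(disruptions_allowed(current_healthy=7, min_available=7))
```

With one extra replica (8 healthy pods against minAvailable 7) the same calculation would allow a single eviction at a time, and the drain could make progress.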

Comment 4 Milind Yadav 2021-06-29 14:08:44 UTC

After reviewing the steps with the monitoring team, we found no inhibit_rules that could cause this. I ran the test again without silencing any alerts, and both alerts fired.
My earlier mistake was silencing an alert (PodDisruptionBudgetAlert) at about 6 hrs 2 mins, right around when the "MachineNotYetDeleted" alert fired.
Attaching a screenshot.
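
As a rough illustration of the Pending versus Firing distinction discussed in this report, here is a small Python sketch of how a Prometheus-style alert with a hold duration moves between states. The function name is hypothetical, and the 6 hour hold is taken from the expectation stated in the description, not from the actual rule definition.

```python
from datetime import timedelta
from typing import Optional

# Assumed hold duration, based on the "after 6 hrs" expectation in this report.
HOLD = timedelta(hours=6)


def alert_state(condition_true_for: Optional[timedelta],
                hold: timedelta = HOLD) -> str:
    """Return the alert state for a condition that has held for the given time.

    None means the alert condition is not currently true.
    """
    if condition_true_for is None:
        return "inactive"
    if condition_true_for < hold:
        return "pending"
    return "firing"


print(alert_state(timedelta(hours=5, minutes=58)))
print(alert_state(timedelta(hours=6, minutes=2)))
```

Because rules are evaluated periodically, the switch from pending to firing may show up slightly after the 6 hour mark, which would be consistent with the alert firing at about 6 hrs 2 mins.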

Comment 6 Michael McCune 2021-06-29 14:39:45 UTC

Thanks for the follow-up, Milind. I am closing this as not a bug.