Bug 2019453 - Stale PDBs do not get reconciled triggering continuous PDB alerts
Summary: Stale PDBs do not get reconciled triggering continuous PDB alerts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.8.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: ---
Target Release: 4.8.4
Assignee: Antonio Cardace
QA Contact: zhe peng
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-11-02 15:15 UTC by Antonio Cardace
Modified: 2025-10-03 11:19 UTC (History)

Fixed In Version: virt-operator-container-v4.8.4-4 hco-bundle-registry-container-v4.8.4-11
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-20 17:21:21 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID                           Private  Priority  Status  Summary                                     Last Updated
Github                  kubevirt kubevirt pull 6723  0        None      open    pdb: reconcile all old pdbs                 2021-11-03 15:23:07 UTC
Github                  kubevirt kubevirt pull 6815  0        None      Merged  [release-0.41] pdb: reconcile all old pdbs  2021-11-30 17:30:21 UTC
Red Hat Issue Tracker   CNV-14734                    0        None      None    None                                        2023-01-20 10:34:56 UTC
Red Hat Product Errata  RHBA-2022:0213               0        None      None    None                                        2022-01-20 17:21:33 UTC

Description Antonio Cardace 2021-11-02 15:15:50 UTC
Description of problem:

When upgrading from CNV 4.8.0 to 4.8.2 with running VMIs that use 'EvictionStrategy: LiveMigrate', the associated PodDisruptionBudgets (PDBs) persist indefinitely: the VMI-PDB logic changed in 4.8.1, and PDBs created under the old logic are not reconciled. Until 4.8.1 the logic was to create a single PDB with 'MinAvailable: 2' at all times, so these stale PDBs can continuously trigger alerts about having fewer pods than the PDB expects to protect.

The quick workaround is to delete all PDBs associated with running VMIs so that virt-controller notices the deletions and re-creates the PDBs with the correct values. Note that '--all' removes every PDB in the namespace, not only those created for VMIs.

If the VMIs are in the default namespace, run:

kubectl delete pdb --all

Otherwise, run:

kubectl -n $NAMESPACE delete pdb --all
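
The workaround can also be narrowed to the stale PDBs only. The following is a minimal sketch, not a tested procedure: it assumes the VMI PDBs carry the 'kubevirt-disruption-budget-' name prefix seen in 'oc get pdb' output, and that 'MinAvailable: 2' identifies a PDB left over from the old logic. The function names are hypothetical.

```shell
# Hypothetical helper: print the names of KubeVirt PDBs in namespace $1
# that still carry the old value 'MinAvailable: 2'.
# Assumption: VMI PDB names start with 'kubevirt-disruption-budget-'.
list_stale_pdbs() {
  kubectl -n "$1" get pdb \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.minAvailable}{"\n"}{end}' |
    awk -F '\t' '$1 ~ /^kubevirt-disruption-budget-/ && $2 == "2" { print $1 }'
}

# Hypothetical helper: delete only those PDBs, one by one, so that
# virt-controller can re-create them.
delete_stale_pdbs() {
  list_stale_pdbs "$1" | while IFS= read -r name; do
    kubectl -n "$1" delete pdb "$name"
  done
}

# Usage (nothing is deleted until this is called explicitly):
#   list_stale_pdbs default
#   delete_stale_pdbs default
```

Running list_stale_pdbs first shows what would be removed. If a PDB with 'MinAvailable: 2' can be legitimate in a given cluster, this filter is too coarse and should not be used as is.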


Version-Release number of selected component (if applicable):


How reproducible:
Always


Steps to Reproduce:
1. Install CNV 4.8.0
2. Create a VMI with 'EvictionStrategy: LiveMigrate'
3. Upgrade to CNV 4.8.2

Actual results:
The VMI associated PDB has 'MinAvailable: 2'.


Expected results:
The VMI associated PDB should have 'MinAvailable: 1'.
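
The expected result can be checked from the command line. A minimal sketch, assuming the VMI PDBs use the 'kubevirt-disruption-budget-' name prefix and that 1 is the only expected value while the VMI is simply running; the function name is hypothetical:

```shell
# Hypothetical check: succeed only if every KubeVirt PDB in namespace $1
# has minAvailable equal to 1; print each offending PDB otherwise.
# Assumption: VMI PDB names start with 'kubevirt-disruption-budget-'.
check_pdb_min_available() {
  kubectl -n "$1" get pdb \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.minAvailable}{"\n"}{end}' |
    awk -F '\t' '
      $1 ~ /^kubevirt-disruption-budget-/ && $2 != "1" {
        print $1 ": MinAvailable " $2
        bad = 1
      }
      END { exit bad + 0 }
    '
}

# Usage:
#   check_pdb_min_available default && echo "all VMI PDBs have MinAvailable: 1"
```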


Additional info:

Comment 1 Antonio Cardace 2021-11-03 13:48:13 UTC
Posted fix at https://github.com/kubevirt/kubevirt/pull/6723.

Comment 4 zhe peng 2021-12-15 09:28:01 UTC
Verified with build:
hco-bundle-registry-container-v4.8.4-20

Steps:
1. Deploy a CNV 4.8.3 cluster, then create and run a RHEL 8 VM.
The VM has 'EvictionStrategy: LiveMigrate'.
2. Check the status
$ oc get csv -n openshift-cnv
NAME                                      DISPLAY                    VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v4.8.3   OpenShift Virtualization   4.8.3     kubevirt-hyperconverged-operator.v4.8.2   Succeeded

$ oc get pdb
NAME                               MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
kubevirt-disruption-budget-z7j9f   1               N/A               0                     8m32s

3. Upgrade CNV from 4.8.3 to 4.8.4-20
$ oc get ip -A
NAMESPACE                 NAME            CSV                                         APPROVAL    APPROVED
openshift-cnv             install-8tjvk   kubevirt-hyperconverged-operator.v4.8.3     Manual      true
openshift-cnv             install-g7vpx   kubevirt-hyperconverged-operator.v4.8.4     Manual      false
openshift-local-storage   install-tzbms   local-storage-operator.4.8.0-202111041632   Automatic   true
openshift-storage         install-zfqjz   ocs-operator.v4.8.6                         Automatic   true

$ oc get csv -n openshift-cnv
NAME                                      DISPLAY                    VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v4.8.4   OpenShift Virtualization   4.8.4     kubevirt-hyperconverged-operator.v4.8.3   Succeeded

4. Check the PDB
$ oc get pdb
NAME                               MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
kubevirt-disruption-budget-z7j9f   1               N/A               0                     110m
$ oc get vmi
NAME       AGE    PHASE     IP             NODENAME
vm-rhel8   110m   Running   10.129.2.159   virt03-9vn6t-worker-0-vcqwb

Comment 6 zhe peng 2021-12-15 09:33:25 UTC
Per comment 4, the VMI's PDB has 'MinAvailable: 1' after the CNV upgrade; moving to verified.

Comment 8 Fabian Deutsch 2022-01-10 13:39:44 UTC
@zpeng have we also checked that no alerts are firing anymore?

Comment 9 Fabian Deutsch 2022-01-10 13:50:15 UTC
I was thinking of rhbz#2026733, but this is a different bug

Comment 10 zhe peng 2022-01-12 11:27:16 UTC
Hi Fabian,

No, I didn't check that; I just followed the description of the bug.

Comment 15 errata-xmlrpc 2022-01-20 17:21:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 4.8.4 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0213

