Bug 1903820 - Performance Profile does not update status when MCP goes into degraded state
Summary: Performance Profile does not update status when MCP goes into degraded state
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Performance Addon Operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Martin Sivák
QA Contact: Gowrishankar Rajaiyan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-12-02 21:31 UTC by Denys Shchedrivyi
Modified: 2022-08-26 13:56 UTC
CC List: 4 users

Fixed In Version: performance-addon-operator-container-v4.7.0-18
Doc Type: Bug Fix
Doc Text:
Cause: The operator watched only the machine config pool owned by the performance profile, but no machine config pool is owned by the performance profile. Consequence: The performance profile status was not updated with the machine config pool state. Fix: Watch the machine config pools referred to by the performance profile's node selector or machine config pool selector. Result: The performance profile status now reflects the state of the selected machine config pools.
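The fix described above amounts to matching pools by selector instead of by ownership. A minimal sketch in Go of that selection logic, using hypothetical simplified types (the real operator uses the Kubernetes API types and controller-runtime watches, not these structs):

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real PerformanceProfile
// and MachineConfigPool API objects.
type PerformanceProfile struct {
	Name                      string
	MachineConfigPoolSelector map[string]string // spec.machineConfigPoolSelector
	NodeSelector              map[string]string // spec.nodeSelector
}

type MachineConfigPool struct {
	Name         string
	Labels       map[string]string // labels on the MCP object
	NodeSelector map[string]string // the MCP's own node selector
}

// matches reports whether every key/value pair in selector is present in
// labels, i.e. a plain equality-based label selector. The comma-ok lookup
// matters because node-role labels typically have empty string values.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		got, ok := labels[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

// poolsForProfile returns the MCPs the profile should watch: any pool whose
// labels match the profile's machineConfigPoolSelector, or whose node
// selector matches the profile's nodeSelector. This mirrors the fix
// (watch selected pools, not only owned ones) but is a sketch, not the
// operator's actual code.
func poolsForProfile(p PerformanceProfile, pools []MachineConfigPool) []MachineConfigPool {
	var out []MachineConfigPool
	for _, mcp := range pools {
		if (len(p.MachineConfigPoolSelector) > 0 && matches(p.MachineConfigPoolSelector, mcp.Labels)) ||
			(len(p.NodeSelector) > 0 && matches(p.NodeSelector, mcp.NodeSelector)) {
			out = append(out, mcp)
		}
	}
	return out
}

func main() {
	profile := PerformanceProfile{
		Name:         "manual",
		NodeSelector: map[string]string{"node-role.kubernetes.io/worker-cnf": ""},
	}
	pools := []MachineConfigPool{
		{Name: "worker", NodeSelector: map[string]string{"node-role.kubernetes.io/worker": ""}},
		{Name: "worker-cnf", NodeSelector: map[string]string{"node-role.kubernetes.io/worker-cnf": ""}},
	}
	for _, mcp := range poolsForProfile(profile, pools) {
		fmt.Println(mcp.Name) // only worker-cnf matches
	}
}
```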
Clone Of:
Environment:
Last Closed: 2022-08-26 13:56:59 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift-kni/performance-addon-operators/blob/master/functests/3_performance_status/status.go#L94 0 None None None 2021-07-02 10:40:13 UTC
Github openshift-kni performance-addon-operators pull 479 0 None closed Bug 1903820: watch machine config pools not owned by the performance profile 2021-02-19 21:15:11 UTC

Description Denys Shchedrivyi 2020-12-02 21:31:50 UTC
Description of problem:
 When the MCP is degraded, the performance profile does not catch it and does not show the error message:

# oc get mcp
NAME         CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker-cnf   rendered-worker-cnf-f3ee0715b65b7bdddb637e3bbf640ce3   False     True       True       1              0                   0                     1                      174m

# oc describe mcp worker-cnf
.
    Last Transition Time:  2020-12-02T21:13:37Z
    Message:               Node cnf-den-nngpc-worker-0-4qv7w is reporting: "can't reconcile config rendered-worker-cnf-f3ee0715b65b7bdddb637e3bbf640ce3 with rendered-worker-cnf-eea4ca47f8b99a156057f933321e2e25: ignition disks section contains changes: unreconcilable"
    Reason:                1 nodes are reporting degraded status on sync
    Status:                True
    Type:                  NodeDegraded
    Last Transition Time:  2020-12-02T21:13:37Z
    Message:               
    Reason:                
    Status:                True
    Type:                  Degraded


# oc describe node cnf-den-nngpc-worker-0-4qv7w
.
Annotations:
       .
       machineconfiguration.openshift.io/reason:
          can't reconcile config rendered-worker-cnf-f3ee0715b65b7bdddb637e3bbf640ce3 with rendered-worker-cnf-eea4ca47f8b99a156057f933321e2e25: ign...
       machineconfiguration.openshift.io/state: Unreconcilable



# oc describe performanceprofile manual
Status:
  Conditions:
.
    Status:                False
    Type:                  Degraded



Version-Release number of selected component (if applicable):
4.6, 4.7


Steps to Reproduce:
1. Put the MCP into a degraded state (for example, by creating an invalid MC)
2. Check MCP, node and performance profile
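A hypothetical MachineConfig for step 1: the machine-config operator cannot reconcile changes to the ignition disks section in place, so the node reports "unreconcilable" and the pool goes Degraded, matching the error quoted above (the name, role label, and device here are illustrative):

```yaml
# Illustrative only: applying a disks change to an existing node is
# unreconcilable, which degrades the worker-cnf pool.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-cnf-bad-disk
  labels:
    machineconfiguration.openshift.io/role: worker-cnf
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      disks:
        - device: /dev/vdb
          wipeTable: true
```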


Actual results:
 The MCP is in a degraded state and the node has the reason in its annotations, but the performance profile shows Degraded=False


Expected results:
 The performance profile reflects the correct MCP status

Comment 1 Denys Shchedrivyi 2020-12-02 22:42:53 UTC
After manually updating the profile (removing some unnecessary lines), the degraded message appeared:

Initially I have Degraded=False:
># oc get performanceprofile manual -o jsonpath={.status.conditions[3]}
>{"lastHeartbeatTime":"2020-12-02T22:10:19Z","lastTransitionTime":"2020-12-02T22:10:19Z","status":"False","type":"Degraded"}

editing the profile and removing some unnecessary stuff:
># oc edit performanceprofile
>performanceprofile.performance.openshift.io/manual edited

the Degraded message appeared in the profile:
># oc get performanceprofile manual -o jsonpath={.status.conditions[3]}
>{"lastHeartbeatTime":"2020-12-02T22:29:52Z","lastTransitionTime":"2020-12-02T22:29:52Z","message":"Machine config pool worker-cnf Degraded Reason: 1 nodes are reporting degraded status on sync.\nMachine config pool worker-cnf Degraded Message: Node cnf-den-nngpc-worker-0-4qv7w is reporting: \"can't reconcile config rendered-worker-cnf-f3ee0715b65b7bdddb637e3bbf640ce3 with rendered-worker-cnf-eea4ca47f8b99a156057f933321e2e25: ignition disks section contains changes: unreconcilable\".\n","reason":"MCPDegraded","status":"True","type":"Degraded"}

Comment 3 Denys Shchedrivyi 2021-01-07 20:49:46 UTC
 The status in the profile is updated, but for some reason the message is duplicated:

> # oc describe performanceprofile
>.
>    Message:               Machine config pool worker-cnf Degraded Reason: 1 nodes are reporting degraded status on sync.
>Machine config pool worker-cnf Degraded Message: Node ocp47rtfix-worker-0.demo.lab.den is reporting: "can't reconcile config rendered-worker-cnf-66805b1fc1b445e249355a7955da4e9b with rendered-worker-cnf-8102e3c868d464cb299e26b9e45c307a: ignition disks section contains changes: unreconcilable".
>Machine config pool worker-cnf Degraded Reason: 1 nodes are reporting degraded status on sync.
>Machine config pool worker-cnf Degraded Message: Node ocp47rtfix-worker-0.demo.lab.den is reporting: "can't reconcile config rendered-worker-cnf-66805b1fc1b445e249355a7955da4e9b with rendered-worker-cnf-8102e3c868d464cb299e26b9e45c307a: ignition disks section contains changes: unreconcilable".
>    Reason:       MCPDegraded
>    Status:       True
>    Type:         Degraded

Comment 4 Artyom 2021-01-10 11:21:48 UTC
Hm, now I remember why we had the duplicate-removal method for MCPs; I will create an additional PR.
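The duplication in comment 3 suggests the same pool's conditions are collected more than once (e.g. matched by both selectors). A minimal Go sketch of the kind of duplicate-removal mentioned here, with illustrative names rather than the operator's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// dedupMessages drops repeated condition messages while preserving the
// order of first appearance, so a pool matched by both the node selector
// and the machine config pool selector is reported only once.
func dedupMessages(msgs []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, m := range msgs {
		if !seen[m] {
			seen[m] = true
			out = append(out, m)
		}
	}
	return out
}

func main() {
	// The same Degraded reason collected twice, as in comment 3.
	msgs := []string{
		"Machine config pool worker-cnf Degraded Reason: 1 nodes are reporting degraded status on sync.",
		"Machine config pool worker-cnf Degraded Reason: 1 nodes are reporting degraded status on sync.",
	}
	fmt.Println(strings.Join(dedupMessages(msgs), "\n")) // printed once
}
```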

Comment 5 Denys Shchedrivyi 2021-01-22 19:42:44 UTC
Verified on performance-addon-operator-container-v4.7.0-24

