Bug 1732609 - openshift-machine-api nodelink-controller stuck in tight loop
Summary: openshift-machine-api nodelink-controller stuck in tight loop
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.4.0
Assignee: Alberto
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-23 21:17 UTC by Michael Gugino
Modified: 2020-02-21 09:39 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-21 09:39:08 UTC
Target Upstream Version:
Embargoed:



Description Michael Gugino 2019-07-23 21:17:42 UTC
Description of problem:

The nodelink controller never rests. Even with a very small number of nodes and machines, it constantly emits dozens of log lines every second, indefinitely.

We should figure out why it's burning up so much CPU time and possibly have it trigger actions on node/machine update events.

Comment 1 Alberto 2019-07-24 07:29:25 UTC
This is by design at the moment. The controller triggers actions on node/machine update events, which happen very frequently because of node heartbeat status updates. Once we make machines "fire and forget", we can have the nodelink controller stop watching nodes and watch only machines, which will reduce the reconciliation cadence.

Comment 2 Alberto 2020-02-21 09:39:08 UTC
I'm closing this for now as per https://bugzilla.redhat.com/show_bug.cgi?id=1732609#c1.
In the near future we'll hopefully drop the nodelink controller completely, move the linking logic into the machine controller, and probably slightly reduce its resync period.

