Bug 2058256
| Summary: | LeaseDuration for NFD Operator seems to be rather small, causing Operator restarts when running etcd defrag | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Simon Reber <sreber> | |
| Component: | Node Feature Discovery Operator | Assignee: | Carlos Eduardo Arango Gutierrez <carangog> | |
| Status: | CLOSED ERRATA | QA Contact: | Lena Horsley <lhorsley> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 4.8 | CC: | scuppett, sejug | |
| Target Milestone: | --- | |||
| Target Release: | 4.11.0 | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | | Doc Type: | If docs needed, set a value | |
| Doc Text: | | Story Points: | --- | |
| Clone Of: | ||||
| : | 2065148 | Environment: | ||
| Last Closed: | 2022-08-10 10:23:42 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2065148 | |||
|
Description
Simon Reber
2022-02-24 15:27:22 UTC
Hi all,

Just adding some details about Leader Election that may be useful.
+ https://sdk.operatorframework.io/docs/building-operators/golang/advanced-topics/#leader-election

Thanks and all the best,
Simon Reber

Would this change help
renewDeadline := 60 * time.Second // <--- here
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
    Scheme:                 scheme,
    MetricsBindAddress:     metricsAddr,
    Port:                   9443,
    HealthProbeBindAddress: probeAddr,
    LeaderElection:         enableLeaderElection,
    LeaderElectionID:       "39f5e5c3.nodefeaturediscoveries.nfd.kubernetes.io",
    Namespace:              watchNamespace,
    RenewDeadline:          &renewDeadline, // <--- here
})
?
(In reply to Carlos Eduardo Arango Gutierrez from comment #2)
> would this change help
>
> renewDeadline := 60 * time.Second // <--- here
> mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
>     Scheme: scheme,
>     MetricsBindAddress: metricsAddr,
>     Port: 9443,
>     HealthProbeBindAddress: probeAddr,
>     LeaderElection: enableLeaderElection,
>     LeaderElectionID: "39f5e5c3.nodefeaturediscoveries.nfd.kubernetes.io",
>     Namespace: watchNamespace,
>     RenewDeadline: &renewDeadline, // <--- here
> })
>
> ?

If that is for lease renewal, then I think this should improve the experience (please also evaluate whether this change has any impact on the Operator).

Also, do you have a request timeout set for requests to the OpenShift Container Platform 4 API? This might need some tweaking as well, since we may also see timeouts in this area.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.11.0 extras and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5070