Bug 2015772
| Field | Value |
|---|---|
| Summary | Replacing private key reconcile 2 Windows nodes in parallel |
| Product | OpenShift Container Platform |
| Component | Windows Containers |
| Version | 4.9 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Target Release | 4.10.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | gaoshang <sgao> |
| Assignee | Mansi Kulkarni <mankulka> |
| QA Contact | Ronnie Rasouli <rrasouli> |
| CC | aos-bugs, mankulka |
| Doc Type | No Doc Update |
| Type | Bug |
| Last Closed | 2022-03-28 09:36:28 UTC |
| Clones | 2017822 |
| Bug Blocks | 2017822 |
Marking VERIFIED for release-4.9 PR to merge, will revert back.

Since this bug has been verified on OCP 4.9 (Bug 2017822), marked this bug as VERIFIED, thanks.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Windows Container Support for Red Hat OpenShift 5.0.0 [security update]), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:0577
Description of problem:
With 2 Windows nodes created by one machineset, replacing the private key deletes both Windows nodes at the same time and then recreates them. It doesn't follow the maxUnhealthyCount rule, and this causes a service disruption.

$ oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator
...
{"level":"info","ts":1634569171.675713,"logger":"controller.windowsmachine","msg":"deleting machine","machine":"openshift-machine-api/winworker-hwgqc"}
{"level":"info","ts":1634569171.685051,"logger":"controller.secret","msg":"updating secret","secret":"openshift-windows-machine-config-operator/cloud-private-key","name":"windows-user-data"}
{"level":"info","ts":1634569171.7818367,"logger":"controller.windowsmachine","msg":"unhealthy machine count for machineset","name":"winworker","total":2,"unhealthy":0}
{"level":"info","ts":1634569171.7961621,"logger":"controller.windowsmachine","msg":"machine has been remediated by deletion","name":"winworker-hwgqc"}
{"level":"info","ts":1634569171.7965193,"logger":"controller.windowsmachine","msg":"deleting machine","machine":"openshift-machine-api/winworker-nch44"}
{"level":"info","ts":1634569171.8028684,"logger":"controller.windowsmachine","msg":"unhealthy machine count for machineset","name":"winworker","total":2,"unhealthy":0}
{"level":"info","ts":1634569171.8165264,"logger":"controller.windowsmachine","msg":"machine has been remediated by deletion","name":"winworker-nch44"}
{"level":"info","ts":1634569183.5262113,"logger":"metrics","msg":"Prometheus configured","endpoints":"windows-exporter","port":9182,"name":"metrics"}
{"level":"info","ts":1634569186.2801213,"logger":"metrics","msg":"Prometheus configured","endpoints":"windows-exporter","port":9182,"name":"metrics"}
{"level":"info","ts":1634569278.3233094,"logger":"controller.windowsmachine","msg":"processing","machine":"openshift-machine-api/winworker-r7nvc","address":"172.31.249.29"}
...
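The log excerpt can be checked mechanically. The following is a minimal sketch, not part of the original report, that parses four of the entries quoted above and reports which machines were deleted, what unhealthy count was logged each time, and how far apart the two deletions were. The `summarize` helper and the variable names are hypothetical; the log lines are copied verbatim from the excerpt.

```python
import json

# Four entries copied verbatim from the WMCO log in this report.
LOG_LINES = [
    '{"level":"info","ts":1634569171.675713,"logger":"controller.windowsmachine","msg":"deleting machine","machine":"openshift-machine-api/winworker-hwgqc"}',
    '{"level":"info","ts":1634569171.7818367,"logger":"controller.windowsmachine","msg":"unhealthy machine count for machineset","name":"winworker","total":2,"unhealthy":0}',
    '{"level":"info","ts":1634569171.7965193,"logger":"controller.windowsmachine","msg":"deleting machine","machine":"openshift-machine-api/winworker-nch44"}',
    '{"level":"info","ts":1634569171.8028684,"logger":"controller.windowsmachine","msg":"unhealthy machine count for machineset","name":"winworker","total":2,"unhealthy":0}',
]


def summarize(lines):
    """Return (deleted machines, logged unhealthy counts, seconds between deletions)."""
    entries = [json.loads(line) for line in lines]
    deletions = [e for e in entries if e["msg"] == "deleting machine"]
    counts = [
        e["unhealthy"]
        for e in entries
        if e["msg"] == "unhealthy machine count for machineset"
    ]
    gap = deletions[-1]["ts"] - deletions[0]["ts"]
    return [d["machine"] for d in deletions], counts, gap


machines, unhealthy_counts, gap_seconds = summarize(LOG_LINES)
print("deleted:", machines)
print("unhealthy counts logged:", unhealthy_counts)
print(f"time between deletions: {gap_seconds:.3f}s")
```

Both deletions fall within the same second, and the unhealthy count is logged as 0 both times, which is consistent with the second deletion not being blocked by the first.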
Version-Release number of selected component (if applicable):
OCP version: 4.9.0-0.nightly-2021-10-16-173626
WMCO version: 4.0.0+7991f6f0

How reproducible:
Always

Steps to Reproduce:
1. Scale up 2 Windows nodes by one machineset
2. Replace private key, e.g. change openshift-qe.pem to openshift-dev.pem
3. Check WMCO log

Actual results:
Both Windows nodes are deleted at the beginning and recreated

Expected results:
The 2 Windows nodes should be deleted one by one, following the maxUnhealthyCount rule

Additional info:
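For illustration only, the expected one-at-a-time behavior could be sketched as a gate that refuses a new deletion while the number of machines already being replaced has reached the limit. This is not WMCO's implementation: the `DeletionGate` class, its methods, and the limit of 1 are all assumptions made for this example. The key point is that the check and the bookkeeping happen under one lock, so two reconciles running in parallel cannot both observe a count of 0.

```python
import threading


class DeletionGate:
    """Hypothetical gate: permits a deletion only while fewer than
    max_unhealthy machines are already being replaced."""

    def __init__(self, max_unhealthy):
        self.max_unhealthy = max_unhealthy
        self._in_flight = set()
        self._lock = threading.Lock()

    def try_delete(self, machine):
        """Return True if the caller may delete the machine now."""
        with self._lock:
            if machine in self._in_flight:
                return True  # this machine is already being replaced
            if len(self._in_flight) >= self.max_unhealthy:
                return False  # limit reached, caller should retry later
            self._in_flight.add(machine)
            return True

    def mark_replaced(self, machine):
        """Record that the replacement for a deleted machine is ready."""
        with self._lock:
            self._in_flight.discard(machine)


# Assumed limit of 1 for this example.
gate = DeletionGate(max_unhealthy=1)
first = gate.try_delete("winworker-hwgqc")    # allowed
second = gate.try_delete("winworker-nch44")   # refused while the first is in flight
gate.mark_replaced("winworker-hwgqc")
third = gate.try_delete("winworker-nch44")    # allowed after the first is replaced
print(first, second, third)
```

With such a gate, the second node would only be deleted after the replacement for the first one is ready.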