Bug 1991739
Summary: | WMCO ignores the `Deleting` phase notification event | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | jvaldes |
Component: | Windows Containers | Assignee: | jvaldes |
Status: | CLOSED ERRATA | QA Contact: | gaoshang <sgao> |
Severity: | medium | Priority: | medium |
Version: | 4.9 | Target Release: | 4.9.0 |
Hardware: | Unspecified | OS: | Unspecified |
Type: | Bug | Last Closed: | 2021-10-28 17:41:17 UTC |
CC: | aos-bugs, aravindh, rrasouli, team-winc | Bug Blocks: | 1995341 |
Description
jvaldes
2021-08-09 20:48:42 UTC
@jvaldes, the steps to reproduce seem to indicate that this happens consistently. However, you say this behavior is only seen occasionally. How can this be reproduced consistently?

Indeed, this behavior happens occasionally. I ran several tests using 1 to 3 replicas in the machineSet. Sometimes WMCO does not update the `windows-exporter` endpoint to remove the addresses of the deleted machines.

> The steps to reproduce seem to indicate that this happens consistently

Open to suggestions to make that clear.
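
One way to check for the symptom is to compare the addresses listed in the `windows-exporter` endpoint against the internal IPs of the Windows nodes that still exist. The Python sketch below is illustrative only and is not part of WMCO: the function name `stale_endpoint_addresses` and the sample IPs are hypothetical, and both inputs are assumed to have been copied by hand from `oc describe endpoints` and `oc get node -owide`.

```python
def stale_endpoint_addresses(endpoint_addresses, node_internal_ips):
    """Return endpoint addresses that do not match any remaining node IP."""
    live = set(node_internal_ips)
    # Preserve the order in which the endpoint lists its addresses.
    return [addr for addr in endpoint_addresses if addr not in live]


# Hypothetical sample data: three addresses in the endpoint, two nodes left.
endpoint_addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
node_internal_ips = ["10.0.0.2", "10.0.0.3"]

print(stale_endpoint_addresses(endpoint_addresses, node_internal_ips))
# prints ['10.0.0.1']
```

A non-empty result means the endpoint still lists an address for a machine that is no longer present, which is the behavior reported here.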
Marking as VERIFIED to allow the release-4.7/4.8 PRs to merge. Will move this back to ON_QA once that PR merges.

Setting status back to ON_QA.

oc get node -l kubernetes.io/os=windows -owide
NAME              STATUS   ROLES    AGE    VERSION                            INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                  KERNEL-VERSION   CONTAINER-RUNTIME
winworker-g8fgw   Ready    worker   113m   v1.22.0-rc.0.1611+9b1230e88478e6   172.31.249.177   172.31.249.177   Windows Server Standard   10.0.19041.508   docker://20.10.5
winworker-pm4kf   Ready    worker   107m   v1.22.0-rc.0.1611+9b1230e88478e6   172.31.249.147   172.31.249.147   Windows Server Standard   10.0.19041.508   docker://20.10.5

[cloud-user@PSI-VM ~/windows-machine-config-operator]> oc describe endpoints -n openshift-windows-machine-config-operator
Name:         windows-exporter
Namespace:    openshift-windows-machine-config-operator
Labels:       name=windows-exporter
Annotations:  <none>
Subsets:
  Addresses:          172.31.249.177,172.31.249.147
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    metrics  9182  TCP
Events:  <none>

"version": "3.1.0+0a3a937"

Bug tested mistakenly on 4.9 + 3.1.0.

Testing on OCP 4.9, WMCO 4.0.0+2f0b49a2.
The endpoint 10.0.128.191 has been deleted, yet it still exists:

oc describe endpoints -n openshift-windows-machine-config-operator
Name:         windows-exporter
Namespace:    openshift-windows-machine-config-operator
Labels:       name=windows-exporter
Annotations:  <none>
Subsets:
  Addresses:          10.0.128.191,10.0.140.91,10.0.143.88
  NotReadyAddresses:  <none>

@rrasouli Can you share the WMCO logs associated with the failed test?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Windows Container Support for Red Hat OpenShift 4.0.0 product release), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3702