Bug 1810531

Summary:	Machine controller incorrectly printed waiting seconds during node drain
Product:	OpenShift Container Platform	Reporter:	Alexander Demicev <ademicev>
Component:	Cloud Compute	Assignee:	Alexander Demicev <ademicev>
Cloud Compute sub component:	Other Providers	QA Contact:	Milind Yadav <miyadav>
Status:	CLOSED CURRENTRELEASE	Docs Contact:
Severity:	low
Priority:	low	CC:	agarcial, jhou, mgugino
Version:	4.4
Target Milestone:	---
Target Release:	4.4.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1804635	Environment:
Last Closed:	2020-08-21 12:23:24 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1804635
Bug Blocks:

Comment 4 Jianwei Hou 2020-03-09 12:56:43 UTC

This is not fixed.

1 controller.go:350] "qe-jho-cs52j-w-a-xcv8h": Node "qe-jho-cs52j-w-a-xcv8h.c.openshift-qe.internal" is unreachable, draining will wait 'Ĭ' seconds after pod is signalled for deletion and skip after it

Steps:
Run `oc get deployment machine-api-operator -n openshift-machine-api -o yaml`; the image is quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4ea2b3f652e50f2eb433df3f9532e0dce3f3161c22a734c1ef01dcdce846f829

Run `oc image info quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4ea2b3f652e50f2eb433df3f9532e0dce3f3161c22a734c1ef01dcdce846f829`; the source git commit is SOURCE_GIT_COMMIT=e7a0c09ad611468b93b07e54d9984e6ad3575a61

Run `git show e7a0c09ad611468b93b07e54d9984e6ad3575a61`:

commit e7a0c09ad611468b93b07e54d9984e6ad3575a61 (origin/release-4.4)
Merge: 89c8787e e5dda6a8
Author: OpenShift Merge Robot <openshift-merge-robot.github.com>
Date:   Fri Mar 6 16:52:01 2020 +0100

    Merge pull request #504 from openshift-cherrypick-robot/cherry-pick-495-to-release-4.4

    [release-4.4] Bug 1810531: Fix timeout formatting

According to the above, the image already contains the fix, but this is still reproducible.
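
For reference, the quoted 'Ĭ' in the log line is consistent with a Go integer being formatted with the `%q` verb: `fmt` renders an integer under `%q` as a single-quoted character literal, and code point 300 (U+012C) is `Ĭ`. The sketch below assumes that is the cause. The function names and the shortened message are illustrative and are not the controller's actual code.

```go
package main

import "fmt"

// brokenMessage mimics the suspected bug: %q treats the integer as a
// Unicode code point and prints it as a quoted character literal.
// (Hypothetical helper, not taken from the controller source.)
func brokenMessage(timeoutSeconds int) string {
	return fmt.Sprintf("draining will wait %q seconds", timeoutSeconds)
}

// fixedMessage uses %d, which prints the integer in decimal.
// (Hypothetical helper, not taken from the controller source.)
func fixedMessage(timeoutSeconds int) string {
	return fmt.Sprintf("draining will wait %d seconds", timeoutSeconds)
}

func main() {
	fmt.Println(brokenMessage(300)) // draining will wait 'Ĭ' seconds
	fmt.Println(fixedMessage(300))  // draining will wait 300 seconds
}
```

If this assumption holds, any provider binary built from a copy of the controller that predates the formatting fix would keep printing 'Ĭ', whatever the machine-api-operator image contains.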

Comment 6 Michael Gugino 2020-05-19 00:29:31 UTC

This code is ready to merge but is waiting on cherry-pick approval and/or the merge window opening for 4.4.

Comment 10 sunzhaohua 2020-06-29 03:04:30 UTC

Verified on Azure.
clusterversion: 4.4.0-0.nightly-2020-06-27-171816
I0629 03:03:22.709679       1 controller.go:361] "zhsun62944azure-sthpx-worker-eastus22-29xk7": Node "zhsun62944azure-sthpx-worker-eastus22-29xk7" is unreachable, draining will wait 300 seconds after pod is signalled for deletion and skip after it

Comment 11 sunzhaohua 2020-06-29 03:07:32 UTC

This is still not fixed on AWS.

I0629 01:32:53.808462       1 controller.go:361] "zhsun62944-fvwvq-worker-us-east-2c-5nx59": Node "ip-10-0-200-252.us-east-2.compute.internal" is unreachable, draining will wait 'Ĭ' seconds after pod is signalled for deletion and skip after it

Comment 15 sunzhaohua 2020-07-10 06:36:35 UTC

Verified on AWS and OSP.

This is still not fixed on GCP.
https://github.com/openshift/cluster-api-provider-gcp/blob/release-4.4/vendor/github.com/openshift/machine-api-operator/pkg/controller/machine/controller.go

I0710 06:20:40.383450       1 controller.go:350] "zhsun710gcp44-qlfr9-worker-c-mrjkl": Node "zhsun710gcp44-qlfr9-worker-c-mrjkl.c.openshift-qe.internal" is unreachable, draining will wait 'Ĭ' seconds after pod is signalled for deletion and skip after it
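
Since the result differs per provider, a small check like the following could separate fixed from unfixed log lines during verification. This is a minimal sketch: `hasNumericWait` and both patterns are hypothetical helpers written for illustration, and they assume the message wording shown in the log lines above.

```go
package main

import (
	"fmt"
	"regexp"
)

// waitPattern captures the token printed between "draining will wait" and
// "seconds" in the drain log message.
var waitPattern = regexp.MustCompile(`draining will wait (\S+) seconds`)

// decimalPattern matches a plain decimal number such as 300.
var decimalPattern = regexp.MustCompile(`^[0-9]+$`)

// hasNumericWait reports whether the line contains the drain message with a
// decimal wait value. (Hypothetical helper for illustration.)
func hasNumericWait(line string) bool {
	m := waitPattern.FindStringSubmatch(line)
	return m != nil && decimalPattern.MatchString(m[1])
}

func main() {
	fixed := `is unreachable, draining will wait 300 seconds after pod is signalled for deletion and skip after it`
	unfixed := `is unreachable, draining will wait 'Ĭ' seconds after pod is signalled for deletion and skip after it`

	fmt.Println(hasNumericWait(fixed))   // true
	fmt.Println(hasNumericWait(unfixed)) // false
}
```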

Comment 18 Joel Speed 2020-08-20 11:57:29 UTC

This is a low-priority, cosmetic change; we are waiting on the patch manager to decide whether to include it. There is nothing we can do right now. If it's not merged by the end of next sprint, I suggest we close the issue.

Comment 19 Joel Speed 2020-08-21 12:23:24 UTC

This isn't worth backporting at this point, since it is mostly a cosmetic change.