Bug 1573122 - [3.7] OCP on Azure - if the kubelet can't reach the Azure API - marked as NotReady
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.7.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.7.z
Assigned To: Seth Jennings
QA Contact: DeShuai Ma
Depends On: 1554748
Blocks:
Reported: 2018-04-30 04:51 EDT by Paul Dwyer
Modified: 2018-05-17 23:55 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Fixes an issue where a Node can stop reporting status if the connection to the Azure API is terminated uncleanly, resulting in a long timeout before the connection is re-established and blocking the status update loop.
Story Points: ---
Clone Of: 1554748
Environment:
Last Closed: 2018-05-17 23:54:46 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker                  ID               Priority   Status   Summary   Last Updated
Red Hat Product Errata   RHBA-2018:1576   None       None     None      2018-05-17 23:55 EDT

Comment 19 DeShuai Ma 2018-05-10 04:48:54 EDT
Verified on v3.7.46

[root@dma37-master-etcd-nfs-1 ~]# oc version
oc v3.7.46
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dma37-master-etcd-nfs-1:8443
openshift v3.7.46
kubernetes v1.7.6+a08f5eeb62

1. Watch the node status from the master
[root@dma37-master-etcd-nfs-1 ~]# oc get no dma37-node-registry-router-1 -w
NAME                           STATUS    AGE       VERSION
dma37-node-registry-router-1   Ready     19m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     19m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     19m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     20m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     20m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     20m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     21m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     21m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     21m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     22m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     22m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     22m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     23m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     23m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     23m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     24m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     24m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     24m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     25m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     25m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     25m       v1.7.6+a08f5eeb62
dma37-node-registry-router-1   Ready     26m       v1.7.6+a08f5eeb62


2. On the node, block the connection to the Azure API, then watch the node log
[root@dma37-node-registry-router-1 ~]# iptables -A OUTPUT -d management.azure.com -j DROP
[root@dma37-node-registry-router-1 ~]# journalctl -f -u atomic-openshift-node.service |grep "Timeout after"
May 10 08:32:54 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:32:54.204527   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:33:14 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:33:14.225115   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:33:34 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:33:34.249522   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:33:54 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:33:54.274028   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:34:14 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:34:14.293589   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:34:34 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:34:34.318474   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:34:54 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:34:54.341656   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:35:26 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:35:26.402908   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:36:06 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:36:06.424409   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:36:46 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:36:46.454712   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:37:26 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:37:26.486660   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:38:06 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:38:06.552450   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
May 10 08:38:46 dma37-node-registry-router-1 atomic-openshift-node[29230]: W0510 08:38:46.572796   29230 kubelet_node_status.go:1007] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
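
The repeated "Timeout after 10s" warnings, together with the node staying Ready, are consistent with the cloud provider lookup being bounded by a deadline so that the status loop keeps running. The following is only a minimal Python sketch of that general pattern, not the actual kubelet change (the kubelet is written in Go). The names call_with_timeout, update_node_status and CloudTimeout are hypothetical and exist only for this illustration.

```python
import concurrent.futures
import time


class CloudTimeout(Exception):
    """Raised when the (hypothetical) cloud call exceeds its deadline."""


def call_with_timeout(fn, timeout_s, executor):
    """Run fn on a worker thread and stop waiting after timeout_s seconds.

    The worker is not cancelled. It keeps running in the background until
    fn returns, but the caller is no longer blocked on it.
    """
    future = executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise CloudTimeout(f"Timeout after {timeout_s}s") from None


def update_node_status(get_addresses, timeout_s, executor):
    """One iteration of a hypothetical status loop.

    A slow or hung address lookup produces a warning instead of
    stalling the whole update.
    """
    status = {"ready": True, "addresses": None, "warning": None}
    try:
        status["addresses"] = call_with_timeout(get_addresses, timeout_s, executor)
    except CloudTimeout as exc:
        status["warning"] = f"failed to get node address from cloud provider: {exc}"
    return status


if __name__ == "__main__":
    def slow_lookup():
        # Stands in for a cloud API call over a connection that was dropped.
        time.sleep(0.5)
        return ["10.0.0.4"]

    executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    # The lookup takes longer than the deadline, so a warning is recorded.
    print(update_node_status(slow_lookup, 0.1, executor))
    # A fast lookup returns its addresses normally.
    print(update_node_status(lambda: ["10.0.0.4"], 0.1, executor))
    executor.shutdown(wait=False)
```

In this sketch the status dictionary is still returned when the lookup times out, which mirrors the behaviour verified above: the warning is logged on each attempt while the node continues to report Ready.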
Comment 22 errata-xmlrpc 2018-05-17 23:54:46 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1576
