Bug 1554748 - [3.6] OCP on Azure - if the kubelet can't reach the Azure API - marked as NotReady
Summary: [3.6] OCP on Azure - if the kubelet can't reach the Azure API - marked as NotReady
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.6.z
Assignee: Jan Chaloupka
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks: 1573122
 
Reported: 2018-03-13 09:52 UTC by Vladislav Walek
Modified: 2018-05-17 07:59 UTC
CC: 21 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1573122 (view as bug list)
Environment:
Last Closed: 2018-05-17 07:58:35 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1579 0 None None None 2018-05-17 07:59:26 UTC

Description Vladislav Walek 2018-03-13 09:52:23 UTC
Description of problem:

When the kubelet cannot reach the Azure API, it starts reporting the node status as NotReady and eventually stops reporting status altogether.
The idea is not to depend on the Azure API being reachable, since the instance data do not change.
The node should not fail the first time it cannot reach the Azure API.
The kubelet checks the instance data every 10 seconds, so the Azure API is called every 10 seconds.

The kubelet could keep working with the current instance data even when the Azure API is not reachable. It could, for example, retry 10 times (100 seconds) and only then mark the node as NotReady with a clear error message that the Azure API is unreachable; see the sketch below.
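For illustration only, a minimal Go sketch of the proposed behavior, assuming a hypothetical wrapper around the kubelet's cloud-provider address lookup; the type name tolerantAddressGetter and the 10-attempt threshold are illustrative assumptions, not the actual kubelet code:

package main

import (
	"errors"
	"fmt"
)

// addressGetter stands in for the cloud-provider call the kubelet makes
// every 10s (conceptually the cloud provider's NodeAddresses lookup).
type addressGetter func(nodeName string) ([]string, error)

// tolerantAddressGetter is a hypothetical wrapper: it keeps the last
// successfully fetched addresses and only reports an error after
// maxFailures consecutive failed calls to the cloud API.
type tolerantAddressGetter struct {
	inner       addressGetter
	maxFailures int
	lastGood    []string
	failures    int
}

func (t *tolerantAddressGetter) NodeAddresses(nodeName string) ([]string, error) {
	addrs, err := t.inner(nodeName)
	if err == nil {
		t.failures = 0
		t.lastGood = addrs
		return addrs, nil
	}
	t.failures++
	if t.lastGood != nil && t.failures < t.maxFailures {
		// The Azure API is unreachable, but the instance data has not
		// changed: keep using the cached addresses so the node stays Ready.
		return t.lastGood, nil
	}
	// Only after maxFailures consecutive errors (e.g. 10 * 10s = 100s)
	// surface a clear error so the node can be marked NotReady.
	return nil, fmt.Errorf("Azure API unreachable for %d consecutive checks: %w", t.failures, err)
}

func main() {
	flaky := func(string) ([]string, error) { return nil, errors.New("Timeout after 10s") }
	g := &tolerantAddressGetter{inner: flaky, maxFailures: 10, lastGood: []string{"10.0.0.4"}}
	for i := 0; i < 12; i++ {
		addrs, err := g.NodeAddresses("dma-node-registry-router-1")
		fmt.Println(addrs, err)
	}
}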

Version-Release number of selected component (if applicable):
OpenShift Container Platform 3.6, 3.7

How reproducible:
- create an Azure environment
- block the Azure API with a local iptables rule
- check how quickly the node becomes NotReady

Steps to Reproduce:
1.
2.
3.

Actual results:
This is currently causing an issue for a customer. For some reason the Azure API becomes unreachable, and as a result the node turns NotReady and pod eviction is triggered. This is happening on all nodes.

Expected results:
the nodes rely on the previously fetched data and keep reporting status Ready for a specified time frame (configurable in the node-config)

Additional info:

Comment 8 Vittorio 2018-03-21 22:56:12 UTC
Hi Jan/Vlad/Dan,
The Azure SDK Go code embedded in OpenShift is apparently quite old and unable to handle network issues such as TCP resets.

There is a newer version that performs retries.

Old code – the customer is using a very old version (7.0.1-beta) – line 208 shows no retries:

https://github.com/Azure/go-autorest/blob/d7c034a8af24eda120dd6460bfcd6d9ed14e43ca/autorest/sender.go#L208

New code – retries implemented (line 224); a rough sketch of the retry idea follows after the link:
https://github.com/Azure/go-autorest/blob/master/autorest/sender.go#L212
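
For context only, here is a rough Go sketch of what such a retrying sender looks like conceptually. This is not the actual go-autorest code; the retrySender type, the attempt count, and the backoff are illustrative assumptions.

package main

import (
	"fmt"
	"net/http"
	"time"
)

// sender matches the shape of an HTTP sender: anything with a
// Do(*http.Request) method, which is what a sender decorator wraps.
type sender interface {
	Do(*http.Request) (*http.Response, error)
}

// retrySender is an illustrative decorator: instead of giving up on the
// first network error (e.g. a TCP reset), it retries the request a fixed
// number of times with a backoff between attempts.
type retrySender struct {
	inner    sender
	attempts int
	backoff  time.Duration
}

func (r retrySender) Do(req *http.Request) (*http.Response, error) {
	var resp *http.Response
	var err error
	for i := 0; i < r.attempts; i++ {
		resp, err = r.inner.Do(req)
		if err == nil {
			return resp, nil
		}
		time.Sleep(r.backoff)
	}
	return resp, fmt.Errorf("request failed after %d attempts: %w", r.attempts, err)
}

func main() {
	// Wrap the default HTTP client with 3 attempts and a 2s backoff.
	s := retrySender{inner: http.DefaultClient, attempts: 3, backoff: 2 * time.Second}
	req, err := http.NewRequest(http.MethodGet, "https://management.azure.com/", nil)
	if err != nil {
		panic(err)
	}
	if _, err := s.Do(req); err != nil {
		fmt.Println(err)
	}
}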

Swapping out this code is probably not so easy. Are your engineers able to roll this up into OpenShift 3.6.x?

Thanks
Vittorio

Comment 22 DeShuai Ma 2018-04-26 09:58:10 UTC
Verified; the package has fixed the issue.
[root@dma-master-nfs-1 azure]# oc version
oc v3.6.173.0.96
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dma-master-nfs-1:8443
openshift v3.6.173.0.96
kubernetes v1.6.1+5115d708d7


Steps to verify:
1. Replace the package in the cluster.
2. In one terminal, watch the node status.
3. On the node instance, run "iptables -A OUTPUT -d management.azure.com -j DROP" to block access to the Azure API.
4. On the node, watch the node log.

Result: the node stays in Ready status. In the node log we can see the "Timeout after 10s" error for the Azure API request every 20s:
// On the master, watch the node status
[root@dma-master-nfs-1 azure]# oc get no dma-node-registry-router-1 -w
NAME                         STATUS    AGE       VERSION
dma-node-registry-router-1   Ready     16m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     16m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     16m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     17m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     17m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     17m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     18m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     18m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     18m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     19m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     19m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     19m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     20m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     20m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     20m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     21m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     21m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     21m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     22m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     22m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     22m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     23m       v1.6.1+5115d708d7

// On the node, watch the node log
[root@dma-node-registry-router-1 azure]# iptables -A OUTPUT -d management.azure.com -j DROP
[root@dma-node-registry-router-1 azure]# journalctl -f -u atomic-openshift-node.service |grep "Timeout after"
Apr 26 09:41:27 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:41:27.451027   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 26 09:41:47 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:41:47.461652   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 26 09:42:07 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:42:07.472563   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 26 09:42:27 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:42:27.485010   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 26 09:42:47 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:42:47.495664   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 26 09:43:07 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:43:07.506392   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 26 09:43:27 dma-node-registry-router-1 atomic-openshift-node[51348]: W0426 09:43:27.517453   51348 kubelet_node_status.go:965] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s

Comment 23 DeShuai Ma 2018-04-26 13:08:46 UTC
Could you help move this to ON_QA? I'll then move it to VERIFIED.

Comment 30 DeShuai Ma 2018-04-27 14:46:45 UTC
Tested the package; it has fixed the bug. The node stays in Ready status even after blocking access to the Azure API.

1. On the master, watch the node status
[root@dma-master-nfs-1 ~]# oc version
oc v3.6.112
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dma-master-nfs-1:8443
openshift v3.6.112
kubernetes v1.6.1+5115d708d7
[root@dma-master-nfs-1 ~]# oc get node
NAME                         STATUS    AGE       VERSION
dma-master-nfs-1             Ready     1m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     1m        v1.6.1+5115d708d7
[root@dma-master-nfs-1 ~]# oc get node dma-node-registry-router-1 -w
NAME                         STATUS    AGE       VERSION
dma-node-registry-router-1   Ready     1m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     3m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     3m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     3m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     4m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     4m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     4m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     5m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     5m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     5m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     6m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     6m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     6m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     7m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     7m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     7m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     8m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     8m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     8m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     9m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     9m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     9m        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     10m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     10m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     10m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     11m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     11m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     11m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     12m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     12m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     12m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     13m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     13m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     13m       v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     14m       v1.6.1+5115d708d7

2. On the node, block Azure API access at the same time, then watch the node log
[root@dma-node-registry-router-1 ~]# journalctl -f -u atomic-openshift-node.service |grep "Timeout after"
Apr 27 13:43:02 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:43:02.638975  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:43:22 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:43:22.650781  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:43:42 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:43:42.683550  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:44:02 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:44:02.764281  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:44:22 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:44:22.775688  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:44:42 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:44:42.787923  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:45:02 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:45:02.799603  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:45:38 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:45:38.034957  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:46:18 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:46:18.045537  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:46:58 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:46:58.056462  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:47:38 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:47:38.067782  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:48:18 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:48:18.080742  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:48:58 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:48:58.094538  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:49:38 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:49:38.113957  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:50:18 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:50:18.126280  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:50:58 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:50:58.137784  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:51:38 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:51:38.150437  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:52:18 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:52:18.162189  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:52:58 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:52:58.184645  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:53:38 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:53:38.199168  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 13:54:18 dma-node-registry-router-1 atomic-openshift-node[113502]: W0427 13:54:18.209722  113502 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s

Comment 32 Mike Fiedler 2018-04-27 22:34:31 UTC
Re-tested using the procedure in comment 22 with the 3.7.173.0.112 build produced by @jchaloup.

The result is a pass. The node stayed Ready after access to management.azure.com was blocked.

[root@dma-master-nfs-1 ~]# oc get nodes -w | grep registry                                               
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7                                       
dma-node-registry-router-1   Ready     8h        v1.6.1+5115d708d7


[root@dma-node-registry-router-1 ~]#  iptables -A OUTPUT -d management.azure.com -j DROP    


Apr 27 22:16:51 dma-node-registry-router-1 atomic-openshift-node[1504]: I0427 22:16:51.708854    1504 kubelet_node_status.go:77] Attempting to register node dma-node-registry-router-1
Apr 27 22:16:51 dma-node-registry-router-1 atomic-openshift-node[1504]: I0427 22:16:51.772339    1504 kubelet_node_status.go:128] Node dma-node-registry-router-1 was previously registered
Apr 27 22:16:51 dma-node-registry-router-1 atomic-openshift-node[1504]: I0427 22:16:51.772371    1504 kubelet_node_status.go:80] Successfully registered node dma-node-registry-router-1
Apr 27 22:19:49 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:19:49.888750    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:20:09 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:20:09.902404    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:20:29 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:20:29.913912    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:20:49 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:20:49.928487    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:21:09 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:21:09.939919    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:21:29 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:21:29.951422    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:21:49 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:21:49.962785    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:22:25 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:22:25.277106    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:23:05 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:23:05.287969    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 27 22:23:45 dma-node-registry-router-1 atomic-openshift-node[1504]: W0427 22:23:45.301481    1504 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s


rpm -q atomic-openshift-node atomic-openshift                       
atomic-openshift-node-3.6.173.0.112-1.git.5.327be21.el7.x86_64                                           
atomic-openshift-3.6.173.0.112-1.git.5.327be21.el7.x86_64

Comment 33 Mike Fiedler 2018-04-27 22:39:12 UTC
No issues with router/registry readiness were seen. https://bugzilla.redhat.com/show_bug.cgi?id=1572699 was not encountered.

Comment 34 DeShuai Ma 2018-04-27 22:58:15 UTC
The version in comment 32 is a typo; it should be 3.6.173.0.112, not 3.7.173.0.112.
atomic-openshift-3.6.173.0.112-1.git.5.327be21.el7.x86_64

[root@dma-master-nfs-1 ~]# oc version
oc v3.6.173.0.112
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dma-master-nfs-1:8443
openshift v3.6.173.0.112
kubernetes v1.6.1+5115d708d7

Comment 37 Mike Fiedler 2018-04-30 13:42:27 UTC
Re-tested using the procedure in comment 22 with the 3.6.173.0.113 build provided by @jchaloup.

The result is a pass. The node stayed Ready after access to management.azure.com was blocked.

dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7    
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7              
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7    
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7                          
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7    
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7              
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7    
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7              
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7    
dma-master-nfs-1   Ready     2d        v1.6.1+5115d708d7              
dma-node-registry-router-1   Ready     2d        v1.6.1+5115d708d7

[root@dma-node-registry-router-1 package-3.6.173.0.113]# iptables -A OUTPUT -d management.azure.com -j DROP

8g") pod "dded6757-4a71-11e8-9ac4-000d3a948cd7" (UID: "dded6757-4a71-11e8-9ac4-000d3a948cd7").
Apr 30 13:30:57 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:30:57.569100   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:31:17 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:31:17.583085   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:31:37 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:31:37.595278   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:31:40 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:31:40.079146   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/c7ab0b91-4a73-11e8-9ac4-000d3a948cd7-default-token-60d5h" (spec.Name: "default-token-60d5h") pod "c7ab0b91-4a73-11e8-9ac4-000d3a948cd7" (UID: "c7ab0b91-4a73-11e8-9ac4-000d3a948cd7").
Apr 30 13:31:46 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:31:46.102034   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/57295593-4a86-11e8-9ac4-000d3a948cd7-router-token-jtk8f" (spec.Name: "router-token-jtk8f") pod "57295593-4a86-11e8-9ac4-000d3a948cd7" (UID: "57295593-4a86-11e8-9ac4-000d3a948cd7").
Apr 30 13:31:46 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:31:46.103367   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/57295593-4a86-11e8-9ac4-000d3a948cd7-server-certificate" (spec.Name: "server-certificate") pod "57295593-4a86-11e8-9ac4-000d3a948cd7" (UID: "57295593-4a86-11e8-9ac4-000d3a948cd7").
Apr 30 13:31:59 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:31:59.138713   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/dded6757-4a71-11e8-9ac4-000d3a948cd7-registry-token-h3n8g" (spec.Name: "registry-token-h3n8g") pod "dded6757-4a71-11e8-9ac4-000d3a948cd7" (UID: "dded6757-4a71-11e8-9ac4-000d3a948cd7").
Apr 30 13:32:07 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:32:07.618674   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:32:37 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:32:37.643502   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:32:57 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:32:57.123088   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/c7ab0b91-4a73-11e8-9ac4-000d3a948cd7-default-token-60d5h" (spec.Name: "default-token-60d5h") pod "c7ab0b91-4a73-11e8-9ac4-000d3a948cd7" (UID: "c7ab0b91-4a73-11e8-9ac4-000d3a948cd7").
Apr 30 13:33:01 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:33:01.131782   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/57295593-4a86-11e8-9ac4-000d3a948cd7-server-certificate" (spec.Name: "server-certificate") pod "57295593-4a86-11e8-9ac4-000d3a948cd7" (UID: "57295593-4a86-11e8-9ac4-000d3a948cd7").
Apr 30 13:33:01 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:33:01.132314   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/57295593-4a86-11e8-9ac4-000d3a948cd7-router-token-jtk8f" (spec.Name: "router-token-jtk8f") pod "57295593-4a86-11e8-9ac4-000d3a948cd7" (UID: "57295593-4a86-11e8-9ac4-000d3a948cd7").
Apr 30 13:33:07 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:33:07.667159   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:33:29 dma-node-registry-router-1 atomic-openshift-node[50519]: I0430 13:33:29.047330   50519 operation_generator.go:614] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/dded6757-4a71-11e8-9ac4-000d3a948cd7-registry-token-h3n8g" (spec.Name: "registry-token-h3n8g") pod "dded6757-4a71-11e8-9ac4-000d3a948cd7" (UID: "dded6757-4a71-11e8-9ac4-000d3a948cd7").
Apr 30 13:33:37 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:33:37.690546   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Apr 30 13:34:07 dma-node-registry-router-1 atomic-openshift-node[50519]: W0430 13:34:07.713974   50519 kubelet_node_status.go:963] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s

[root@dma-node-registry-router-1 package-3.6.173.0.113]# rpm -q atomic-openshift-node atomic-openshift                                       
atomic-openshift-node-3.6.173.0.113-1.git.5.fd65ec6.el7.x86_64        
atomic-openshift-3.6.173.0.113-1.git.5.fd65ec6.el7.x86_64

Comment 40 DeShuai Ma 2018-05-09 09:42:33 UTC
Verified on 3.6.173.0.117-1; for detailed steps, see comment 37.

Comment 44 errata-xmlrpc 2018-05-17 07:58:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1579

