Bug 1632863

Summary: [3.10] upgrade failed at TASK [openshift_node : Approve node certificates when bootstrapping]
Product: OpenShift Container Platform Reporter: Brenton Leanhardt <bleanhar>
Component: Cluster Version Operator Assignee: Scott Dodson <sdodson>
Status: CLOSED ERRATA QA Contact: liujia <jiajliu>
Severity: high Docs Contact:
Priority: high    
Version: 3.10.0 CC: aos-bugs, jokerman, mgugino, mmccomas, wmeng, wsun
Target Milestone: ---   
Target Release: 3.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
In certain situations, the upgrade process introduced a DNS outage that prevented the upgrade from completing as expected. dnsmasq is now restarted while the node is drained to prevent the issue.
Story Points: ---
Clone Of: 1629700 Environment:
Last Closed: 2018-11-11 16:39:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1629700    
Bug Blocks:    

Comment 1 Brenton Leanhardt 2018-09-25 17:45:22 UTC
https://github.com/openshift/openshift-ansible/pull/10116

Comment 3 liujia 2018-10-15 07:19:30 UTC
Version:
openshift-ansible-3.10.53-1.git.0.ba2c2ec.el7.noarch

Verified that PR #10116 is merged.

Steps:
1. Install OCP v3.10.45 (latest released version) on RHEL hosts with openshift-ansible v3.10.47 (latest released version).
[root@ip-172-18-0-246 ~]# oc version
oc v3.10.45
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-0-246.ec2.internal:8443
openshift v3.10.45
kubernetes v1.10.0+b81c8f8

2. Enable the pre-release (v3.10.51) repo on all hosts (the latest v3.10.53 image was not available), then upgrade the above cluster to the latest v3.10.z with the latest pre-release openshift-ansible (v3.10.53).

3. The upgrade succeeds.
[root@ip-172-18-0-246 ~]# oc version
oc v3.10.51
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-0-246.ec2.internal:8443
openshift v3.10.51
kubernetes v1.10.0+b81c8f8

[root@ip-172-18-0-246 ~]# oc get csr
NAME                                                   AGE       REQUESTOR                                                 CONDITION
csr-2nvp7                                              29m       system:node:ip-172-18-0-246.ec2.internal                  Approved,Issued
csr-5wfpq                                              10m       system:node:ip-172-18-15-123.ec2.internal                 Approved,Issued
csr-kcbmr                                              12m       system:node:ip-172-18-15-214.ec2.internal                 Approved,Issued
node-csr-7N51_MNOZ5eqcKMW1u33OFENk4pN6lrMl1ugA2586-0   12m       system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
node-csr-LZ98zvBFCS_eJ3rRqRErxDccgWOGiCjpVazCt4vgyPs   29m       system:admin                                              Approved,Issued
node-csr-WEYra_nGctW4vCFu6PQMaf7Du0jNFltCxCtqrCM6MF0   10m       system:serviceaccount:openshift-infra:node-bootstrapper   Approved,Issued
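
The CSR listing above was checked by eye. As a rough sketch only (not part of the fix or of openshift-ansible), the same check could be scripted by parsing the text output of `oc get csr` and reporting any request whose CONDITION column is not `Approved,Issued`. The function name `pending_csrs` and the `csr-example` row are hypothetical, added for illustration; the parsing assumes the four-column layout shown above.

```python
from typing import List


def pending_csrs(oc_output: str) -> List[str]:
    """Return names of CSRs whose CONDITION column is not 'Approved,Issued'.

    Assumes the whitespace-separated table printed by `oc get csr`,
    with NAME as the first column and CONDITION as the last.
    """
    pending = []
    rows = [ln for ln in oc_output.splitlines() if ln.strip()]
    for row in rows[1:]:  # skip the header row
        fields = row.split()
        name, condition = fields[0], fields[-1]
        if condition != "Approved,Issued":
            pending.append(name)
    return pending


# One row copied from the listing above, plus a hypothetical pending row.
sample = """\
NAME          AGE       REQUESTOR                                   CONDITION
csr-2nvp7     29m       system:node:ip-172-18-0-246.ec2.internal    Approved,Issued
csr-example   1m        system:node:example.internal                Pending
"""

print(pending_csrs(sample))  # ['csr-example']
```

An empty list would correspond to the state shown in the listing above, where every request is Approved,Issued.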

Comment 5 errata-xmlrpc 2018-11-11 16:39:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2709