Bug 1591957
| Summary: | node upgrade failed - node service unit stuck in "activating" state | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
| Component: | Cluster Version Operator | Assignee: | Scott Dodson <sdodson> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Gaoyun Pei <gpei> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.10.0 | CC: | aos-bugs, jokerman, mmccomas |
| Target Milestone: | --- | | |
| Target Release: | 3.10.z | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-08-02 20:10:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Created attachment 1452065: ansible, pod and system logs; inventory.

Description of problem:

1. 3.9.27 HA cluster: 1 lb/3 master/etcd co-located/2 infra/5 nodes
2. Successfully ran upgrade_control_plane.yml. Master/etcd pods healthy and all nodes still Ready at 3.9 level
3. Ran upgrade_nodes.yml
   - 1 node successfully upgraded to 3.10
   - 3 nodes Ready at 3.9 level
   - 1 node NotReady and not schedulable at 3.9

systemd unit for node service:

    ● atomic-openshift-node.service - OpenShift Node
       Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
       Active: activating (start) since Fri 2018-06-15 20:29:26 UTC; 2min 24s ago

Last messages in the node log:

    Jun 15 20:29:27 ip-172-31-24-3.us-west-2.compute.internal atomic-openshift-node[126627]: I0615 20:29:27.259195 126627 server.go:739] cloud provider determined current node name to be ip-172-31-24-3...e.internal
    Jun 15 20:29:27 ip-172-31-24-3.us-west-2.compute.internal atomic-openshift-node[126627]: I0615 20:29:27.259289 126627 bootstrap.go:53] Using bootstrap kubeconfig to generate TLS client cert, key an...onfig file
    Jun 15 20:29:27 ip-172-31-24-3.us-west-2.compute.internal atomic-openshift-node[126627]: I0615 20:29:27.260999 126627 bootstrap.go:79] No valid private key found for bootstrapping, creating a new one

Gathered the following (see attachment):

- ansible -vvv log
- journal from the failed node
- /etc/origin tar from the failed node
- node logs for master api and controllers
- inventory

Version-Release number of the following components:

    # ansible --version
    ansible 2.4.3.0
      config file = /etc/ansible/ansible.cfg
      configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
      ansible python module location = /usr/lib/python2.7/site-packages/ansible
      executable location = /usr/bin/ansible
      python version = 2.7.5 (default, Apr 19 2018, 05:40:55) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

How reproducible: Unknown

Steps to Reproduce:

1. See above
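
For triage across several nodes, the stuck state reported above can be detected mechanically from captured `systemctl status` text. The following is a minimal sketch, not part of OpenShift or openshift-ansible: the helper names `parse_active_state` and `is_stuck_activating` are hypothetical, and the sketch assumes the status output has already been collected as a string (for example, from the gathered journal or an ad hoc command).

```python
import re

# Matches the "Active:" line of `systemctl status` output, e.g.
#   Active: activating (start) since Fri 2018-06-15 20:29:26 UTC; 2min 24s ago
# Group 1 is the active state, group 2 the optional sub-state in parentheses.
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)(?:\s+\(([^)]*)\))?", re.MULTILINE)


def parse_active_state(status_text):
    """Return (state, sub_state) from status text, or None if no Active: line."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_stuck_activating(status_text):
    """True when the unit reports the 'activating' state (hypothetical helper)."""
    state = parse_active_state(status_text)
    return state is not None and state[0] == "activating"


# Status text copied from the description of this bug.
SAMPLE = """\
● atomic-openshift-node.service - OpenShift Node
   Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
   Active: activating (start) since Fri 2018-06-15 20:29:26 UTC; 2min 24s ago
"""

if __name__ == "__main__":
    print(parse_active_state(SAMPLE))   # ('activating', 'start')
    print(is_stuck_activating(SAMPLE))  # True
```

Note that this only reports the state at the moment of capture. Whether a unit is actually stuck, as opposed to still starting, would need a repeated check over time, which this sketch does not attempt.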