Bug 1730736
| Summary: | [3.10] Atomic Host - Upgrade failed at Task: Wait for node to be ready | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Vikas Laad <vlaad> |
| Component: | Node | Assignee: | Seth Jennings <sjenning> |
| Status: | CLOSED WONTFIX | QA Contact: | Sunil Choudhary <schoudha> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.10.0 | CC: | aos-bugs, dustymabe, jcallen, jialiu, jokerman, mmccomas, padillon, rkrawitz, sdodson, wmeng, wsun |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 3.10.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1720978 | Environment: | |
| Last Closed: | 2019-08-28 17:08:14 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1720978 | | |
| Bug Blocks: | 1508040 | | |
| Attachments: | | | |

Comment 1
Russell Teague
2019-07-17 13:54:21 UTC
Fixed the ASB issue. Now I see the issue reported. Investigating.

QE, there's suspicion that this may be related to a bug in the container runtime, does this problem exist in the latest versions of Atomic Host?

(In reply to Scott Dodson from comment #16)
> QE, there's suspicion that this may be related to a bug in the container
> runtime, does this problem exist in the latest versions of Atomic Host?

@wmeng, please help check this once more.

The latest Atomic Host hits this issue too.

    openshift-ansible-3.10.165-1.git.0.5ef95e3.el7
    Red Hat Enterprise Linux Atomic Host 7.7.0
    Linux 3.10.0-1062.el7.x86_64
    docker-1.13.1-103.git7f2769b.el7.x86_64

When the upgrade failed:

    # oc get nodes
    NAME                                 STATUS                        ROLES     AGE   VERSION
    wmengug4ah770-master-etcd-zone1-1    Ready                         master    15h   v1.10.0+b81c8f8
    wmengug4ah770-master-etcd-zone2-1    Ready                         master    15h   v1.10.0+b81c8f8
    wmengug4ah770-master-etcd-zone2-2    Ready                         master    15h   v1.10.0+b81c8f8
    wmengug4ah770-node-zone1-primary-1   Ready                         compute   15h   v1.9.1+a0ce1bc657
    wmengug4ah770-node-zone2-primary-1   Ready                         compute   15h   v1.9.1+a0ce1bc657
    wmengug4ah770-nrriz-1                NotReady,SchedulingDisabled   infra     15h   v1.10.0+b81c8f8
    wmengug4ah770-nrriz-2                Ready                         <none>    15h   v1.9.1+a0ce1bc657

upgrade log: https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Run-Ansible-Playbooks-Nextge/470/consoleFull

Created attachment 1608562
Node NotReady
atomic-openshift-node logs
ec2-23-20-104-227.compute-1.amazonaws.com
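
The `oc get nodes` listing in this report can be checked mechanically when triaging a stalled upgrade. The following is a minimal sketch, assuming the default five-column text output (NAME, STATUS, ROLES, AGE, VERSION); the function name `not_ready_nodes` and the `SAMPLE_OUTPUT` constant are hypothetical and only for illustration. It reports every node whose STATUS is anything other than plain `Ready`.

```python
from typing import List, Tuple

# Sample copied from the `oc get nodes` output quoted in this report.
SAMPLE_OUTPUT = """\
NAME STATUS ROLES AGE VERSION
wmengug4ah770-master-etcd-zone1-1 Ready master 15h v1.10.0+b81c8f8
wmengug4ah770-master-etcd-zone2-1 Ready master 15h v1.10.0+b81c8f8
wmengug4ah770-master-etcd-zone2-2 Ready master 15h v1.10.0+b81c8f8
wmengug4ah770-node-zone1-primary-1 Ready compute 15h v1.9.1+a0ce1bc657
wmengug4ah770-node-zone2-primary-1 Ready compute 15h v1.9.1+a0ce1bc657
wmengug4ah770-nrriz-1 NotReady,SchedulingDisabled infra 15h v1.10.0+b81c8f8
wmengug4ah770-nrriz-2 Ready <none> 15h v1.9.1+a0ce1bc657
"""


def not_ready_nodes(oc_output: str) -> List[Tuple[str, str, str]]:
    """Return (name, status, version) for nodes whose STATUS is not exactly 'Ready'.

    Assumes whitespace-separated columns with NAME first, STATUS second,
    and VERSION last, as in the listing above.
    """
    flagged = []
    for line in oc_output.strip().splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "NAME":
            continue
        name, status, version = fields[0], fields[1], fields[-1]
        if status != "Ready":
            flagged.append((name, status, version))
    return flagged


if __name__ == "__main__":
    for name, status, version in not_ready_nodes(SAMPLE_OUTPUT):
        print(f"{name}: {status} ({version})")
```

Comparing against the exact string `Ready` also flags cordoned nodes (for example `Ready,SchedulingDisabled`), which seems a reasonable choice during an upgrade but is a design assumption of this sketch.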
    Aug 26 21:52:42 ip-172-18-10-19.ec2.internal atomic-openshift-node[432]: I0826 21:52:42.130753 444 container_manager_linux.go:266] Creating device plugin manager: true
    Aug 26 21:52:42 ip-172-18-10-19.ec2.internal atomic-openshift-node[432]: I0826 21:52:42.130766 444 manager.go:102] Creating Device Plugin manager at /var/lib/kubelet/device-plugins/kubelet.sock

    [root@ip-172-18-10-19 ~]# ls -alh /var/lib/kubelet/device-plugins/
    total 0
    drwxr-xr-x. 2 root root  6 Aug 26 21:52 .
    drwxr-x---. 3 root root 28 Aug 26 21:52 ..

This was root caused to be the same as Bug 1508040. The suggested workaround is to reboot the affected node and restart the upgrade.
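
The log excerpt above shows the node announcing the device plugin socket at `/var/lib/kubelet/device-plugins/kubelet.sock`, while the directory listing taken at the same timestamp shows the directory empty. A minimal sketch of how that condition could be checked on a node follows. The function name `socket_present` is hypothetical, and the check only reports whether a UNIX socket exists at the path; it does not diagnose the underlying cause tracked in Bug 1508040.

```python
import os
import stat

# Socket path taken from the atomic-openshift-node log line in this report.
DEVICE_PLUGIN_SOCKET = "/var/lib/kubelet/device-plugins/kubelet.sock"


def socket_present(path: str) -> bool:
    """Return True if `path` exists and is a UNIX domain socket."""
    try:
        mode = os.stat(path).st_mode
    except (FileNotFoundError, NotADirectoryError):
        # The socket, or one of its parent directories, does not exist.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    if socket_present(DEVICE_PLUGIN_SOCKET):
        print(f"{DEVICE_PLUGIN_SOCKET} is present")
    else:
        print(f"{DEVICE_PLUGIN_SOCKET} is missing")
```

Run as root on the affected node, since `/var/lib/kubelet` is not world-readable in the listing above; a permission error is deliberately left unhandled so that it is not mistaken for a missing socket.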