Bug 1583152 - [3.6]upgrade failed at TASK [Drain Node for Kubelet upgrade]
Summary: [3.6]upgrade failed at TASK [Drain Node for Kubelet upgrade]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.6.z
Assignee: Vadim Rutkovsky
QA Contact: Weihua Meng
URL:
Whiteboard:
Depends On: 1583151
Blocks:
 
Reported: 2018-05-28 10:34 UTC by Weihua Meng
Modified: 2018-06-28 07:55 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1583151
Environment:
Last Closed: 2018-06-28 07:54:52 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:2007  0        None      None    None     2018-06-28 07:55:17 UTC

Description Weihua Meng 2018-05-28 10:34:46 UTC
Upgrading to OCP 3.6 hits the same issue.

+++ This bug was initially created as a clone of Bug #1583151 +++

Description of problem:
[3.7]upgrade failed at TASK [Drain Node for Kubelet upgrade]
The error is caused by https://github.com/openshift/openshift-ansible/blob/release-3.6/playbooks/common/openshift-cluster/upgrades/upgrade_control_plane.yml#L386

It was introduced by
https://github.com/openshift/openshift-ansible/commit/ba74ec43b9e4d743f62b89ad4f316a45e7fd09c9#diff-6e1f944c172a66b8294fa8cc2b081a97

Version-Release number of the following components:
openshift-ansible-3.7.51-1.git.0.f9b681c.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Upgrade OCP 3.6 to 3.7.

Actual results:
TASK [Drain Node for Kubelet upgrade] ******************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/upgrade_control_plane.yml:408
FAILED - RETRYING: Drain Node for Kubelet upgrade (1 retries left).
fatal: [host-8-246-90.host.centralci.eng.rdu2.redhat.com -> host-8-246-90.host.centralci.eng.rdu2.redhat.com]: FAILED! => {"attempts": 1, "changed": true, "cmd": ["oadm", "adm", "drain", "host-8-246-90.host.centralci.eng.rdu2.redhat.com", "--config=/etc/origin/master/admin.kubeconfig", "--force", "--delete-local-data", "--ignore-daemonsets", "--timeout=0s"], "delta": "0:00:00.241473", "end": "2018-05-28 05:12:14.862622", "failed": true, "failed_when_result": true, "rc": 1, "start": "2018-05-28 05:12:14.621149", "stderr": "Error: unknown command \"adm\" for \"oadm\"\nRun 'oadm --help' for usage.", "stderr_lines": ["Error: unknown command \"adm\" for \"oadm\"", "Run 'oadm --help' for usage."], "stdout": "", "stdout_lines": []}

Expected results:
Upgrade succeeds.

Comment 1 Vadim Rutkovsky 2018-05-28 11:42:56 UTC
Created https://github.com/openshift/openshift-ansible/pull/8548

Comment 2 Vadim Rutkovsky 2018-05-30 14:21:14 UTC
Fix is available in openshift-ansible-3.6.173.0.122-1

Comment 3 Weihua Meng 2018-05-31 06:22:11 UTC
The PR is not in openshift-ansible-3.6.173.0.122-1.git.0.4b56b4f.el7.noarch. The task there still calls "adm drain":

  - name: Drain Node for Kubelet upgrade
    command: >
      {{ hostvars[groups.oo_first_master.0].openshift.common.admin_binary }} adm drain {{ openshift.node.nodename | lower }}

Waiting for next 3.6 build.
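
For reference, the failure in the description comes from the task expanding to "oadm adm drain ...": on 3.6 admin_binary already resolves to oadm, so the extra "adm" is rejected as an unknown command. Below is a minimal sketch of what a corrected task might look like, assuming the fix simply drops the extra "adm". The actual change in the PR is not shown in this report; the flags are copied from the logged command in the description, and the delegation and retry settings of the real task are omitted.

```yaml
# Sketch only: not the actual diff from the PR.
# Assumption: admin_binary resolves to `oadm`, which is already the admin
# entry point, so `drain` follows it directly with no `adm` subcommand.
- name: Drain Node for Kubelet upgrade
  command: >
    {{ hostvars[groups.oo_first_master.0].openshift.common.admin_binary }} drain {{ openshift.node.nodename | lower }}
    --config=/etc/origin/master/admin.kubeconfig
    --force --delete-local-data --ignore-daemonsets --timeout=0s
```

With admin_binary set to oadm this would render as "oadm drain <node> ...", rather than the failing "oadm adm drain <node> ...".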

Comment 4 Weihua Meng 2018-06-03 22:14:31 UTC
Fixed in openshift-ansible-3.6.173.0.123-1.git.0.115034e.el7.noarch.

TASK [Drain Node for Kubelet upgrade] ******************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/upgrade_control_plane.yml:384

changed: [qe-wmengah35haetcd-master-etcd-zone1-1.0603-xji.qe.rhcloud.com -> qe-wmengah35haetcd-master-etcd-zone1-1.0603-xji.qe.rhcloud.com] => {
    "attempts": 1,
    "changed": true,
    "cmd": [
        "/usr/local/bin/oadm",
        "drain",
        "qe-wmengah35haetcd-master-etcd-zone1-1",
        "--config=/etc/origin/master/admin.kubeconfig",
        "--force",
        "--delete-local-data",
        "--ignore-daemonsets",
        "--timeout=0s"
    ],
    "delta": "0:00:00.323220",
    "end": "2018-06-03 15:23:54.009903",
    "failed": false,
    "failed_when_result": false,
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/oadm drain qe-wmengah35haetcd-master-etcd-zone1-1 --config=/etc/origin/master/admin.kubeconfig --force --delete-local-data --ignore-daemonsets --timeout=0s",

Kernel Version: 3.10.0-862.2.3.el7.x86_64
Operating System: Red Hat Enterprise Linux Atomic Host 7.5.1

Comment 6 errata-xmlrpc 2018-06-28 07:54:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2007

