Bug 1725002

Summary: [3.9] Install metrics failed at TASK [Mark node unschedulable] when cri-o enabled
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Installer
Assignee: Russell Teague <rteague>
Installer sub component: openshift-ansible
QA Contact: Junqi Zhao <juzhao>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: aos-bugs, gpei, jmartisk, jokerman, mmccomas, vrutkovs, wmeng
Version: 3.9.0
Keywords: Regression
Target Milestone: ---
Target Release: 3.9.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The metrics update that added a node drain and kubelet restart used the incorrect fact variable.
Consequence: The playbook fails on l_kubelet_node_name.
Fix: Replace l_init_fact_hosts with "oo_nodes_to_config".
Result: l_kubelet_node_name exists for all nodes to be configured.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-07-05 06:58:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description      Flags
inventory file   none
ansible logs     none

Description Junqi Zhao 2019-06-28 08:54:06 UTC
Created attachment 1585499
inventory file

Description of problem:
Installed 3.9 metrics on a 3.9 environment with CRI-O enabled:
cri-o://1.9.17-dev

The installation failed at:
TASK [Mark node unschedulable] *************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/openshift-node/private/restart.yml:11
fatal: [ci-vm-10-0-149-170.hosted.upshift.rdu2.redhat.com]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'openshift' is undefined\n\nThe error appears to have been in '/usr/share/ansible/openshift-ansible/playbooks/openshift-node/private/restart.yml': line 11, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n  tasks:\n  - name: Mark node unschedulable\n    ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'openshift' is undefined"}
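
When triaging a long playbook log, it can help to pull the host and the undefined variable name out of a failure line like the one above. The following is a minimal sketch, not part of openshift-ansible; it assumes the "fatal: [host]: FAILED! => {...}" line format shown in this report, and the function name parse_undefined_variable is hypothetical.

```python
import json
import re

# Matches the "fatal: [host]: FAILED! => {...}" line format shown in the report.
FATAL_RE = re.compile(
    r"fatal: \[(?P<host>[^\]]+)\]: FAILED! => (?P<payload>\{.*\})\s*$"
)
# Matches the "'name' is undefined" phrase inside the error message.
UNDEFINED_RE = re.compile(r"'(?P<var>[^']+)' is undefined")


def parse_undefined_variable(line):
    """Return (host, variable) for an undefined-variable failure, else None.

    Hypothetical helper for log triage; it only understands the single-line
    JSON payload format seen in this bug report.
    """
    match = FATAL_RE.search(line)
    if match is None:
        return None
    try:
        payload = json.loads(match.group("payload"))
    except json.JSONDecodeError:
        return None
    found = UNDEFINED_RE.search(payload.get("msg", ""))
    if found is None:
        return None
    return match.group("host"), found.group("var")
```

Applied to the failure line above, the helper is expected to report the failing host together with the variable name openshift.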

The installation itself is not actually blocked, but in a CRI-O environment the network diagram on the console UI is empty, so the following workaround is still needed on every node:

# systemctl restart atomic-openshift-node.service
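
Because the workaround has to be repeated on every node, it could be scripted. The sketch below is only an illustration under stated assumptions: it assumes SSH access to each node as a user allowed to restart the service, the node names passed in are placeholders, and the helper names restart_commands and restart_all are hypothetical. By default it only prints the commands it would run.

```python
import subprocess

# Service name taken from the workaround command in this report.
SERVICE = "atomic-openshift-node.service"


def restart_commands(nodes):
    """Build one ssh command per node; nothing is executed here."""
    return [["ssh", node, "systemctl", "restart", SERVICE] for node in nodes]


def restart_all(nodes, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for command in restart_commands(nodes):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if the restart fails.
            subprocess.run(command, check=True)
```

Calling restart_all with the list of node hostnames from the inventory prints the planned commands; passing dry_run=False executes them one node at a time and stops at the first failure.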


Version-Release number of selected component (if applicable):
# rpm -qa | grep openshift-ansible
openshift-ansible-roles-3.9.85-1.git.0.7c950b1.el7.noarch
openshift-ansible-playbooks-3.9.85-1.git.0.7c950b1.el7.noarch
openshift-ansible-3.9.85-1.git.0.7c950b1.el7.noarch
openshift-ansible-docs-3.9.85-1.git.0.7c950b1.el7.noarch


How reproducible:
Always

Steps to Reproduce:
1. Deploy 3.9 metrics with CRI-O enabled.

Actual results:
The installation fails at TASK [Mark node unschedulable] when CRI-O is enabled.

Expected results:
The installation should complete without errors.


Additional info:
The inventory file is attached.

Comment 3 Vadim Rutkovsky 2019-06-28 12:46:04 UTC
Please provide the Ansible playbook log.

Comment 5 Vadim Rutkovsky 2019-06-28 13:37:32 UTC
(In reply to Weihua Meng from comment #4)
> Related 
> https://bugzilla.redhat.com/show_bug.cgi?id=1720466
> https://github.com/openshift/openshift-ansible/pull/11699

Ah, thanks. Cherry-picked to 3.9 in https://github.com/openshift/openshift-ansible/pull/11730

Comment 7 Junqi Zhao 2019-07-01 01:00:59 UTC
The issue is fixed; the installation completes without error.
# rpm -qa | grep openshift-ansible
openshift-ansible-3.9.86-1.git.0.84cc606.el7.noarch
openshift-ansible-roles-3.9.86-1.git.0.84cc606.el7.noarch
openshift-ansible-docs-3.9.86-1.git.0.84cc606.el7.noarch
openshift-ansible-playbooks-3.9.86-1.git.0.84cc606.el7.noarch
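
To check whether a host already has a fixed build, the `rpm -qa` output can be compared against the verified version. This is a minimal sketch under one assumption: that 3.9.86 is the first fixed build, which this report suggests (3.9.85 fails, 3.9.86 passes) but does not state outright. The helper name has_fix is hypothetical, and the comparison is only meaningful within the 3.9.z stream.

```python
import re

# Assumption: 3.9.86 is the first fixed build, based on the verification comment.
FIRST_FIXED = (3, 9, 86)

# Matches names such as openshift-ansible-3.9.86-1... and
# openshift-ansible-roles-3.9.85-1..., capturing the three version numbers.
PACKAGE_RE = re.compile(r"^openshift-ansible(?:-[a-z]+)?-(\d+)\.(\d+)\.(\d+)-")


def has_fix(package):
    """Return True if an `rpm -qa` package name is at or above FIRST_FIXED."""
    match = PACKAGE_RE.match(package.strip())
    if match is None:
        raise ValueError(f"not an openshift-ansible package: {package!r}")
    version = tuple(int(part) for part in match.groups())
    return version >= FIRST_FIXED
```

Run against each line of the `rpm -qa | grep openshift-ansible` output, the helper flags any package that is still on a build older than the verified one.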

Comment 8 Junqi Zhao 2019-07-01 01:01:35 UTC
Created attachment 1586127
ansible logs

Comment 10 errata-xmlrpc 2019-07-05 06:58:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1642