Bug 1475131

Summary: The "Dump logs from node service if it failed" task has the wrong command
Product: OpenShift Container Platform
Reporter: Wenkai Shi <weshi>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED ERRATA
QA Contact: Wenkai Shi <weshi>
Severity: low
Priority: low
Version: 3.6.0
CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-10 05:32:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Wenkai Shi 2017-07-26 06:45:00 UTC
Description of problem:
During installation, the atomic-openshift-node service fails to start. The installer then tries to dump logs from the node service, but that task also fails because its journalctl command is wrong.

Version-Release number of the following components:
openshift-ansible-3.6.170.0-1.git.0.be93f81.el7

How reproducible:
100%

Steps to Reproduce:
1. Install OCP
2. Make the atomic-openshift-node service fail to start during installation.

Actual results:
# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
...
TASK [openshift_node : Start and enable node] **********************************
Wednesday 26 July 2017  06:00:17 +0000 (0:00:00.554)       0:08:58.032 ******** 

FAILED - RETRYING: TASK: openshift_node : Start and enable node (1 retries left).

fatal: [master.example.com]: FAILED! => {
    "attempts": 1, 
    "changed": false, 
    "failed": true
}

MSG:

Unable to start service atomic-openshift-node: Job for atomic-openshift-node.service failed because the control process exited with error code. See "systemctl status atomic-openshift-node.service" and "journalctl -xe" for details.


...ignoring

TASK [openshift_node : Dump logs from node service if it failed] ***************
Wednesday 26 July 2017  06:00:58 +0000 (0:00:40.897)       0:09:38.930 ******** 

fatal: [master.example.com]: FAILED! => {
    "changed": true, 
    "cmd": [
        "journalctl", 
        "--no-pager", 
        "-n", 
        "100", 
        "atomic-openshift-node"
    ], 
    "delta": "0:00:00.003558", 
    "end": "2017-07-26 06:00:57.723560", 
    "failed": true, 
    "rc": 1, 
    "start": "2017-07-26 06:00:57.720002", 
    "warnings": []
}

STDERR:

Failed to add match 'atomic-openshift-node': Invalid argument
Failed to add filters: Invalid argument
...

Expected results:
The "Dump logs from node service if it failed" tasks succeed.

Additional info:
# cat ./roles/openshift_node/tasks/main.yml
...
- name: Dump logs from node service if it failed
  command: journalctl --no-pager -n 100 {{ openshift.common.service_type }}-node
  when: node_start_result | failed
...
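For comparison, a sketch of the same task with the missing `-u` flag added, so journalctl filters by unit name instead of treating the argument as a match expression (task name and variables taken from the snippet above):

```yaml
- name: Dump logs from node service if it failed
  # -u tells journalctl to show entries for the named systemd unit
  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-node
  when: node_start_result | failed
```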

Wrong: 
# journalctl --no-pager -n 100 atomic-openshift-node
Failed to add match 'atomic-openshift-node': Invalid argument
Failed to add filters: Invalid argument

Correct: 
# journalctl --no-pager -n 100 -u atomic-openshift-node
...
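The error occurs because journalctl interprets a bare positional argument as either a FIELD=VALUE match or a file path; `atomic-openshift-node` is neither, hence "Failed to add match". As a sketch, an equivalent explicit-match form of the corrected command (assuming the unit is named atomic-openshift-node.service) would be:

```
# Filter by unit via an explicit journal field match instead of -u
journalctl --no-pager -n 100 _SYSTEMD_UNIT=atomic-openshift-node.service
```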

Comment 3 Wenkai Shi 2017-07-28 06:12:35 UTC
Verified with version openshift-ansible-3.6.172.0.0-1.git.0.d90ca2b.el7, it works well now.

# grep -nir -A2 "Dump logs from" /usr/share/ansible/openshift-ansible/
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml:203:- name: Dump logs from master service if it failed
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-204-  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-master
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-205-  when: start_result | failed
--
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml:240:- name: Dump logs from master-api if it failed
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-241-  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-master-api
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-242-  when: start_result | failed
--
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml:263:- name: Dump logs from master-api if it failed
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-264-  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-master-api
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-265-  when: start_result | failed
--
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml:303:- name: Dump logs from master-controllers if it failed
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-304-  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-master-controllers
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-305-  when: start_result | failed
--
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml:323:- name: Dump logs from master-controllers if it failed
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-324-  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-master-controllers
/usr/share/ansible/openshift-ansible/roles/openshift_master/tasks/main.yml-325-  when: start_result | failed
--
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/main.yml:232:- name: Dump logs from node service if it failed
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/main.yml-233-  command: journalctl --no-pager -n 100 -u {{ openshift.common.service_type }}-node
/usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/main.yml-234-  when: node_start_result | failed

Comment 5 errata-xmlrpc 2017-08-10 05:32:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716