Bug 1531960

Summary: Fail to update EFK: 'namespace'
Product: OpenShift Container Platform
Component: Logging
Version: 3.5.0
Reporter: Bruno Andrade <bandrade>
Assignee: Luke Meyer <lmeyer>
QA Contact: Anping Li <anli>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
CC: aos-bugs, knakayam, rmeggins, smunilla
Target Milestone: ---
Target Release: 3.9.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Clones: 1559669, 1559670, 1559671
Bug Blocks: 1559669, 1559670, 1559671
Last Closed: 2018-06-18 18:09:19 UTC
Type: Bug

Attachments: inventory

Description Bruno Andrade 2018-01-06 21:52:09 UTC
Created attachment 1377971 [details]
inventory

Description of problem:

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml -vvv | tee -a ansible-$(date +%Y%m%d-%H%M).log


Ansible fails in the task below:

TASK [openshift_logging : Gather OpenShift Logging Facts] ***********************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/install_logging.yaml:2

The full traceback is:
  File "/tmp/ansible_Alxr7T/ansible_module_openshift_logging_facts.py", line 336, in main
    ansible_facts={"openshift_logging_facts": cmd.build_facts()}
  File "/tmp/ansible_Alxr7T/ansible_module_openshift_logging_facts.py", line 314, in build_facts
    self.facts_for_clusterrolebindings(self.namespace)
  File "/tmp/ansible_Alxr7T/ansible_module_openshift_logging_facts.py", line 270, in facts_for_clusterrolebindings
    if comp is not None and namespace == item["namespace"]:
fatal: [itsrv1554.esrv.local]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "invocation": {
        "module_args": {
            "admin_kubeconfig": "/tmp/openshift-logging-ansible-8HwJVw/admin.kubeconfig", 
            "oc_bin": "oc", 
            "openshift_logging_namespace": "logging"
        }
    }
}

MSG:

'namespace'

Ansible fails in the function below:

https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_logging/library/openshift_logging_facts.py#L278-L280
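
The bare MSG 'namespace' reads like a Python KeyError: str(KeyError) is just the quoted key. Assuming the loop at the lines linked above indexes item["namespace"] for every subject of the cluster-readers binding, any subject that matches a logging component by name but carries no "namespace" field (User and Group subjects have none) would fail in exactly this way. A minimal, self-contained sketch of that failure mode; the subject list and the matching rule are illustrative, not taken from the customer's cluster:

def facts_for_clusterrolebindings(subjects, namespace):
    """Simplified stand-in for the module's loop over binding subjects."""
    matches = []
    for item in subjects:
        # stand-in for the module's self.comp() name matching
        comp = item["name"].startswith("aggregated-logging")
        if comp and namespace == item["namespace"]:  # KeyError if key is absent
            matches.append(item["name"])
    return matches

subjects = [
    # hypothetical Group subject: matches by name but has no "namespace" key
    {"kind": "Group", "name": "aggregated-logging-admins"},
    {"kind": "ServiceAccount", "name": "aggregated-logging-fluentd",
     "namespace": "logging"},
]

try:
    facts_for_clusterrolebindings(subjects, "logging")
except KeyError as err:
    print(err)  # prints: 'namespace'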

These are the call parameters for the Ansible task, so the right parameters are being passed to the function:
{"oc_bin": "oc", "admin_kubeconfig": "/tmp/openshift-logging-ansible-8HwJVw/admin.kubeconfig", "openshift_logging_namespace": "logging"}

Running oc get clusterrolebindings/cluster-readers -o yaml returns the relevant line, so it seems like it should work.

oc get clusterrolebindings/cluster-readers -o yaml
apiVersion: v1
groupNames:
[.....]
subjects:
[....]
- kind: ServiceAccount
  name: aggregated-logging-fluentd
  namespace: logging
[....]

We also checked whether there are any known issues with this Ansible version, but could not find anything.


Some notes: 
The user can log in to the cluster with the kubeconfig without problems:
oc login -u system:admin --config=/etc/origin/master/admin.kubeconfig

oc get nodes --show-labels
NAME                   STATUS                     AGE       LABELS
itsrv1554.esrv.local   Ready,SchedulingDisabled   1y        kubernetes.io/hostname=itsrv1554.esrv.local,logging-infra-fluentd=true,region=infra,zone=default
itsrv1555.esrv.local   Ready,SchedulingDisabled   1y        kubernetes.io/hostname=itsrv1555.esrv.local,logging-infra-fluentd=true,region=infra,zone=default
itsrv1561.esrv.local   Ready                      1y        customer=shared,environment=shared,kubernetes.io/hostname=itsrv1561.esrv.local,logging-infra-fluentd=true,region=primary,zone=east
[....]

The inventory file and the full verbose log are attached. We're waiting for the customer to provide the following information:

$ oc version
$ rpm -qa | grep openshift-ansible
$ rpm -qa | grep atomic-openshift-utils

Version-Release number of selected component (if applicable):
3.5

How reproducible:

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml -vvv | tee -a ansible-$(date +%Y%m%d-%H%M).log


Actual results:
Playbook fails on task: openshift_logging : Gather OpenShift Logging Facts

Expected results:
Playbook should run without problems


Additional info:

Comment 4 Kenjiro Nakayama 2018-01-13 05:07:32 UTC
Fixed in https://github.com/openshift/openshift-ansible/pull/6638
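
For the exact change, the PR above is authoritative. As a sketch of the general pattern such a fix would likely use (not necessarily the exact diff in the PR): replacing direct indexing with dict.get() makes the comparison simply False for subjects without the key, instead of raising KeyError('namespace'):

def subject_in_namespace(item, namespace):
    """Sketch only, not necessarily the PR's exact diff: tolerate
    subjects (Users, Groups) that carry no "namespace" key."""
    return item.get("namespace") == namespace

print(subject_in_namespace({"kind": "Group", "name": "some-group"}, "logging"))
# -> False (no KeyError)
print(subject_in_namespace({"kind": "ServiceAccount",
                            "name": "aggregated-logging-fluentd",
                            "namespace": "logging"}, "logging"))
# -> True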

Comment 5 Jeff Cantrill 2018-03-22 02:29:33 UTC
Luke,

Reassigning to you as I don't know what the target release should be. 3.9? Does it need to be backported?

Comment 6 Luke Meyer 2018-03-23 01:13:18 UTC
I didn't notice there was a bug for it at the time. So, it's been fixed in 3.9. It ought to be trivial enough to backport. It probably makes the most sense to have this bug track it for 3.9 and clone bugs for the other versions. The problem was observed in 3.5 ... I'm guessing it's probably not relevant before that (but if it is, someone can clone the bug for 3.4).

Comment 7 Anping Li 2018-03-23 05:37:51 UTC
No such issue in v3.9, so moving the bug to VERIFIED.