Description of problem:
Deploying logging with mux fails with "maximum recursion depth exceeded in cmp".

Version-Release number of selected component (if applicable):
openshift-ansible-3.7.23
ansible 2.4.2.0

How reproducible:
always

Steps to Reproduce:
1. Deploy logging with mux, using these inventory variables:
   openshift_logging_mux_client_mode=maximal
   openshift_logging_use_mux=true
   openshift_logging_es_ops_pvc_storage_class_name=standard
   openshift_logging_es_ops_pvc_dynamic=true
   openshift_logging_use_ops=true
   openshift_logging_es_pvc_storage_class_name=standard
   openshift_logging_es_pvc_dynamic=true
   openshift_logging_es_memory_limit=1Gi
   openshift_logging_install_logging=true

TASK [openshift_logging_mux : Add mux namespaces] ****************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging_mux/tasks/main.yaml:216
ok: [host-8-242-21.host.centralci.eng.rdu2.redhat.com] => (item=mux-undefined) => {"changed": false, "item": "mux-undefined", "results": {"cmd": "/usr/local/bin/oc get namespace mux-undefined -o json", "results": {"apiVersion": "v1", "kind": "Namespace", "metadata": {"annotations": {"openshift.io/description": "", "openshift.io/display-name": "", "openshift.io/node-selector": "", "openshift.io/sa.scc.mcs": "s0:c16,c5", "openshift.io/sa.scc.supplemental-groups": "1000250000/10000", "openshift.io/sa.scc.uid-range": "1000250000/10000"}, "creationTimestamp": "2018-02-01T08:55:30Z", "name": "mux-undefined", "resourceVersion": "24064", "selfLink": "/api/v1/namespaces/mux-undefined", "uid": "a807d32a-072d-11e8-8f01-fa163eec0a3d"}, "spec": {"finalizers": ["openshift.io/origin", "kubernetes"]}, "status": {"phase": "Active"}}, "returncode": 0}, "state": "present"}

TASK [openshift_logging_mux : Delete temp directory] ****************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging_mux/tasks/main.yaml:223
ok: [host-8-242-21.host.centralci.eng.rdu2.redhat.com] => {"changed": false, "path": "/tmp/openshift-logging-ansible-BrTeid", "state": "absent"}

TASK [openshift_logging : include_role] ****************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/install_logging.yaml:297
ERROR! Unexpected Exception, this is probably a bug: maximum recursion depth exceeded in cmp
to see the full traceback, use -vvv

Expected Result:
Mux can be deployed with logging

Additional info:
Hi @Anping,

Interesting finding. I tried this myself with the upstream 3.7 branches (O-A-L and O-A) and could reproduce the error you reported, although for me it occurred in openshift_logging_curator. I'd imagine it is not a mux issue, but a problem with the way "include_role" is executed?

TASK [openshift_logging_curator : Delete temp directory] ***********************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging_curator/tasks/main.yaml:123
Using module file /usr/lib/python2.7/site-packages/ansible/modules/files/file.py
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: origin
<localhost> EXEC /bin/sh -c 'echo ~ && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654 `" && echo ansible-tmp-1517515481.35-137831890118654="` echo /home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654 `" ) && sleep 0'
<localhost> PUT /tmp/tmpzVoKkD TO /home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654/file.py
<localhost> EXEC /bin/sh -c 'chmod u+x /home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654/ /home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654/file.py && sleep 0'
<localhost> EXEC /bin/sh -c 'sudo -H -S -n -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-tsuzboxpvwsnfjevtacblkiumkbehvcw; /usr/bin/python /home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654/file.py; rm -rf "/home/origin/.ansible/tmp/ansible-tmp-1517515481.35-137831890118654/" > /dev/null 2>&1'"'"' && sleep 0'
[...]
TASK [openshift_logging : include_role] ****************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/install_logging.yaml:293
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/queues.py", line 266, in _feed
    send(obj)
RuntimeError: maximum recursion depth exceeded while calling a Python object
Well, it was too early to say "it may not be mux", since install_logging.yaml:293 points to this:

roles/openshift_logging/tasks/install_logging.yaml
292 ## Mux
293 - include_role:
294     name: openshift_logging_mux
295   vars:
296     generated_certs_dir: "{{openshift.common.config_base}}/logging"
297     openshift_logging_mux_ops_host: "{{ ( openshift_logging_use_ops | bool ) | ternary('logging-es-ops', 'logging-es') }}"
298     openshift_logging_mux_namespace: "{{ openshift_logging_namespace }}"
299     openshift_logging_mux_master_url: "{{ openshift_logging_master_url }}"
300     openshift_logging_mux_image_pull_secret: "{{ openshift_logging_image_pull_secret }}"
301   when:
302   - openshift_logging_use_mux | bool
There was a bug in the openshift-ansible 3.7 branch around the include_role directive - let me see if I can find it
I think it is related to https://github.com/openshift/openshift-ansible/pull/6724
Thank you, Rich!

I see this commit introduced "static: true":

commit 0ee90f016f33fa18df7df4d73a251c7e8618e7de (origin/pr/6613)
    [release-3.7] Migrate to static: true for include_role

PR 6724 is going to revert it. The Ansible doc [1] says:

    Note
    Handlers are made available to the whole play.
    Before 2.4, as with include, this task could be static or dynamic. If static, it implied that it won’t need templating nor loops nor conditionals and will show included tasks in the --list options. Ansible would try to autodetect what is needed, but you can set static to yes or no at task level to control this.
    After 2.4, you can use import_role for ‘static’ behaviour and this action for ‘dynamic’ one.

I also noticed that "include_role" was replaced with "import_role" in the master branch, yet there is no problem installing mux with ops using the master branches. Does this imply that import_role is not necessarily equivalent to a static include_role? (Sorry, I'm a bit confused... :)

[1] http://docs.ansible.com/ansible/latest/include_role_module.html
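To make the static/dynamic distinction in the quoted documentation concrete, here is a minimal sketch of the three forms as I understand them. The role name and the `when` condition are taken from the install_logging.yaml excerpt earlier in this thread; the layout is illustrative, not copied from either branch, and a real tasks file would use only one of the three forms for a given role.

```yaml
# Illustrative tasks file; not taken from openshift-ansible.

# 1. Dynamic include: the role is loaded at run time, and `when`
#    decides once whether the role is included at all.
- include_role:
    name: openshift_logging_mux
  when:
  - openshift_logging_use_mux | bool

# 2. Static include, the pre-2.4 spelling that the release-3.7
#    commit quoted above migrated to (and that PR 6724 reverts).
- include_role:
    name: openshift_logging_mux
    static: true
  when:
  - openshift_logging_use_mux | bool

# 3. import_role (ansible 2.4 and later): the role is expanded when
#    the playbook is parsed, and `when` is applied to each task
#    inside the role rather than to the import itself.
- import_role:
    name: openshift_logging_mux
  when:
  - openshift_logging_use_mux | bool
```

Whether forms 2 and 3 behave identically under ansible 2.4 is exactly the open question here, so this should be read as a sketch of the syntax rather than an answer.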
I'm not sure. With openshift-ansible 3.8 and later they moved to support ansible 2.4, but openshift-ansible 3.7 still had to support ansible 2.3. I don't know how they resolved this for openshift-ansible 3.7, but at least our logging CI is now using a version of openshift-ansible that does not have this problem: our builds/installs/tests pass on the release-3.7 branch. So it may be that the problem was fixed at some point in openshift-ansible 3.7, but the appropriate upstream/downstream openshift-ansible 3.7 packages have not yet been built and released.
Good news!

> our builds/installs/tests pass on release-3.7 branch.

Could you tell us which repo/release-3.7 branch includes the fix? Or maybe what we should keep an eye on?
Note: I've verified that O-A release-3.7 with pr/6724 resolves the error:
RuntimeError: maximum recursion depth exceeded while calling a Python object
@Noriko The PR is in v3.7.25 and later. Tests pass when using openshift-ansible:v3.7.26. Could you move the bug to ON_QA?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0636