Bug 1498916
Summary: | [UPDATES] update on all nodes finishes but mistral fails to receive notification | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Lukas Bezdicka <lbezdick> |
Component: | openstack-tripleo-common | Assignee: | Marios Andreou <mandreou> |
Status: | CLOSED ERRATA | QA Contact: | Yurii Prokulevych <yprokule> |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 12.0 (Pike) | CC: | augol, ccamacho, dmatthew, emacchi, jpichon, jschluet, lbezdick, mandreou, mbultel, mburns, sclewis, slinaber, therve, yprokule |
Target Milestone: | ga | Keywords: | Triaged |
Target Release: | 12.0 (Pike) | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | openstack-tripleo-common-7.6.3-4.el7ost.noarch | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2017-12-13 22:13:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Lukas Bezdicka
2017-10-05 14:34:44 UTC
Can you provide the Mistral logs for this? I'm having trouble tracking down the issue. It looks like the workflow is attempting to send a message to Zaqar that is larger than the allowed limit. From reading the tripleo.package_update.v1 workflow and the custom action, I can't figure out where that would come from. I'm hoping that a traceback in the logs will provide more details.

The message size is already set by instack. The message posted here is the result of the ansible/puppet upgrade run; it's about 1.2M, more than the 1M allowed. I suggest limiting the message, something like this: http://paste.openstack.org/show/625389/ in tripleo-common. That said, it's bad to have that much data transit through ansible/mistral. Long term, it'd be nice to either produce fewer logs or push them to swift directly. There is also an unhealthy amount of warnings produced by the puppet run.

o/ thanks Thomas - yeah, agreed that the truncate is not ideal, and I have been holding off on posting the review to tripleo-common this morning hoping someone would come up with a better way. I haven't heard one, so I'll post it in a moment anyway and we can take it from there.

This is merged to pike, so moving to POST. Note that thankfully there is a better fix being tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1505926 which will prevent these huge messages in the first place.
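For readers who want a concrete picture of the truncation idea discussed above, here is a minimal sketch in Python. It is not the patch that was merged into tripleo-common (the linked paste is not reproduced in this report). The helper name `truncate_log_lines`, the `[log truncated]` marker, and the 1 MiB default (taken from the "1M allowed" figure in the thread) are all assumptions for illustration. It keeps the tail of the log, on the assumption that the end of an ansible run (the PLAY RECAP) is the most useful part.

```python
import json

# Assumed cap, based on the "1M allowed" figure quoted in the thread.
# The real limit is whatever the Zaqar deployment is configured with.
MAX_MESSAGE_BYTES = 1024 * 1024


def truncate_log_lines(lines, max_bytes=MAX_MESSAGE_BYTES):
    """Return the longest tail of `lines` whose JSON encoding fits max_bytes.

    Hypothetical helper for illustration only. If the full list already
    fits, it is returned unchanged; otherwise the oldest lines are dropped
    and a marker element is placed at the front.
    """
    marker = "[log truncated]"
    if len(json.dumps(lines).encode("utf-8")) <= max_bytes:
        return list(lines)

    kept = []
    # Start with the cost of the brackets plus the marker element.
    size = len(json.dumps([marker]).encode("utf-8"))
    for line in reversed(lines):
        # Each extra element costs its encoded length plus the ", " separator.
        cost = len(json.dumps(line).encode("utf-8")) + 2
        if size + cost > max_bytes:
            break
        kept.append(line)
        size += cost
    kept.reverse()
    return [marker] + kept
```

In practice the log lines would be only one field of a larger message body, so a caller would presumably pass a `max_bytes` value somewhat below the real limit to leave headroom for the rest of the payload.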
openstack-tripleo-common-7.6.3-4.el7ost

Verified with openstack-tripleo-common-7.6.3-8.el7ost.noarch

tail oc-update-*log
==> oc-update-00-Controller.log <==
u'TASK [debug] *******************************************************************',
u'skipping: [192.168.24.20]',
u'',
u'PLAY RECAP *********************************************************************',
u'192.168.24.15 : ok=112 changed=56 unreachable=0 failed=0 ',
u'192.168.24.17 : ok=114 changed=56 unreachable=0 failed=0 ',
u'192.168.24.20 : ok=112 changed=56 unreachable=0 failed=0 ',
u'']
('Response is not a JSON object.', ValueError('No JSON object could be decoded',))
Success

==> oc-update-CephStorage.log <==
u'TASK [debug] *******************************************************************',
u'skipping: [192.168.24.18]',
u'',
u'PLAY RECAP *********************************************************************',
u'192.168.24.14 : ok=56 changed=13 unreachable=0 failed=0 ',
u'192.168.24.18 : ok=56 changed=13 unreachable=0 failed=0 ',
u'192.168.24.9 : ok=56 changed=13 unreachable=0 failed=0 ',
u'']
('Response is not a JSON object.', ValueError('No JSON object could be decoded',))
Success

==> oc-update-Compute.log <==
u'',
u'TASK [debug] *******************************************************************',
u'skipping: [192.168.24.10]',
u'',
u'PLAY RECAP *********************************************************************',
u'192.168.24.10 : ok=58 changed=13 unreachable=0 failed=0 ',
u'192.168.24.12 : ok=58 changed=13 unreachable=0 failed=0 ',
u'']
('Response is not a JSON object.', ValueError('No JSON object could be decoded',))
Success

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462