Bug 1845331

Summary: Message collection size is too large for Zaqar
Product: Red Hat OpenStack Reporter: Brendan Shephard <bshephar>
Component: instack-undercloud Assignee: Adriano Petrich <apetrich>
Status: CLOSED ERRATA QA Contact: David Rosenfeld <drosenfe>
Severity: medium Docs Contact:
Priority: medium    
Version: 13.0 (Queens) CC: apetrich, jelle.hoylaerts.ext, jhoylaer, jschluet, mburns, ramishra, shtiwari, slinaber
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: instack-undercloud-8.4.9-10.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-28 18:23:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Brendan Shephard 2020-06-08 22:05:49 UTC
Description of problem:
After applying the patch from this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1712278
https://review.opendev.org/#/c/680688/
https://review.opendev.org/#/c/663688/
https://access.redhat.com/errata/RHBA-2019:3794

We are still hitting this issue, and working around it required raising the following settings:
sudo crudini --set /etc/zaqar/zaqar.conf transport max_messages_post_size 2097152
sudo crudini --set /etc/zaqar/zaqar.conf oslo_messaging_kafka producer_batch_size 32768
sudo crudini --set /etc/mistral/mistral.conf engine execution_field_size_limit_kb 32768
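
To confirm that the raised values are actually in place on an undercloud, something like the following could be used. This is only a rough sketch: the helper name settings_below and the ZAQAR_WORKAROUND table are made up for illustration, the numbers are simply the ones from the crudini commands above, and it assumes the file parses as a plain INI file with integer values.

```python
import configparser

# Zaqar values from the workaround above (hypothetical table name).
# mistral.conf could be checked the same way with its own table.
ZAQAR_WORKAROUND = {
    ("transport", "max_messages_post_size"): 2097152,
    ("oslo_messaging_kafka", "producer_batch_size"): 32768,
}


def settings_below(conf_text, wanted):
    """Return (section, option, current) for each option unset or below the wanted value."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    short = []
    for (section, option), minimum in wanted.items():
        # fallback=None covers both a missing section and a missing option
        raw = parser.get(section, option, fallback=None)
        current = int(raw) if raw is not None else None
        if current is None or current < minimum:
            short.append((section, option, current))
    return short
```

Passing the contents of /etc/zaqar/zaqar.conf as conf_text should return an empty list once both Zaqar settings have been raised.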

We hit this during the update converge step for a minor update:
https://bugzilla.redhat.com/show_bug.cgi?id=1712278#c11


Version-Release number of selected component (if applicable):
openstack-tripleo-common-8.7.1-15.el7ost.noarch.rpm 

How reproducible:
Difficult to reproduce. 

Possibly Mistral is reporting messages from the ceph-ansible deployment that exceed the message size limit?

Actual results:
The overcloud converge fails without much information about the cause. Only the Mistral logs show the error message:
ActionException: ZaqarAction.queue_post failed: Error response from Zaqar. Code: 400. Title: Invalid API request. Description: Message collection size is too large. Max size 1048576.
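
For anyone trying to work out whether a given payload would trip this limit, here is a rough sketch. It assumes that the limit is compared against the size in bytes of the JSON request body and that the body has the shape {"messages": [...]}; neither assumption is confirmed in this report. The function names are hypothetical, and 1048576 is the maximum quoted in the error above.

```python
import json

# Maximum size quoted in the error message above.
MAX_MESSAGES_POST_SIZE = 1048576


def post_body_size(messages):
    """Size in bytes of the JSON body for a list of messages (assumed body shape)."""
    body = json.dumps({"messages": messages})
    return len(body.encode("utf-8"))


def exceeds_limit(messages, limit=MAX_MESSAGES_POST_SIZE):
    """True if the serialized message collection is larger than the limit."""
    return post_body_size(messages) > limit
```

Calling exceeds_limit with limit=2097152 shows whether the same payload would fit under the raised max_messages_post_size value from the workaround.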


Expected results:
Either avoid posting large messages to Zaqar, or increase these size limits by default to cover scenarios where the incoming messages may be quite large.

Additional info:

Comment 18 errata-xmlrpc 2020-10-28 18:23:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13.0 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4388