Bug 962557 - Issue with 65k message limit in qpid
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 3.0
Hardware: Unspecified  OS: Unspecified
Priority: urgent  Severity: urgent
Target Milestone: async
Target Release: 4.0
Assigned To: Russell Bryant
QA Contact: Attila Fazekas
Keywords: OtherQA
Reported: 2013-05-13 16:59 EDT by Mark McLoughlin
Modified: 2016-04-27 00:48 EDT

Fixed In Version: openstack-nova-2013.2-0.21.b3.el6ost
Doc Type: Bug Fix
Doc Text:
Under certain conditions, Compute could send a Qpid message containing a string longer than 65535 characters, which the Qpid message encoding could not serialize. The message, and the operation that depended on it, then failed. This update removes the size limit in the Qpid message encoding, so operations that previously failed because of that limit now succeed.
Last Closed: 2013-12-19 19:02:40 EST
Type: Bug

External Trackers
Tracker    ID       Priority  Status  Summary  Last Updated
Launchpad  1175808  None      None    None     Never

Description Mark McLoughlin 2013-05-13 16:59:30 EDT
See:

  https://bugs.launchpad.net/nova/+bug/1175808

  Qpid has a limitation where it cannot serialize a Python dict
  containing a string longer than 65535 characters. This can result
  in problems when making a conductor call that returns a large
  structure - for example, instance_get_all_by_host on one of my
  systems returns 38 instances, which when serialized as JSON is
  too long for Qpid to handle.

This sounds like an issue that is (a) only seen at scale and (b) specific to nova-conductor and, therefore, to Grizzly.
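
To make the limit concrete, here is a minimal sketch that walks a message dict and reports any string longer than the 65535 characters quoted above. The helper name find_oversized_strings and the sample message are hypothetical; nothing here comes from nova, oslo, or the qpid client. It counts characters because that is how the report states the limit; whether the client actually counts encoded bytes is not stated here.

```python
# Hypothetical diagnostic helper; not part of nova, oslo, or the qpid client.
QPID_MAX_STRING_LENGTH = 65535  # limit quoted in the bug report


def find_oversized_strings(value, path="msg"):
    """Return (path, length) pairs for every string over the limit."""
    found = []
    if isinstance(value, str):
        if len(value) > QPID_MAX_STRING_LENGTH:
            found.append((path, len(value)))
    elif isinstance(value, dict):
        for key, item in value.items():
            found.extend(find_oversized_strings(item, "%s[%r]" % (path, key)))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            found.extend(find_oversized_strings(item, "%s[%d]" % (path, index)))
    return found


# Illustrative message: the large string stands in for a big serialized result.
message = {
    "method": "instance_get_all_by_host",
    "args": {"host": "compute-1"},
    "result": "x" * 70000,
}
oversized = find_oversized_strings(message)
print(oversized)
```

A check like this could be used when debugging to see which field of a failing message crosses the limit.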
Comment 3 Russell Bryant 2013-05-30 15:30:10 EDT
This isn't as bad as it looks at first. It doesn't really affect Grizzly, unless it's receiving a message sent by Havana. The fix for Havana will likely require a Grizzly change to make sure Grizzly can still understand Havana messages, though.

https://review.openstack.org/#/c/28711/
https://code.launchpad.net/bugs/1175808
Comment 6 Russell Bryant 2013-07-23 16:40:00 EDT
Fixed upstream in havana:

commit 781a8f908cd3e5e69ff8b88d998fa93c48532e15
Author: Andrew Laski <andrew.laski@rackspace.com>
Date:   Wed Jun 5 10:02:07 2013 -0400

    Update rpc/impl_qpid.py from oslo
    
    The current qpid driver cannot serialize objects containing strings
    longer than 65535 characters.  This just became a breaking issue when
    the message to scheduler_run_instance went over that limit.  The fix has
    been committed to oslo, so this just syncs it over to Nova.
    
    Bug 1175808
    Bug 1187595
    
    Change-Id: If95c11a7e03c81d89133f6cad0dcbb6d8acb8148
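
For readers who want the general shape of such a fix, the sketch below shows one way to avoid the per-string limit: serialize the whole payload to a single JSON string, send that as the message body, and mark it with a content type so the receiver knows to decode it. This is an illustration under assumptions, not the oslo code. FakeMessage is a stand-in for a Qpid message object, and the names pack_json_msg, unpack_msg, and the JSON_CONTENT_TYPE value are hypothetical. The receive side also accepts unmarked messages, which is the kind of tolerance an older receiver would need in order to interoperate with a newer sender.

```python
import json

# Assumed marker value; the real driver may use a different string.
JSON_CONTENT_TYPE = "application/json; charset=utf8"


class FakeMessage:
    """Stand-in for a Qpid message: a payload plus an optional content type."""

    def __init__(self, content, content_type=None):
        self.content = content
        self.content_type = content_type


def pack_json_msg(payload):
    """Serialize the payload to one JSON string and tag the message."""
    return FakeMessage(json.dumps(payload), JSON_CONTENT_TYPE)


def unpack_msg(message):
    """Decode JSON-tagged messages; pass untagged (legacy) content through."""
    if message.content_type == JSON_CONTENT_TYPE:
        return json.loads(message.content)
    return message.content


# A payload with a string over 65535 characters round-trips as one JSON body.
big = {"method": "run_instance", "args": {"user_data": "x" * 70000}}
packed = pack_json_msg(big)
restored = unpack_msg(packed)

# A legacy message that carries a plain dict is returned unchanged.
legacy = FakeMessage({"method": "ping"})
legacy_restored = unpack_msg(legacy)
```

Tagging the message rather than guessing from its content keeps the receive path unambiguous when old and new senders share a broker.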
Comment 9 Attila Fazekas 2013-12-02 03:32:41 EST
I was able to boot 120+ instances on the same hypervisor, one by one, without an ERROR and without any suspicious log messages.

Related commands:
nova quota-update --cores -1 $TENANT
nova quota-update --ram -1 $TENANT
nova quota-update --instances -1 $TENANT
a=1; while nova boot server-$a --image cirros-0.3.1-x86_64-uec --flavor 42  --poll ;do  a=$((a+1)); done 

packages:
openstack-nova-conductor-2013.2-5.el6ost.noarch
openstack-nova-scheduler-2013.2-5.el6ost.noarch

The new oslo rpc code is in the python-nova package.
Comment 12 errata-xmlrpc 2013-12-19 19:02:40 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1859.html
