Bug 1451415

Summary: rhosp-director: Repeating errors: "ERROR oslo_messaging._drivers.amqpdriver" - fill up the file system.
Product: Red Hat OpenStack
Reporter: Alexander Chuzhoy <sasha>
Component: python-kombu
Assignee: Matthias Runge <mrunge>
Status: CLOSED ERRATA
QA Contact: Alexander Chuzhoy <sasha>
Severity: high
Priority: high
Version: 12.0 (Pike)
Target Release: 12.0 (Pike)
Target Milestone: Upstream M2
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
CC: ahrechan, apevec, dbecker, emacchi, jeckersb, lhh, mburns, mcornea, michele, morazi, oblaut, ohochman, rhel-osp-director-maint, scohen, srevivo, tvignaud
Fixed In Version: python-kombu-4.0.2-5.el7ost
Last Closed: 2017-12-13 21:27:05 UTC
Type: Bug

Description Alexander Chuzhoy 2017-05-16 15:16:35 UTC
rhosp-director: Repeating errors: "ERROR oslo_messaging._drivers.amqpdriver" - fill up the file system.

Environment:
instack-undercloud-7.0.0-0.20170503001109.el7ost.noarch


The file system filled up within a few hours:
[root@undercloud-0 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       100G  100G   20K 100% /
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G  4.0K  7.8G   1% /dev/shm
tmpfs           7.8G   33M  7.8G   1% /run
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           1.6G     0  1.6G   0% /run/user/0

The majority of space was used by neutron/nova logs:
59G     neutron
25G     nova

/var/log/neutron
26G     dhcp-agent.log
9.9G    dhcp-agent.log-20170516
25G     openvswitch-agent.log


Looking inside dhcp-agent.log shows repeating errors:

2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver Traceback (most recent call last):
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 332, in poll
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver     self.conn.consume(timeout=current_timeout)
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 1088, in consume
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver     error_callback=_error_callback)
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 810, in ensure
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver     except kombu.exceptions.OperationalError as exc:
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver AttributeError: 'module' object has no attribute 'OperationalError'
2017-05-15 19:20:08.336 2382 ERROR oslo_messaging._drivers.amqpdriver 
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver [-] Failed to process incoming message, retrying...
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver Traceback (most recent call last):
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 332, in poll
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver     self.conn.consume(timeout=current_timeout)
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 1088, in consume
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver     error_callback=_error_callback)
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 810, in ensure
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver     except kombu.exceptions.OperationalError as exc:
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver AttributeError: 'module' object has no attribute 'OperationalError'
2017-05-15 19:20:08.338 2382 ERROR oslo_messaging._drivers.amqpdriver 
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver [-] Failed to process incoming message, retrying...
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver Traceback (most recent call last):
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 332, in poll
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver     self.conn.consume(timeout=current_timeout)
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 1088, in consume
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver     error_callback=_error_callback)
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 810, in ensure
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver     except kombu.exceptions.OperationalError as exc:
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver AttributeError: 'module' object has no attribute 'OperationalError'
2017-05-15 19:20:08.339 2382 ERROR oslo_messaging._drivers.amqpdriver 
2017-05-15 19:20:08.341 2382 ERROR oslo_messaging._drivers.amqpdriver [-] Failed to process incoming message, retrying...
2017-05-15 19:20:08.341 2382 ERROR oslo_messaging._drivers.amqpdriver Traceback (most recent call last):


The same errors appear in openvswitch-agent.log.
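
The mechanism behind the flood (a hedged reading of the traceback, not quoted from the actual oslo.messaging code): Python evaluates the expression in an "except" clause only when an exception actually propagates to it, so on a kombu version that predates the OperationalError class, the attribute lookup itself raises AttributeError, which the poll loop then logs and immediately retries (note the millisecond-apart timestamps above). A minimal standalone sketch, simulating the old module instead of installing kombu 3.x:

# Minimal sketch: simulate a kombu.exceptions module that lacks
# OperationalError, as on the installed kombu 3.x (hypothetical
# stand-ins, not kombu itself).
import types

kombu = types.ModuleType("kombu")
kombu.exceptions = types.ModuleType("kombu.exceptions")

def consume():
    # Stand-in for any error raised while consuming from RabbitMQ.
    raise IOError("connection reset")

try:
    try:
        consume()
    except kombu.exceptions.OperationalError:
        # Never reached: the attribute lookup in this except clause
        # raises AttributeError, replacing the original error.
        pass
except AttributeError as exc:
    # On the Python 2.7 in the logs this prints:
    # 'module' object has no attribute 'OperationalError'
    print(exc)

Because every retry hits the same deterministic AttributeError, each loop iteration appends another full traceback, which is how the agents wrote tens of gigabytes of logs in a few hours.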

Comment 1 Artem Hrechanychenko 2017-05-16 15:29:52 UTC
I also got that issue.

Comment 2 John Eckersberg 2017-05-16 15:56:16 UTC
Looks like kombu is too old:

python-kombu-3.0.32-2.el7ost.noarch

yet the Pike requirements for oslo.messaging specify:

kombu!=4.0.2,>=4.0.0 # BSD

Fixing component.
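
For reference, a quick sketch of checking an installed kombu against that specifier with pkg_resources (from setuptools); this is an illustration, not a step anyone ran here:

import pkg_resources

# Verify the installed kombu against the Pike requirement quoted above.
try:
    pkg_resources.require("kombu!=4.0.2,>=4.0.0")
    print("kombu satisfies the requirement")
except pkg_resources.ResolutionError as exc:
    # VersionConflict for kombu 3.0.32, DistributionNotFound if missing.
    print("requirement not met: %s" % exc)

Note that the eventual Fixed In Version, python-kombu-4.0.2-5.el7ost, reports upstream version 4.0.2 (presumably with downstream patches), so it would technically fail this upstream specifier even though the rebased package resolves the bug.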

Comment 4 Alexander Chuzhoy 2017-06-16 14:46:20 UTC
Environment:
openstack-tripleo-heat-templates-7.0.0-0.20170512193554.el7ost.noarch
instack-undercloud-7.0.0-0.20170503001109.el7ost.noarch
openstack-puppet-modules-10.0.0-0.20170315222135.0333c73.el7.1.noarch

The issue doesn't reproduce.

Comment 6 Alexander Chuzhoy 2017-06-19 21:33:48 UTC
Verified:
Environment: python-kombu-4.0.2-5.el7ost.noarch

The reported issue doesn't reproduce. 


[stack@undercloud-0 ~]$ uptime
 17:33:27 up  4:20,  3 users,  load average: 0.11, 0.21, 0.23
[stack@undercloud-0 ~]$ sudo du -hs /var/log/neutron/
23M     /var/log/neutron/

Comment 10 errata-xmlrpc 2017-12-13 21:27:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462