Bug 1394544

Summary: Collector log inflated to 21 GB on controller; services are down
Product: Red Hat OpenStack
Reporter: Ronnie Rasouli <rrasouli>
Component: openstack-ceilometer
Assignee: Pradeep Kilambi <pkilambi>
Status: CLOSED ERRATA
QA Contact: Yurii Prokulevych <yprokule>
Severity: unspecified
Priority: unspecified
Version: 10.0 (Newton)
CC: jruzicka, jschluet, pkilambi, rrasouli, srevivo
Target Milestone: rc
Keywords: Triaged
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-ceilometer-7.0.0-3.el7ost
Last Closed: 2016-12-14 16:32:06 UTC
Type: Bug

Description Ronnie Rasouli 2016-11-13 10:21:06 UTC
Description of problem:
RHOS 10 had been running for a week with ceilometer on the overcloud. Telemetry was enabled on the undercloud; on the overcloud it was not enabled.

The collector log grew to 21 GB:
ls -lh /var/log/ceilometer/collector.log
-rw-r--r--. 1 ceilometer ceilometer 21G Nov 13 10:14 /var/log/ceilometer/collector.log


Many of the messages look like this:
2016-11-09 03:10:12.910 27254 DEBUG oslo_messaging._drivers.amqpdriver [-] received message with unique_id: e40715f95f2d4f37b5af5ab49261b8db __call__ /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:196
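
To quantify how much of the log is DEBUG output, the lines can be tallied by level. This is only a minimal sketch and was not run as part of this report: the helper name count_log_levels is hypothetical, and it assumes the line layout shown above, where the fourth whitespace-separated field is the log level.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): count log lines per level.
# Assumes the layout "DATE TIME PID LEVEL module ...", so the level is field 4.
count_log_levels() {
    # Skip lines with fewer than 4 fields (for example, traceback continuations).
    awk 'NF >= 4 { n[$4]++ } END { for (l in n) print l, n[l] }' "$1" | sort
}

# Example (path taken from the report):
# count_log_levels /var/log/ceilometer/collector.log
```

On a file of this size a full pass takes a while, so sampling with head or tail first may be more practical.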

Services 

Version-Release number of selected component (if applicable):

RHOS

How reproducible:


Steps to Reproduce:
1. Deploy a RHOS 10 overcloud.
2. Let it run with debug logging enabled for 5 days.
3. Track the size of the collector log.
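
Tracking the log in step 3 can be scripted. The following is a rough sketch under stated assumptions, not something used in this report: the function name oversized_logs is hypothetical, the threshold is arbitrary, and it relies on find supporting -maxdepth (GNU and BSD find do).

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): list *.log files
# directly under a directory whose size exceeds a threshold in bytes.
oversized_logs() {
    local dir="$1" limit_bytes="$2"
    # "-size +Nc" matches files strictly larger than N bytes.
    find "$dir" -maxdepth 1 -type f -name '*.log' -size +"${limit_bytes}"c | sort
}

# Example: report ceilometer logs larger than 1 GiB.
# oversized_logs /var/log/ceilometer $((1024 * 1024 * 1024))
```

Run periodically (for example from cron), this would flag the collector log well before the root filesystem fills up.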

Actual results:

Services stopped (keystone, nova, swift).

Expected results:

Logs are rotated by logrotate and old logs are purged.

Additional info:

# cat /etc/logrotate.d/openstack-aodh 
compress

/var/log/aodh/*.log {
    weekly
    rotate 4
    size 10M
    missingok
    compress
    minsize 100k
}

# cat /etc/logrotate.d/openstack-ceilometer
compress

/var/log/ceilometer/*.log {
    rotate 14
    size 10M
    missingok
    compress
    copytruncate
}

Comment 1 Ronnie Rasouli 2016-11-13 11:16:48 UTC
Deleting 
df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda2        30G   30G   20K 100% /
devtmpfs        5.8G     0  5.8G   0% /dev
tmpfs           5.8G   54M  5.8G   1% /dev/shm
tmpfs           5.8G  688K  5.8G   1% /run
tmpfs           5.8G     0  5.8G   0% /sys/fs/cgroup
tmpfs           1.2G     0  1.2G   0% /run/user/981
tmpfs           1.2G     0  1.2G   0% /run/user/983
tmpfs           1.2G     0  1.2G   0% /run/user/166
tmpfs           1.2G     0  1.2G   0% /run/user/1000
tmpfs           1.2G     0  1.2G   0% /run/user/0

Comment 2 Ronnie Rasouli 2016-11-13 11:21:05 UTC
rpm -qa | grep ceilo
openstack-ceilometer-central-7.0.0-2.2.el7ost.noarch
python-ceilometermiddleware-0.5.0-1.el7ost.noarch
python-ceilometerclient-2.6.1-2.el7ost.noarch
openstack-ceilometer-polling-7.0.0-2.2.el7ost.noarch
openstack-ceilometer-api-7.0.0-2.2.el7ost.noarch
openstack-ceilometer-common-7.0.0-2.2.el7ost.noarch
openstack-ceilometer-compute-7.0.0-2.2.el7ost.noarch
puppet-ceilometer-9.4.0-2.el7ost.noarch
python-ceilometer-7.0.0-2.2.el7ost.noarch
openstack-ceilometer-notification-7.0.0-2.2.el7ost.noarch
openstack-ceilometer-collector-7.0.0-2.2.el7ost.noarch
[root@compute-0 ~]# rpm -qa | grep aodh
python-aodh-3.0.0-1.1.el7ost.noarch
python-aodhclient-0.7.0-1.el7ost.noarch
openstack-aodh-notifier-3.0.0-1.1.el7ost.noarch
openstack-aodh-evaluator-3.0.0-1.1.el7ost.noarch
openstack-aodh-common-3.0.0-1.1.el7ost.noarch
openstack-aodh-api-3.0.0-1.1.el7ost.noarch
openstack-aodh-listener-3.0.0-1.1.el7ost.noarch
puppet-aodh-9.4.0-2.el7ost.noarch

Comment 9 Ronnie Rasouli 2016-11-28 13:09:59 UTC
Issue seems to be resolved.

Running logrotate -v /etc/logrotate.d/openstack-ceilometer rotates and compresses the collector log.
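
For anyone repeating this verification, a slightly more cautious variant is to do a dry run first and only then force a rotation. This is a sketch under assumptions rather than the procedure used above: the wrapper name rotate_now is hypothetical, while -d (debug mode, no changes made), -f (force) and -v (verbose) are standard logrotate options.

```shell
#!/bin/bash
# Hypothetical wrapper (not from the original report).
rotate_now() {
    local conf="$1"
    # -d: debug mode; prints what would be done without touching the logs
    # or the logrotate state file. Stop here if the config does not parse.
    logrotate -d "$conf" || return 1
    # -f: force rotation even if logrotate considers it unnecessary.
    logrotate -f -v "$conf"
}

# Example (config path taken from the report):
# rotate_now /etc/logrotate.d/openstack-ceilometer
```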

Comment 11 errata-xmlrpc 2016-12-14 16:32:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html