Bug 1449986

Summary: Ceilometer event collection is consuming huge space.
Product: Red Hat OpenStack
Reporter: VIKRANT <vaggarwa>
Component: openstack-ceilometer
Assignee: Pradeep Kilambi <pkilambi>
Status: CLOSED DUPLICATE
QA Contact: Sasha Smolyak <ssmolyak>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: anande, jdanjou, jraju, jruzicka, lmiccini, mabaakou, mschuppe, pgrist, pkilambi, srevivo, vaggarwa, yprokule
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-11 13:25:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description VIKRANT 2017-05-11 10:01:31 UTC
Description of problem:

The following TTL values are set for Ceilometer:

# grep 'time_to_live' /etc/ceilometer/ceilometer.conf
# Deprecated group/name - [database]/time_to_live
#metering_time_to_live = -1
metering_time_to_live=604800
#event_time_to_live = -1
event_time_to_live=604800
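
Both TTL options are expressed in seconds. As a quick sanity check (a minimal sketch that uses only the values from the grep output above), 604800 seconds works out to exactly 7 days:

```python
from datetime import timedelta

# Values copied from ceilometer.conf above (seconds).
metering_time_to_live = 604800
event_time_to_live = 604800

ttl = timedelta(seconds=event_time_to_live)
print(ttl.days)  # 7
```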

# mongo 172.xx.xx.xx
tripleo:PRIMARY> use ceilometer
switched to db ceilometer
tripleo:PRIMARY> show dbs
admin         (empty)
ceilometer  135.888GB
local        10.073GB

Checking the size of the collections in the ceilometer DB:

~~~
tripleo:PRIMARY> db.meter.dataSize()
0
tripleo:PRIMARY> db.event.dataSize()
127720004816

tripleo:PRIMARY> db.resources.dataSize()

tripleo:PRIMARY> db.system.indexes.dataSize()
1984
tripleo:PRIMARY>
~~~

Checking the various size values:

~~~
tripleo:PRIMARY> db.event.totalSize()
141645319760
tripleo:PRIMARY> db.event.totalIndexSize()
12139479520
tripleo:PRIMARY> db.event.storageSize()
129505840240
~~~
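
The three figures above are related: totalSize is the sum of storageSize and totalIndexSize. The sketch below checks this with the reported values and converts them to GiB for readability. Only the numbers are taken from the output; the variable names are illustrative.

```python
# Values copied from the mongo shell output above (bytes).
total_size = 141645319760        # db.event.totalSize()
total_index_size = 12139479520   # db.event.totalIndexSize()
storage_size = 129505840240      # db.event.storageSize()
data_size = 127720004816         # db.event.dataSize()

GIB = 1024 ** 3

# totalSize is the sum of storageSize and totalIndexSize.
assert total_size == storage_size + total_index_size

for name, value in [("data", data_size), ("storage", storage_size),
                    ("indexes", total_index_size), ("total", total_size)]:
    print(f"{name:8s} {value / GIB:8.2f} GiB")
```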

The TTL index is set properly (shown here for the meter collection):

~~~
tripleo:PRIMARY> db.meter.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "ceilometer.meter"
        },
        {
                "v" : 1,
                "key" : {
                        "resource_id" : 1,
                        "user_id" : 1,
                        "counter_name" : 1,
                        "timestamp" : 1
                },
                "name" : "meter_idx",
                "ns" : "ceilometer.meter",
                "background" : false
        },
        {
                "v" : 1,
                "key" : {
                        "resource_id" : 1,
                        "project_id" : 1,
                        "counter_name" : 1,
                        "timestamp" : 1
                },
                "name" : "meter_project_idx",
                "ns" : "ceilometer.meter",
                "background" : true
        },
        {
                "v" : 1,
                "key" : {
                        "timestamp" : -1
                },
                "name" : "timestamp_idx",
                "ns" : "ceilometer.meter"
        },
        {
                "v" : 1,
                "key" : {
                        "timestamp" : 1
                },
                "name" : "meter_ttl",
                "ns" : "ceilometer.meter",
                "expireAfterSeconds" : 604800
        }
]
~~~
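
A TTL index is recognizable by its expireAfterSeconds field. As a rough illustration, the helper below (ttl_indexes is a hypothetical name, not part of any MongoDB tooling) picks TTL indexes out of a getIndexes()-style list. Two of the index definitions above are reused as input.

```python
import json

# Two of the index definitions from the output above, as JSON.
indexes = json.loads("""
[
  {"v": 1, "key": {"timestamp": -1}, "name": "timestamp_idx",
   "ns": "ceilometer.meter"},
  {"v": 1, "key": {"timestamp": 1}, "name": "meter_ttl",
   "ns": "ceilometer.meter", "expireAfterSeconds": 604800}
]
""")

def ttl_indexes(index_list):
    """Map index name -> expireAfterSeconds for TTL indexes only."""
    return {
        idx["name"]: idx["expireAfterSeconds"]
        for idx in index_list
        if "expireAfterSeconds" in idx
    }

print(ttl_indexes(indexes))  # {'meter_ttl': 604800}
```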

Stats for the database and its collections:

~~~
tripleo:PRIMARY> db.stats()
{
        "db" : "ceilometer",
        "collections" : 5,
        "objects" : 62871062,
        "avgObjSize" : 2031.5606271133133,
        "dataSize" : 127726374144,
        "storageSize" : 129505873008,
        "numExtents" : 85,
        "indexes" : 12,
        "indexSize" : 12140869440,
        "fileSize" : 145891524608,
        "nsSizeMB" : 16,
        "dataFileVersion" : {
                "major" : 4,
                "minor" : 5
        },
        "extentFreeList" : {
                "num" : 0,
                "totalSize" : 0
        },
        "ok" : 1
}


tripleo:PRIMARY> db.meter.stats()
{
        "ns" : "ceilometer.meter",
        "count" : 0,
        "size" : 0,
        "storageSize" : 8192,
        "numExtents" : 1,
        "nindexes" : 5,
        "lastExtentSize" : 8192,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 1,
        "totalIndexSize" : 40880,
        "indexSizes" : {
                "_id_" : 8176,
                "meter_idx" : 8176,
                "meter_project_idx" : 8176,
                "timestamp_idx" : 8176,
                "meter_ttl" : 8176
        },
        "ok" : 1
}

tripleo:PRIMARY> db.event.stats()
{
        "ns" : "ceilometer.event",
        "count" : 62864040,
        "size" : 127712159264,
        "avgObjSize" : 2031,
        "storageSize" : 129505840240,
        "numExtents" : 81,
        "nindexes" : 3,
        "lastExtentSize" : 2146426864,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 1,
        "totalIndexSize" : 12139798384,
        "indexSizes" : {
                "_id_" : 4929416688,
                "event_type_idx" : 4898217072,
                "event_ttl" : 2312164624
        },
        "ok" : 1
}
~~~
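
To put the event collection figures in perspective, the sketch below derives an average document size and a rough arrival rate. The rate is only an estimate: it assumes the collection currently holds about one TTL window (7 days) of events, which the report does not state explicitly.

```python
# Values copied from db.event.stats() and ceilometer.conf above.
event_count = 62864040          # "count"
event_size = 127712159264       # "size", in bytes
event_ttl_seconds = 604800      # event_time_to_live

# Average document size; the shell reports the truncated value 2031.
avg_obj_size = event_size / event_count

# ASSUMPTION: the collection holds about one TTL window (7 days) of
# events, so count / TTL approximates the arrival rate.
events_per_second = event_count / event_ttl_seconds
bytes_per_day = event_size / (event_ttl_seconds / 86400)

print(f"avg document size: {avg_obj_size:.1f} bytes")
print(f"approx. arrival rate: {events_per_second:.0f} events/s")
print(f"approx. growth: {bytes_per_day / 1024**3:.1f} GiB/day")
```

Under that assumption, the collection receives on the order of a hundred events per second.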

Overall server status:

~~~
tripleo:PRIMARY> db.serverStatus()
{
        "host" : "chn2-ctrl-0.localdomain",
        "version" : "2.6.11",
        "process" : "mongod",
        "pid" : NumberLong(75752),
        "uptime" : 660779,
        "uptimeMillis" : NumberLong(660779186),
        "uptimeEstimate" : 656393,
        "localTime" : ISODate("2017-05-11T09:09:09.860Z"),
        "asserts" : {
                "regular" : 0,
                "warning" : 0,
                "msg" : 0,
                "user" : 1399,
                "rollovers" : 0
        },
        "backgroundFlushing" : {
                "flushes" : 11012,
                "total_ms" : 483394,
                "average_ms" : 43.897021431166,
                "last_ms" : 0,
                "last_finished" : ISODate("2017-05-11T09:08:11.491Z")
        },
        "connections" : {
                "current" : 199,
                "available" : 51001,
                "totalCreated" : NumberLong(404153)
        },
        "cursors" : {
                "note" : "deprecated, use server status metrics",
                "clientCursors_size" : 1,
                "totalOpen" : 1,
                "pinned" : 0,
                "totalNoTimeout" : 0,
                "timedOut" : 3
        },
        "dur" : {
                "commits" : 30,
                "journaledMB" : 0,
                "writeToDataFilesMB" : 0,
                "compression" : 0,
                "commitsInWriteLock" : 0,
                "earlyCommits" : 0,
                "timeMs" : {
                        "dt" : 3067,
                        "prepLogBuffer" : 0,
                        "writeToJournal" : 0,
                        "writeToDataFiles" : 0,
                        "remapPrivateView" : 0
                }
        },
        "extra_info" : {
                "note" : "fields vary by platform",
                "heap_usage_bytes" : 96312136,
                "page_faults" : 21566
        },
        "globalLock" : {
                "totalTime" : NumberLong("660779187000"),
                "lockTime" : NumberLong(504684271),
                "currentQueue" : {
                        "total" : 0,
                        "readers" : 0,
                        "writers" : 0
                },
                "activeClients" : {
                        "total" : 0,
                        "readers" : 0,
                        "writers" : 0
                }
        },
        "indexCounters" : {
                "accesses" : 1477259438,
                "hits" : 1477259438,
                "misses" : 0,
                "resets" : 0,
                "missRatio" : 0
        },
        "locks" : {
                "." : {
                        "timeLockedMicros" : {
                                "R" : NumberLong(239944435),
                                "W" : NumberLong(504684271)
                        },
                        "timeAcquiringMicros" : {
                                "R" : NumberLong(363352484),
                                "W" : NumberLong(40071003)
                        }
                },
                "admin" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(324419),
                                "w" : NumberLong(0)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(15102),
                                "w" : NumberLong(0)
                        }
                },
                "local" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong("2906631542"),
                                "w" : NumberLong(1934915446)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(947306796),
                                "w" : NumberLong(127960383)
                        }
                },
                "ceilometer" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(2111652),
                                "w" : NumberLong("7198364884")
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(350673512),
                                "w" : NumberLong("72820101758")
                        }
                },
                "meter" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(242),
                                "w" : NumberLong(0)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(61),
                                "w" : NumberLong(0)
                        }
                }
        },
        "network" : {
                "bytesIn" : 120591332234,
                "bytesOut" : 205278085081,
                "numRequests" : 278079450
        },
        "opcounters" : {
                "insert" : 63190877,
                "query" : 394075,
                "update" : 609006,
                "delete" : 0,
                "getmore" : 101139880,
                "command" : 113207296
        },
        "opcountersRepl" : {
                "insert" : 0,
                "query" : 0,
                "update" : 0,
                "delete" : 0,
                "getmore" : 0,
                "command" : 0
        },
        "recordStats" : {
                "accessesNotInMemory" : 17228,
                "pageFaultExceptionsThrown" : 3678,
                "admin" : {
                        "accessesNotInMemory" : 0,
                        "pageFaultExceptionsThrown" : 0
                },
                "ceilometer" : {
                        "accessesNotInMemory" : 4258,
                        "pageFaultExceptionsThrown" : 3678
                },
                "local" : {
                        "accessesNotInMemory" : 12970,
                        "pageFaultExceptionsThrown" : 0
                },
                "meter" : {
                        "accessesNotInMemory" : 0,
                        "pageFaultExceptionsThrown" : 0
                }
        },
        "repl" : {
                "setName" : "tripleo",
                "setVersion" : 1,
                "ismaster" : true,
                "secondary" : false,
                "hosts" : [
                        "172.21.9.26:27017",
                        "172.21.9.25:27017",
                        "172.21.9.18:27017"
                ],
                "primary" : "172.21.9.26:27017",
                "me" : "172.21.9.26:27017",
                "electionId" : ObjectId("590a154484c7e0373569d3e3")
        },
        "writeBacksQueued" : false,
        "mem" : {
                "bits" : 64,
                "resident" : 14552,
                "virtual" : 299891,
                "supported" : true,
                "mapped" : 149464,
                "mappedWithJournal" : 298928
        },
        "metrics" : {
                "cursor" : {
                        "timedOut" : NumberLong(3),
                        "open" : {
                                "noTimeout" : NumberLong(0),
                                "pinned" : NumberLong(0),
                                "total" : NumberLong(1)
                        }
                },
                "document" : {
                        "deleted" : NumberLong(0),
                        "inserted" : NumberLong(63190178),
                        "returned" : NumberLong(127309413),
                        "updated" : NumberLong(609006)
                },
                "getLastError" : {
                        "wtime" : {
                                "num" : 0,
                                "totalMillis" : 0
                        },
                        "wtimeouts" : NumberLong(0)
                },
                "operation" : {
                        "fastmod" : NumberLong(608225),
                        "idhack" : NumberLong(0),
                        "scanAndOrder" : NumberLong(0)
                },
                "queryExecutor" : {
                        "scanned" : NumberLong(0),
                        "scannedObjects" : NumberLong(0)
                },
                "record" : {
                        "moves" : NumberLong(0)
                },
                "repl" : {
                        "apply" : {
                                "batches" : {
                                        "num" : 0,
                                        "totalMillis" : 0
                                },
                                "ops" : NumberLong(0)
                        },
                        "buffer" : {
                                "count" : NumberLong(0),
                                "maxSizeBytes" : 268435456,
                                "sizeBytes" : NumberLong(0)
                        },
                        "network" : {
                                "bytes" : NumberLong(0),
                                "getmores" : {
                                        "num" : 0,
                                        "totalMillis" : 0
                                },
                                "ops" : NumberLong(0),
                                "readersCreated" : NumberLong(12)
                        },
                        "preload" : {
                                "docs" : {
                                        "num" : 0,
                                        "totalMillis" : 0
                                },
                                "indexes" : {
                                        "num" : 0,
                                        "totalMillis" : 0
                                }
                        }
                },
                "storage" : {
                        "freelist" : {
                                "search" : {
                                        "bucketExhausted" : NumberLong(0),
                                        "requests" : NumberLong(64681356),
                                        "scanned" : NumberLong(129175915)
                                }
                        }
                },
                "ttl" : {
                        "deletedDocuments" : NumberLong(331573),
                        "passes" : NumberLong(11011)
                }
        },
        "ok" : 1
}
~~~
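
A few ratios from the serverStatus output are easier to read than the raw counters. The sketch below uses only values from the output above, and the derived names are illustrative. The TTL monitor has completed roughly one pass per 60 seconds of uptime, which matches MongoDB's default TTL interval, while the number of TTL deletions is small compared with the number of inserts.

```python
# Values copied from db.serverStatus() above.
uptime_seconds = 660779            # "uptime"
inserts = 63190877                 # opcounters.insert
ttl_deleted = 331573               # metrics.ttl.deletedDocuments
ttl_passes = 11011                 # metrics.ttl.passes

uptime_days = uptime_seconds / 86400
inserts_per_second = inserts / uptime_seconds
seconds_per_ttl_pass = uptime_seconds / ttl_passes
deleted_ratio = ttl_deleted / inserts

print(f"uptime: {uptime_days:.2f} days")
print(f"inserts: {inserts_per_second:.1f} per second")
print(f"TTL pass interval: {seconds_per_ttl_pass:.1f} s")
print(f"TTL deletions vs. inserts: {deleted_ratio:.2%}")
```

Note that the uptime (about 7.6 days) is only slightly longer than the 7-day TTL.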


Verified that the TTL monitor thread is enabled:

~~~
tripleo:PRIMARY> db.getSiblingDB('admin').runCommand({getParameter: 1, ttlMonitorEnabled: 1})
{ "ttlMonitorEnabled" : true, "ok" : 1 }
~~~

Version-Release number of selected component (if applicable):
RHEL OSP 10 

[root@chn2-ctrl-2 ~]# rpm -qa | grep -i ceilometer
python-ceilometermiddleware-0.5.1-1.el7ost.noarch
openstack-ceilometer-api-7.0.1-1.el7ost.noarch
python-ceilometerclient-2.6.2-1.el7ost.noarch
openstack-ceilometer-compute-7.0.1-1.el7ost.noarch
python-ceilometer-7.0.1-1.el7ost.noarch
openstack-ceilometer-collector-7.0.1-1.el7ost.noarch
puppet-ceilometer-9.5.0-1.el7ost.noarch
openstack-ceilometer-common-7.0.1-1.el7ost.noarch
openstack-ceilometer-polling-7.0.1-1.el7ost.noarch
openstack-ceilometer-notification-7.0.1-1.el7ost.noarch
openstack-ceilometer-central-7.0.1-1.el7ost.noarch


How reproducible:

Every 7 days for the customer.


Steps to Reproduce:
1.
2.
3.

Actual results:
The event collection is filling up very quickly.

Expected results:
The event collection should not fill up at such a fast pace.

Additional info:

Verified that the TTL value is set properly and is working: the following queries return nothing.

~~~
tripleo:PRIMARY> db.event.find({"timestamp" : {$lt : ISODate("2017-05-01T01:20:00Z")}})
tripleo:PRIMARY> db.event.find({"timestamp" : {$lt : ISODate("2017-05-03T01:20:00Z")}})
tripleo:PRIMARY> db.event.find({"timestamp" : {$lt : ISODate("2017-05-04T01:20:00Z")}})
~~~

The following query does return events.

~~~
tripleo:PRIMARY> db.event.find({"timestamp" : {$lt : ISODate("2017-05-05T01:20:00Z")}})
~~~
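
These results are consistent with a 7-day TTL. The sketch below computes the expected expiry cutoff, assuming the queries were run at about the localTime shown in the serverStatus output (the report does not give the exact query time). TTL deletion is not instantaneous, so the real boundary can lag slightly behind the computed one.

```python
from datetime import datetime, timedelta, timezone

# "localTime" from db.serverStatus() above; ASSUMPTION: the find()
# queries were run at roughly the same time.
now = datetime(2017, 5, 11, 9, 9, 9, tzinfo=timezone.utc)
event_ttl = timedelta(seconds=604800)  # event_time_to_live

cutoff = now - event_ttl
print(cutoff.isoformat())  # 2017-05-04T09:09:09+00:00

def expect_results(query_upper_bound):
    """True if a {$lt: bound} query can still match unexpired events."""
    return query_upper_bound > cutoff

# Upper bounds used in the four find() queries above.
bounds = [datetime(2017, 5, day, 1, 20, tzinfo=timezone.utc)
          for day in (1, 3, 4, 5)]
for bound in bounds:
    print(bound.date(), expect_results(bound))
```

Only the 2017-05-05 bound falls after the cutoff, which agrees with the output above.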

Comment 13 Mehdi ABAAKOUK 2017-05-12 10:56:05 UTC
We found that the number of objectstore.http.request events was unexpectedly high.
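
One way to spot such an imbalance is to tally events by type. The sketch below is illustrative only: it assumes each event document carries an event_type field, and the sample documents are hypothetical, not data from this bug.

```python
from collections import Counter

def top_event_types(events, n=3):
    """Return the n most common event types among event documents."""
    return Counter(e["event_type"] for e in events).most_common(n)

# Hypothetical sample documents, for illustration only.
sample = (
    [{"event_type": "objectstore.http.request"}] * 8
    + [{"event_type": "compute.instance.update"}] * 2
)
print(top_event_types(sample))
```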
 
The high volume of objectstore.http.request events suggests that ceilometermiddleware for Swift is misconfigured: Ceilometer sends samples to Gnocchi, which writes data to Swift, which in turn emits more events, creating an infinite loop.

After checking swift-proxy.conf, we found that "ignore_projects = <gnocchi_project_ID>" is missing in section [filter:ceilometer].
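
A check for this setting could look like the sketch below. It is a hedged example, not a supported tool: has_ignore_projects is a hypothetical helper, the config fragments are made up for illustration, and only the section and option names come from the comment above.

```python
import configparser

def has_ignore_projects(conf_text):
    """Return True if [filter:ceilometer] sets a non-empty ignore_projects."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    value = parser.get("filter:ceilometer", "ignore_projects", fallback="")
    return bool(value.strip())

# Hypothetical config fragments; only the section and option names
# come from the comment above.
misconfigured = """
[filter:ceilometer]
paste.filter_factory = ceilometermiddleware.swift:filter_factory
"""
fixed = misconfigured + "ignore_projects = <gnocchi_project_ID>\n"

print(has_ignore_projects(misconfigured))  # False
print(has_ignore_projects(fixed))          # True
```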

Comment 24 Julien Danjou 2017-05-12 17:50:10 UTC
This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1416546 FWIW.

Comment 41 Julien Danjou 2017-08-11 13:25:44 UTC
Yes, this ends up being a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1416546, since the root cause was Swift overloading Ceilometer with events.

*** This bug has been marked as a duplicate of bug 1416546 ***