Description of problem: when ceilometer is configured to use a mongodb replica set as database backend and the master changes, an AutoReconnect exception is raised, but not handled, causing internal server errors. Version-Release number of selected component (if applicable): python-ceilometer-2014.2.1-1.el7ost.noarch How reproducible: always Steps to Reproduce: 1. Setup ceilometer to use a mongodb replica set, e.g. [database] connection = mongodb://10.1.16.196,10.1.16.198,10.1.16.200:27017/ceilometer?replicaSet=ceilometer 2. ceilometer statistics -m vcpus 3. systemctl stop mongod on master node 4. ceilometer statistics -m vcpus Actual results: Internal Server Error (HTTP 500) Expected results: vcpus statistics should be displayed. Additional info: This issue is described upstream in https://bugs.launchpad.net/ceilometer/+bug/1309555 . Fix for master is https://review.openstack.org/122387 , Juno backport (under review) is https://review.openstack.org/137133.
Reproduces here as well. Installed with packstack mongodb is not up and this is why you receive the 500: # ceilometer --debug statistics -m vcpus DEBUG (session) REQ: curl -i -X GET http://10.35.117.4:5000/v2.0/ -H "Accept: application/json" -H "User-Agent: python-keystoneclient" DEBUG (session) RESP: [200] {'date': 'Thu, 12 Feb 2015 12:15:21 GMT', 'content-type': 'application/json', 'content-length': '421', 'vary': 'X-Auth-Token'} RESP BODY: {"version": {"status": "stable", "updated": "2014-04-17T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v2.0+json"}, {"base": "application/xml", "type": "application/vnd.openstack.identity-v2.0+xml"}], "id": "v2.0", "links": [{"href": "http://10.35.117.4:5000/v2.0/", "rel": "self"}, {"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}]}} DEBUG (v2) Making authentication request to http://10.35.117.4:5000/v2.0/tokens DEBUG (client) REQ: curl -i -X 'GET' 'http://10.35.117.4:8777/v2/meters/vcpus/statistics' -H 'User-Agent: ceilometerclient.openstack.common.apiclient' -H 'X-Auth-Token: 242e417927a940359d7002e15b2009ea' DEBUG (client) RESP: [500] {'date': 'Thu, 12 Feb 2015 12:15:21 GMT', 'content-length': '144', 'content-type': 'application/json', 'server': 'WSGIServer/0.1 Python/2.7.5'} DEBUG (client) RESP BODY: {"error_message": {"debuginfo": null, "faultcode": "Server", "faultstring": "could not connect to 10.35.117.4:27017: [Errno 111] ECONNREFUSED"}} DEBUG (client) Request returned failure status: 500 Traceback (most recent call last): File "/usr/bin/ceilometer", line 10, in <module> sys.exit(main()) File "/usr/lib/python2.7/site-packages/ceilometerclient/shell.py", line 275, in main CeilometerShell().main(args) File "/usr/lib/python2.7/site-packages/ceilometerclient/shell.py", line 224, in main args.func(client, args) File "/usr/lib/python2.7/site-packages/ceilometerclient/v2/shell.py", line 93, in do_statistics statistics = cc.statistics.list(**api_args) File "/usr/lib/python2.7/site-packages/ceilometerclient/v2/statistics.py", line 54, in list q, p)) File "/usr/lib/python2.7/site-packages/ceilometerclient/common/base.py", line 59, in _list resp = self.api.get(url) File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 334, in get return self.client_request("GET", url, **kwargs) File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 328, in client_request self, method, url, **kwargs) File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 244, in client_request method, self.concat_url(endpoint, url), **kwargs) File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 184, in request raise exceptions.from_response(resp, method, url) ceilometerclient.openstack.common.apiclient.exceptions.InternalServerError: Internal Server Error (HTTP 500) running mongodb manually: # mongod mongod --help for help and startup options 2015-02-12T14:13:31.635+0200 [initandlisten] MongoDB starting : pid=41206 port=27017 dbpath=/data/db 64-bit host=lynx02.qa.lab.tlv.redhat.com 2015-02-12T14:13:31.635+0200 [initandlisten] db version v2.6.5 2015-02-12T14:13:31.636+0200 [initandlisten] git version: nogitversion 2015-02-12T14:13:31.636+0200 [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013 2015-02-12T14:13:31.636+0200 [initandlisten] build info: Linux x86-025.build.eng.bos.redhat.com 2.6.32-504.1.3.el6.x86_64 #1 SMP Fri Oct 31 11:37:10 EDT 2014 x86_64 BOOST_LIB_VERSION=1_53 2015-02-12T14:13:31.636+0200 [initandlisten] allocator: tcmalloc 2015-02-12T14:13:31.636+0200 [initandlisten] options: {} 2015-02-12T14:13:31.636+0200 [initandlisten] exception in initAndListen: 10296 ********************************************************************* ERROR: dbpath (/data/db) does not exist. Create this directory or give existing directory in --dbpath. See http://dochub.mongodb.org/core/startingandstoppingmongo ********************************************************************* , terminating 2015-02-12T14:13:31.636+0200 [initandlisten] dbexit: 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: going to close listening sockets... 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: going to flush diaglog... 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: going to close sockets... 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: waiting for fs preallocator... 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: lock for final commit... 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: final commit... 2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: closing all files... 2015-02-12T14:13:31.636+0200 [initandlisten] closeAllFiles() finished 2015-02-12T14:13:31.636+0200 [initandlisten] dbexit: really exiting now
Furthermore, When I restart mongodb service it gets 'up' OK with 'service mongod start' but will crash as soon as above command is called: [root@lynx02 mongodb(keystone_admin)]# tail -f mongodb.log 2015-02-12T14:31:45.467+0200 [initandlisten] ** http://dochub.mongodb.org/core/readahead 2015-02-12T14:31:45.500+0200 [initandlisten] journal dir=/var/lib/mongodb/journal 2015-02-12T14:31:45.500+0200 [initandlisten] recover begin 2015-02-12T14:31:45.500+0200 [initandlisten] recover lsn: 0 2015-02-12T14:31:45.500+0200 [initandlisten] recover /var/lib/mongodb/journal/j._0 2015-02-12T14:31:45.573+0200 [initandlisten] recover cleaning up 2015-02-12T14:31:45.573+0200 [initandlisten] removeJournalFiles 2015-02-12T14:31:45.625+0200 [initandlisten] recover done 2015-02-12T14:31:45.625+0200 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.0 2015-02-12T14:31:46.561+0200 [initandlisten] waiting for connections on port 27017 2015-02-12T14:33:17.472+0200 [conn2] SEVERE: Invalid access at address: 0x20 2015-02-12T14:33:17.486+0200 [conn2] SEVERE: Got signal: 11 (Segmentation fault). Backtrace:0x12f7913 0x12f50fe 0x12f51d3 0x7fb8a60cf130 0x7fb8a77a9eef 0x7fb8a7847f5b 0x7fb8a77aa80e 0x7fb8a77aae09 0x7fb8a781fb5e 0x7fb8a7790ce2 0x7fb8a781ff71 0x7fb8a7822fa1 0x125b7a1 0x1267c7d 0x1255bc1 0x1253062 0xd1ad53 0xd1ed89 0xd79347 0xd7a1e1 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x27) [0x12f7913] /usr/bin/mongod() [0x12f50fe] /usr/bin/mongod() [0x12f51d3] /lib64/libpthread.so.0(+0xf130) [0x7fb8a60cf130] /lib64/libv8.so.3(_ZN2v88internal2OS8AllocateEmPmb+0x8f) [0x7fb8a77a9eef] /lib64/libv8.so.3(_ZN2v88internal28CreateTranscendentalFunctionENS0_19TranscendentalCache4TypeE+0x3b) [0x7fb8a7847f5b] /lib64/libv8.so.3(_ZN2v88internal22init_fast_sin_functionEv+0xe) [0x7fb8a77aa80e] /lib64/libv8.so.3(_ZN2v88internal14POSIXPostSetUpEv+0x9) [0x7fb8a77aae09] /lib64/libv8.so.3(_ZN2v88internal2V828InitializeOncePerProcessImplEv+0x3e) [0x7fb8a781fb5e] /lib64/libv8.so.3(_ZN2v88internal12CallOnceImplEPlPFvPvES2_+0x52) [0x7fb8a7790ce2] /lib64/libv8.so.3(_ZN2v88internal2V810InitializeEPNS0_12DeserializerE+0x11) [0x7fb8a781ff71] /lib64/libv8.so.3(_ZN2v86LockerC2EPNS_7IsolateE+0x61) [0x7fb8a7822fa1] /usr/bin/mongod(_ZN5mongo7V8ScopeC2EPNS_14V8ScriptEngineE+0x3db) [0x125b7a1] /usr/bin/mongod(_ZN5mongo14V8ScriptEngine11createScopeEv+0x2b) [0x1267c7d] /usr/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x23) [0x1255bc1] /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSsS2_+0x70) [0x1253062] /usr/bin/mongod(_ZN5mongo2mr5State4initEv+0x69) [0xd1ad53] /usr/bin/mongod(_ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x393) [0xd1ed89] /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x8b) [0xd79347] /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xab3) [0xd7a1e1]
After talking with Eoghan on IRC this is probably not related after all so please disregard my comment 3 and 4. I'll open a new bug here.
https://review.openstack.org/122387 landed in time for kilo, so this should be fixed already
I can confirm as well, the fix for this already exists in our downstream repos.
Replica set of 3 nodes. 1. ceilometer --debug statistics -m snapshot.create.end +--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+ | Period | Period Start | Period End | Max | Min | Avg | Sum | Count | Duration | Duration Start | Duration End | +--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+ | 0 | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 | 1.0 | 1.0 | 1.0 | 5.0 | 5 | 1.549 | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 | +--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+ 2. systemctl stop mongod.service 3. ceilometer --debug statistics -m snapshot.create.end DEBUG (session) REQ: curl -g -i -X GET http://127.0.0.1:5000/v2.0 -H "Accept: application/json" -H "User-Agent: python-keystoneclient" INFO (connectionpool) Starting new HTTP connection (1): 127.0.0.1 DEBUG (connectionpool) "GET /v2.0 HTTP/1.1" 200 337 DEBUG (session) RESP: [200] content-length: 337 vary: X-Auth-Token server: Apache/2.4.6 (Red Hat Enterprise Linux) connection: close date: Fri, 26 Jun 2015 11:35:57 GMT content-type: application/json x-openstack-request-id: req-8c061c12-0fc7-410c-8505-14957f86c2de RESP BODY: {"version": {"status": "stable", "updated": "2014-04-17T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v2.0+json"}], "id": "v2.0", "links": [{"href": "http://127.0.0.1:5000/v2.0/", "rel": "self"}, {"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}]}} DEBUG (v2) Making authentication request to http://127.0.0.1:5000/v2.0/tokens INFO (connectionpool) Resetting dropped connection: 127.0.0.1 DEBUG (connectionpool) "POST /v2.0/tokens HTTP/1.1" 200 4434 DEBUG (client) REQ: curl -g -i -X 'GET' 'http://127.0.0.1:8777/v2/meters/snapshot.create.end/statistics' -H 'User-Agent: ceilometerclient.openstack.common.apiclient' -H 'X-Auth-Token: {SHA1}c855658a511ae2c08cc903374eac000275ace4d3' INFO (connectionpool) Starting new HTTP connection (1): 127.0.0.1 DEBUG (connectionpool) "GET /v2/meters/snapshot.create.end/statistics HTTP/1.1" 200 315 DEBUG (client) RESP: [200] {'date': 'Fri, 26 Jun 2015 11:36:07 GMT', 'content-length': '315', 'content-type': 'application/json; charset=UTF-8', 'x-openstack-request-id': 'req-aa0803b2-1e2d-4043-ab61-4e1672a3703e', 'server': 'Werkzeug/0.9.1 Python/2.7.5'} DEBUG (client) RESP BODY: [{"count": 5, "duration_start": "2015-06-26T11:16:42.231000", "min": 1.0, "duration_end": "2015-06-26T11:16:43.780000", "max": 1.0, "sum": 5.0, "period": 0, "period_end": "2015-06-26T11:16:43.780000", "duration": 1.549, "period_start": "2015-06-26T11:16:42.231000", "avg": 1.0, "groupby": null, "unit": "snapshot"}] +--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+ | Period | Period Start | Period End | Max | Min | Avg | Sum | Count | Duration | Duration Start | Duration End | +--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+ | 0 | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 | 1.0 | 1.0 | 1.0 | 5.0 | 5 | 1.549 | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 | +--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+ 4. Excerpt from api.log ... 2015-06-26 13:35:57.876 98877 WARNING ceilometer.storage.mongo.utils [-] Unable to reconnect to the primary mongodb: No replica set primary available for query with ReadPreference PRIMARY. Trying again in 10 seconds. 2015-06-26 13:36:07.914 98877 INFO werkzeug [-] 127.0.0.1 - - [26/Jun/2015 13:36:07] "GET /v2/meters/snapshot.create.end/statistics HTTP/1.1" 200 - ...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2015:1548