Bug 1192010 - Retry connection to mongodb on AutoReconnect exceptions
Summary: Retry connection to mongodb on AutoReconnect exceptions
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ceilometer
Version: 6.0 (Juno)
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ga
: 7.0 (Kilo)
Assignee: Eoghan Glynn
QA Contact: Yurii Prokulevych
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-02-12 11:44 UTC by Javier Peña
Modified: 2023-02-22 23:02 UTC (History)
7 users (show)

Fixed In Version: openstack-ceilometer-2015.1.0-2.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-05 13:21:11 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1309555 0 None None None Never
OpenStack gerrit 137133 0 None None None Never
Red Hat Product Errata RHEA-2015:1548 0 normal SHIPPED_LIVE Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory 2015-08-05 17:07:06 UTC

Description Javier Peña 2015-02-12 11:44:57 UTC
Description of problem: when ceilometer is configured to use a mongodb replica set as database backend and the master changes, an AutoReconnect exception is raised, but not handled, causing internal server errors.

Version-Release number of selected component (if applicable):
python-ceilometer-2014.2.1-1.el7ost.noarch

How reproducible: always


Steps to Reproduce:
1. Setup ceilometer to use a mongodb replica set, e.g.
[database]
connection = mongodb://10.1.16.196,10.1.16.198,10.1.16.200:27017/ceilometer?replicaSet=ceilometer

2. ceilometer statistics -m vcpus
3. systemctl stop mongod on master node
4. ceilometer statistics -m vcpus

Actual results:
Internal Server Error (HTTP 500)

Expected results:
vcpus statistics should be displayed.

Additional info:
This issue is described upstream in https://bugs.launchpad.net/ceilometer/+bug/1309555 . Fix for master is https://review.openstack.org/122387 , Juno backport (under review) is https://review.openstack.org/137133.

Comment 3 Amit Ugol 2015-02-12 12:18:57 UTC
Reproduces here as well.
Installed with packstack
mongodb is not up and this is why you receive the 500:

# ceilometer --debug statistics -m vcpus
DEBUG (session) REQ: curl -i -X GET http://10.35.117.4:5000/v2.0/ -H "Accept: application/json" -H "User-Agent: python-keystoneclient"
DEBUG (session) RESP: [200] {'date': 'Thu, 12 Feb 2015 12:15:21 GMT', 'content-type': 'application/json', 'content-length': '421', 'vary': 'X-Auth-Token'} 
RESP BODY: {"version": {"status": "stable", "updated": "2014-04-17T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v2.0+json"}, {"base": "application/xml", "type": "application/vnd.openstack.identity-v2.0+xml"}], "id": "v2.0", "links": [{"href": "http://10.35.117.4:5000/v2.0/", "rel": "self"}, {"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}]}}

DEBUG (v2) Making authentication request to http://10.35.117.4:5000/v2.0/tokens
DEBUG (client) REQ: curl -i -X 'GET' 'http://10.35.117.4:8777/v2/meters/vcpus/statistics' -H 'User-Agent: ceilometerclient.openstack.common.apiclient' -H 'X-Auth-Token: 242e417927a940359d7002e15b2009ea'
DEBUG (client) RESP: [500] {'date': 'Thu, 12 Feb 2015 12:15:21 GMT', 'content-length': '144', 'content-type': 'application/json', 'server': 'WSGIServer/0.1 Python/2.7.5'}

DEBUG (client) RESP BODY: {"error_message": {"debuginfo": null, "faultcode": "Server", "faultstring": "could not connect to 10.35.117.4:27017: [Errno 111] ECONNREFUSED"}}

DEBUG (client) Request returned failure status: 500
Traceback (most recent call last):
  File "/usr/bin/ceilometer", line 10, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/site-packages/ceilometerclient/shell.py", line 275, in main
    CeilometerShell().main(args)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/shell.py", line 224, in main
    args.func(client, args)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/v2/shell.py", line 93, in do_statistics
    statistics = cc.statistics.list(**api_args)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/v2/statistics.py", line 54, in list
    q, p))
  File "/usr/lib/python2.7/site-packages/ceilometerclient/common/base.py", line 59, in _list
    resp = self.api.get(url)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 334, in get
    return self.client_request("GET", url, **kwargs)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 328, in client_request
    self, method, url, **kwargs)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 244, in client_request
    method, self.concat_url(endpoint, url), **kwargs)
  File "/usr/lib/python2.7/site-packages/ceilometerclient/openstack/common/apiclient/client.py", line 184, in request
    raise exceptions.from_response(resp, method, url)
ceilometerclient.openstack.common.apiclient.exceptions.InternalServerError: Internal Server Error (HTTP 500)

running mongodb manually:

# mongod 
mongod --help for help and startup options
2015-02-12T14:13:31.635+0200 [initandlisten] MongoDB starting : pid=41206 port=27017 dbpath=/data/db 64-bit host=lynx02.qa.lab.tlv.redhat.com
2015-02-12T14:13:31.635+0200 [initandlisten] db version v2.6.5
2015-02-12T14:13:31.636+0200 [initandlisten] git version: nogitversion
2015-02-12T14:13:31.636+0200 [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2015-02-12T14:13:31.636+0200 [initandlisten] build info: Linux x86-025.build.eng.bos.redhat.com 2.6.32-504.1.3.el6.x86_64 #1 SMP Fri Oct 31 11:37:10 EDT 2014 x86_64 BOOST_LIB_VERSION=1_53
2015-02-12T14:13:31.636+0200 [initandlisten] allocator: tcmalloc
2015-02-12T14:13:31.636+0200 [initandlisten] options: {}
2015-02-12T14:13:31.636+0200 [initandlisten] exception in initAndListen: 10296 
*********************************************************************
 ERROR: dbpath (/data/db) does not exist.
 Create this directory or give existing directory in --dbpath.
 See http://dochub.mongodb.org/core/startingandstoppingmongo
*********************************************************************
, terminating
2015-02-12T14:13:31.636+0200 [initandlisten] dbexit: 
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: going to close listening sockets...
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: going to flush diaglog...
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: going to close sockets...
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: waiting for fs preallocator...
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: lock for final commit...
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: final commit...
2015-02-12T14:13:31.636+0200 [initandlisten] shutdown: closing all files...
2015-02-12T14:13:31.636+0200 [initandlisten] closeAllFiles() finished
2015-02-12T14:13:31.636+0200 [initandlisten] dbexit: really exiting now

Comment 4 Amit Ugol 2015-02-12 12:34:04 UTC
Furthermore, When I restart mongodb service it gets 'up' OK with 'service mongod start' but will crash as soon as above command is called:


[root@lynx02 mongodb(keystone_admin)]# tail -f mongodb.log 
2015-02-12T14:31:45.467+0200 [initandlisten] **          http://dochub.mongodb.org/core/readahead
2015-02-12T14:31:45.500+0200 [initandlisten] journal dir=/var/lib/mongodb/journal
2015-02-12T14:31:45.500+0200 [initandlisten] recover begin
2015-02-12T14:31:45.500+0200 [initandlisten] recover lsn: 0
2015-02-12T14:31:45.500+0200 [initandlisten] recover /var/lib/mongodb/journal/j._0
2015-02-12T14:31:45.573+0200 [initandlisten] recover cleaning up
2015-02-12T14:31:45.573+0200 [initandlisten] removeJournalFiles
2015-02-12T14:31:45.625+0200 [initandlisten] recover done
2015-02-12T14:31:45.625+0200 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.0
2015-02-12T14:31:46.561+0200 [initandlisten] waiting for connections on port 27017
2015-02-12T14:33:17.472+0200 [conn2] SEVERE: Invalid access at address: 0x20
2015-02-12T14:33:17.486+0200 [conn2] SEVERE: Got signal: 11 (Segmentation fault).
Backtrace:0x12f7913 0x12f50fe 0x12f51d3 0x7fb8a60cf130 0x7fb8a77a9eef 0x7fb8a7847f5b 0x7fb8a77aa80e 0x7fb8a77aae09 0x7fb8a781fb5e 0x7fb8a7790ce2 0x7fb8a781ff71 0x7fb8a7822fa1 0x125b7a1 0x1267c7d 0x1255bc1 0x1253062 0xd1ad53 0xd1ed89 0xd79347 0xd7a1e1 
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x27) [0x12f7913]
 /usr/bin/mongod() [0x12f50fe]
 /usr/bin/mongod() [0x12f51d3]
 /lib64/libpthread.so.0(+0xf130) [0x7fb8a60cf130]
 /lib64/libv8.so.3(_ZN2v88internal2OS8AllocateEmPmb+0x8f) [0x7fb8a77a9eef]
 /lib64/libv8.so.3(_ZN2v88internal28CreateTranscendentalFunctionENS0_19TranscendentalCache4TypeE+0x3b) [0x7fb8a7847f5b]
 /lib64/libv8.so.3(_ZN2v88internal22init_fast_sin_functionEv+0xe) [0x7fb8a77aa80e]
 /lib64/libv8.so.3(_ZN2v88internal14POSIXPostSetUpEv+0x9) [0x7fb8a77aae09]
 /lib64/libv8.so.3(_ZN2v88internal2V828InitializeOncePerProcessImplEv+0x3e) [0x7fb8a781fb5e]
 /lib64/libv8.so.3(_ZN2v88internal12CallOnceImplEPlPFvPvES2_+0x52) [0x7fb8a7790ce2]
 /lib64/libv8.so.3(_ZN2v88internal2V810InitializeEPNS0_12DeserializerE+0x11) [0x7fb8a781ff71]
 /lib64/libv8.so.3(_ZN2v86LockerC2EPNS_7IsolateE+0x61) [0x7fb8a7822fa1]
 /usr/bin/mongod(_ZN5mongo7V8ScopeC2EPNS_14V8ScriptEngineE+0x3db) [0x125b7a1]
 /usr/bin/mongod(_ZN5mongo14V8ScriptEngine11createScopeEv+0x2b) [0x1267c7d]
 /usr/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv+0x23) [0x1255bc1]
 /usr/bin/mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSsS2_+0x70) [0x1253062]
 /usr/bin/mongod(_ZN5mongo2mr5State4initEv+0x69) [0xd1ad53]
 /usr/bin/mongod(_ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x393) [0xd1ed89]
 /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x8b) [0xd79347]
 /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xab3) [0xd7a1e1]

Comment 5 Amit Ugol 2015-02-12 13:51:33 UTC
After talking with Eoghan on IRC this is probably not related after all so please disregard my comment 3 and 4.
I'll open a new bug here.

Comment 6 Eoghan Glynn 2015-06-22 13:58:53 UTC
https://review.openstack.org/122387 landed in time for kilo, so this should be fixed already

Comment 7 Pradeep Kilambi 2015-06-23 13:38:04 UTC
I can confirm as well, the fix for this already exists in our downstream repos.

Comment 11 Yurii Prokulevych 2015-06-26 11:41:56 UTC
Replica set of 3 nodes.

1. ceilometer --debug statistics -m snapshot.create.end
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
| Period | Period Start               | Period End                 | Max | Min | Avg | Sum | Count | Duration | Duration Start             | Duration End               |
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
| 0      | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 | 1.0 | 1.0 | 1.0 | 5.0 | 5     | 1.549    | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 |
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+

2. systemctl stop mongod.service

3. ceilometer --debug statistics -m snapshot.create.end
DEBUG (session) REQ: curl -g -i -X GET http://127.0.0.1:5000/v2.0 -H "Accept: application/json" -H "User-Agent: python-keystoneclient"
INFO (connectionpool) Starting new HTTP connection (1): 127.0.0.1
DEBUG (connectionpool) "GET /v2.0 HTTP/1.1" 200 337
DEBUG (session) RESP: [200] content-length: 337 vary: X-Auth-Token server: Apache/2.4.6 (Red Hat Enterprise Linux) connection: close date: Fri, 26 Jun 2015 11:35:57 GMT content-type: application/json x-openstack-request-id: req-8c061c12-0fc7-410c-8505-14957f86c2de
RESP BODY: {"version": {"status": "stable", "updated": "2014-04-17T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v2.0+json"}], "id": "v2.0", "links": [{"href": "http://127.0.0.1:5000/v2.0/", "rel": "self"}, {"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}]}}

DEBUG (v2) Making authentication request to http://127.0.0.1:5000/v2.0/tokens
INFO (connectionpool) Resetting dropped connection: 127.0.0.1
DEBUG (connectionpool) "POST /v2.0/tokens HTTP/1.1" 200 4434
DEBUG (client) REQ: curl -g -i -X 'GET' 'http://127.0.0.1:8777/v2/meters/snapshot.create.end/statistics' -H 'User-Agent: ceilometerclient.openstack.common.apiclient' -H 'X-Auth-Token: {SHA1}c855658a511ae2c08cc903374eac000275ace4d3'
INFO (connectionpool) Starting new HTTP connection (1): 127.0.0.1
DEBUG (connectionpool) "GET /v2/meters/snapshot.create.end/statistics HTTP/1.1" 200 315
DEBUG (client) RESP: [200] {'date': 'Fri, 26 Jun 2015 11:36:07 GMT', 'content-length': '315', 'content-type': 'application/json; charset=UTF-8', 'x-openstack-request-id': 'req-aa0803b2-1e2d-4043-ab61-4e1672a3703e', 'server': 'Werkzeug/0.9.1 Python/2.7.5'}

DEBUG (client) RESP BODY: [{"count": 5, "duration_start": "2015-06-26T11:16:42.231000", "min": 1.0, "duration_end": "2015-06-26T11:16:43.780000", "max": 1.0, "sum": 5.0, "period": 0, "period_end": "2015-06-26T11:16:43.780000", "duration": 1.549, "period_start": "2015-06-26T11:16:42.231000", "avg": 1.0, "groupby": null, "unit": "snapshot"}]

+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
| Period | Period Start               | Period End                 | Max | Min | Avg | Sum | Count | Duration | Duration Start             | Duration End               |
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+
| 0      | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 | 1.0 | 1.0 | 1.0 | 5.0 | 5     | 1.549    | 2015-06-26T11:16:42.231000 | 2015-06-26T11:16:43.780000 |
+--------+----------------------------+----------------------------+-----+-----+-----+-----+-------+----------+----------------------------+----------------------------+

4. Excerpt from api.log
...
2015-06-26 13:35:57.876 98877 WARNING ceilometer.storage.mongo.utils [-] Unable to reconnect to the primary mongodb: No replica set primary available for query with ReadPreference PRIMARY. Trying again in 10 seconds.
2015-06-26 13:36:07.914 98877 INFO werkzeug [-] 127.0.0.1 - - [26/Jun/2015 13:36:07] "GET /v2/meters/snapshot.create.end/statistics HTTP/1.1" 200 -
...

Comment 13 errata-xmlrpc 2015-08-05 13:21:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1548


Note You need to log in before you can comment on or make changes to this bug.