Bug 1492141

Summary: Major upgrade documentation: Mention that the 'metrics' pool needs to be created if Ceph is used
Product: Red Hat OpenStack
Reporter: Andreas Karis <akaris>
Component: documentation
Assignee: RHOS Documentation Team <rhos-docs>
Status: CLOSED DUPLICATE
QA Contact: RHOS Documentation Team <rhos-docs>
Severity: high
Priority: unspecified
Version: 10.0 (Newton)
CC: augol, mburns, srevivo, yprokule
Last Closed: 2017-09-18 14:40:11 UTC
Type: Bug

Description Andreas Karis 2017-09-15 15:04:39 UTC
Major upgrade documentation: Please mention that the 'metrics' pool needs to be created if Ceph is used

Hi,

I recently had a case where a customer upgraded two environments to OSP 10.

What happened:

After the upgrade from OSP 9 to OSP 10, the "metrics" pool did not exist in Ceph. This caused gnocchi to open an excessive number of connections to redis. The sockets piled up in TIME_WAIT and exhausted the number of sockets available on the system. As a result, rabbitmq failed because it could not open new ports, and httpd hung as well.
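
To confirm this kind of socket exhaustion on a controller, the TIME_WAIT sockets can be counted from `ss -tan` output. The following is only a sketch: `count_time_wait` is a hypothetical helper name, and 6379 is the Redis default port, which is an assumption because this report does not state the port in use.

```shell
# Sketch: count TIME-WAIT sockets from 'ss -tan' output read on stdin.
# count_time_wait is a hypothetical helper, not part of any package.
# The optional argument limits the count to one peer port.
count_time_wait() {
    awk -v port="${1:-}" '
        # Column 1 of "ss -tan" is the socket state, column 5 the peer address:port.
        $1 == "TIME-WAIT" && (port == "" || $5 ~ (":" port "$")) { n++ }
        END { print n + 0 }
    '
}
```

For example, `ss -tan | count_time_wait` prints the total, and `ss -tan | count_time_wait 6379` prints only the sockets whose peer port is 6379 (assumed to be redis).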


Should I create a new bugzilla, or can we assign this ticket to tripleo or documentation (or should we keep it on gnocchi)? The goal is that either the upgrade creates the 'metrics' pool when Ceph is internal, or the customer knows that they have to create the 'metrics' pool themselves when Ceph is external.

The exact number of PGs depends on the environment, but in this specific case, we ran this command against ceph to create the metrics pool:
~~~
ceph osd pool create metrics 64 64
~~~

After that, the error messages in gnocchi went away (checked with tail -f /var/log/gnocchi/metricd.log on the controllers).
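
A more defensive variant of the pool creation above would first check whether the pool exists. This is a minimal sketch, not a tested upgrade step: `ensure_pool` is a hypothetical helper name, and the defaults (`metrics`, 64 PGs) are just the values used in this case. It assumes a Ceph release that provides `ceph osd pool ls`.

```shell
# Sketch: create a Ceph pool only when it does not already exist.
# ensure_pool is a hypothetical helper; 'metrics' and 64 PGs are the values
# from this report, and the right PG count depends on the environment.
ensure_pool() {
    pool="${1:-metrics}"
    pgs="${2:-64}"

    # 'ceph osd pool ls' prints one pool name per line;
    # grep -x matches whole lines only, -F treats the name literally.
    if ceph osd pool ls | grep -qxF -- "$pool"; then
        echo "pool '$pool' already exists"
    else
        ceph osd pool create "$pool" "$pgs" "$pgs"
    fi
}
```

Calling `ensure_pool metrics 64` on a node with Ceph admin credentials is safe to repeat, because the create command only runs when the pool is missing.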

As a reminder, we saw this error message:
++++++++++++++++++++++++
From /var/log/gnocchi/metricd.log
~~~
2017-09-13 15:17:45.479 603330 INFO gnocchi.storage.ceph [-] Ceph storage backend use 'cradox' python library
2017-09-13 15:17:45.487 603330 ERROR cotyledon [-] Unhandled exception
2017-09-13 15:17:45.487 603330 ERROR cotyledon Traceback (most recent call last):
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/cotyledon/__init__.py", line 52, in _exit_on_exception
2017-09-13 15:17:45.487 603330 ERROR cotyledon     yield
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/cotyledon/__init__.py", line 130, in _run
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self.run()
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 92, in run
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self._configure()
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 87, in _configure
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self.store = storage.get_driver(self.conf)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 159, in get_driver
2017-09-13 15:17:45.487 603330 ERROR cotyledon     return get_driver_class(conf)(conf.storage)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/storage/ceph.py", line 99, in __init__
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self.ioctx = self.rados.open_ioctx(self.pool)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "cradox.pyx", line 413, in cradox.requires.wrapper.validate_func (cradox.c:4188)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "cradox.pyx", line 1047, in cradox.Rados.open_ioctx (cradox.c:12325)
2017-09-13 15:17:45.487 603330 ERROR cotyledon ObjectNotFound: error opening pool 'metrics'
2017-09-13 15:17:45.487 603330 ERROR cotyledon
~~~
++++++++++++++++++++++++
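
To check other controllers for the same problem, the pool name can be pulled out of the ObjectNotFound line in the traceback above. This is only a sketch: `missing_pools` is a hypothetical helper name, and the pattern is based solely on the message format shown in this log.

```shell
# Sketch: list the Ceph pools that a log reports as missing.
# missing_pools is a hypothetical helper; the default path is the
# metricd log location mentioned in this report.
missing_pools() {
    log="${1:-/var/log/gnocchi/metricd.log}"

    # Print only the quoted pool name from matching lines, de-duplicated.
    sed -n "s/.*ObjectNotFound: error opening pool '\([^']*\)'.*/\1/p" "$log" | sort -u
}
```

Running `missing_pools` on a controller prints each missing pool name once, and prints nothing if the log contains no such error.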