Bug 1492141 - Major upgrade documentation: Mention that the 'metrics' pool needs to be created if Ceph is used
Keywords:
Status: CLOSED DUPLICATE of bug 1412295
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: RHOS Documentation Team
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-15 15:04 UTC by Andreas Karis
Modified: 2020-12-14 10:08 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-18 14:40:11 UTC
Target Upstream Version:
Embargoed:


Description Andreas Karis 2017-09-15 15:04:39 UTC
Major upgrade documentation: Please mention that the 'metrics' pool needs to be created if Ceph is used

Hi,

I recently had a case where a customer upgraded 2 environments to OSP 10. 

What happened is that:

After the upgrade from OSP 9 to OSP 10, the "metrics" pool was missing. Because of that, gnocchi kept creating connections to redis, far more than it should have. This exhausted the number of sockets on the system, because too many sockets were left in TIME_WAIT. As a result, rabbitmq failed because it could not open new ports, and httpd hung as well.
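
To confirm the socket exhaustion described above, the number of sockets in TIME_WAIT can be counted on a controller. This is only a minimal sketch: `count_time_wait` is a hypothetical helper name, and it assumes the `ss` tool from iproute2 is installed.

```shell
# Hypothetical helper: count TCP sockets currently in TIME_WAIT.
count_time_wait() {
    # `ss -tan state time-wait` prints a header line followed by
    # one line per matching socket, so skip the header before counting.
    ss -tan state time-wait | tail -n +2 | wc -l
}
```

Running `count_time_wait` a few times in a row should show whether the number keeps growing.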


Should I create a new bugzilla, or can we assign this ticket to tripleo or documentation (or should we keep it on gnocchi)? The goal is that either we create a 'metrics' pool ourselves if this is internal ceph, or the customer knows that they have to create a 'metrics' pool if this is external ceph.

The exact number of PGs depends on the environment, but in this specific case, we ran this command against ceph to create the metrics pool:
~~~
ceph osd pool create metrics 64 64
~~~
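
For documentation purposes, the same step could be written so that it is safe to run more than once. The following is a sketch only: `ensure_pool` is a hypothetical helper name, it assumes `ceph osd pool ls` is available in the installed Ceph release, and the PG count still has to be chosen per environment.

```shell
# Hypothetical helper: create a Ceph pool only if it does not exist yet.
# Usage: ensure_pool <pool-name> <pg-count>
ensure_pool() {
    pool="$1"
    pg_num="$2"
    # `ceph osd pool ls` prints one pool name per line;
    # grep -x only matches when the whole line equals the pool name.
    if ceph osd pool ls | grep -qx -- "$pool"; then
        echo "pool '$pool' already exists"
    else
        # Same command as above, with pg_num and pgp_num set to the same value.
        ceph osd pool create "$pool" "$pg_num" "$pg_num"
    fi
}
```

In the case described here, the call would have been `ensure_pool metrics 64`.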

After that, the error messages in gnocchi went away (checked with tail -f /var/log/gnocchi/metricd.log on the controllers).

As a reminder, we saw this error message:
++++++++++++++++++++++++
From /var/log/gnocchi/metricd.log
~~~
2017-09-13 15:17:45.479 603330 INFO gnocchi.storage.ceph [-] Ceph storage backend use 'cradox' python library
2017-09-13 15:17:45.487 603330 ERROR cotyledon [-] Unhandled exception
2017-09-13 15:17:45.487 603330 ERROR cotyledon Traceback (most recent call last):
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/cotyledon/__init__.py", line 52, in _exit_on_exception
2017-09-13 15:17:45.487 603330 ERROR cotyledon     yield
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/cotyledon/__init__.py", line 130, in _run
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self.run()
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 92, in run
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self._configure()
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 87, in _configure
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self.store = storage.get_driver(self.conf)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 159, in get_driver
2017-09-13 15:17:45.487 603330 ERROR cotyledon     return get_driver_class(conf)(conf.storage)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "/usr/lib/python2.7/site-packages/gnocchi/storage/ceph.py", line 99, in __init__
2017-09-13 15:17:45.487 603330 ERROR cotyledon     self.ioctx = self.rados.open_ioctx(self.pool)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "cradox.pyx", line 413, in cradox.requires.wrapper.validate_func (cradox.c:4188)
2017-09-13 15:17:45.487 603330 ERROR cotyledon   File "cradox.pyx", line 1047, in cradox.Rados.open_ioctx (cradox.c:12325)
2017-09-13 15:17:45.487 603330 ERROR cotyledon ObjectNotFound: error opening pool 'metrics'
2017-09-13 15:17:45.487 603330 ERROR cotyledon
~~~
++++++++++++++++++++++++
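
To check whether a controller is hitting this problem, the log can be searched for the final line of the traceback. A minimal sketch, where `missing_pools_in_log` is a hypothetical helper name and the only assumption about the log is the "error opening pool '<name>'" wording shown above:

```shell
# Hypothetical helper: list the pool names that appear in
# "error opening pool '<name>'" messages in the given log file.
missing_pools_in_log() {
    log_file="$1"
    # grep -o keeps only the matching part of each line,
    # sed extracts the name between the single quotes,
    # sort -u reports each pool once.
    grep -o "error opening pool '[^']*'" "$log_file" \
        | sed "s/.*'\(.*\)'/\1/" \
        | sort -u
}
```

For example, `missing_pools_in_log /var/log/gnocchi/metricd.log` run against the excerpt above would print the pool name from the ObjectNotFound line.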

