.The Ceph Manager no longer crashes during large increases to `pg_num` and `pgp_num`
Previously, the code that adjusts placement groups did not correctly handle large increases to the `pg_num` and `pgp_num` parameters, which led to an integer underflow that could crash the Ceph Manager.
With this release, the code that adjusts placement groups was fixed. As a result, large increases to placement groups do not cause the Ceph Manager to crash.
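
The following is a minimal sketch of how an integer underflow of this kind can arise and how a clamp avoids it. It is an illustration only, not the Ceph source: the function names `room_unchecked` and `room_clamped`, the parameters `cap` and `in_flight`, and the use of 32-bit unsigned counters are all assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from Ceph): how many more placement groups
// may be created, given a cap and the number already in flight.
// Both operands are unsigned, so when in_flight exceeds cap the subtraction
// wraps around modulo 2^32 instead of producing a negative number.
constexpr std::uint32_t room_unchecked(std::uint32_t cap, std::uint32_t in_flight) {
    return cap - in_flight;  // underflows to a huge value when in_flight > cap
}

// Guarded variant (also hypothetical): clamp the result to zero so that a
// large jump past the cap yields "no room" rather than a wrapped value.
constexpr std::uint32_t room_clamped(std::uint32_t cap, std::uint32_t in_flight) {
    return cap > in_flight ? cap - in_flight : 0u;
}
```

With a cap of 10 and 12 placement groups in flight, the unchecked version returns 4294967294, while the clamped version returns 0. A wrapped value like the former can then violate a later sanity check and abort the process, which would be consistent with the `abort()` seen under `DaemonServer::adjust_pgs()` in the reported backtrace.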
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Red Hat Ceph Storage 5.0 Bug Fix update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:4105
Description of problem:
All three MGR daemons crashed at the same time with the same abort message:
** Caught signal (Aborted) ** in thread 7f4117eb8700 thread_name:safe_timer

Version-Release number of selected component (if applicable):
RHCS 5 16.2.0-117.el8cp

     0> 2021-09-03T20:45:37.923+0000 7f4117eb8700 -1 *** Caught signal (Aborted) **
 in thread 7f4117eb8700 thread_name:safe_timer

 ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12b20) [0x7f43aca0bb20]
 2: gsignal()
 3: abort()
 4: /usr/bin/ceph-mgr(+0x154588) [0x55d650d08588]
 5: (DaemonServer::adjust_pgs()+0x3f04) [0x55d650dc0c94]
 6: (DaemonServer::tick()+0x103) [0x55d650dc5673]
 7: (Context::complete(int)+0xd) [0x55d650d50c4d]
 8: (SafeTimer::timer_thread()+0x1b7) [0x7f43adf0dc67]
 9: (SafeTimerThread::entry()+0x11) [0x7f43adf0f241]
 10: /lib64/libpthread.so.0(+0x814a) [0x7f43aca0114a]
 11: clone()