Description of problem:
upstart is too generous about restarting broken daemons

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start the cluster.
2. Mount the /var/log/ceph directory read-only on a mon node.
3. Watch ceph-mon repeatedly restart.

Actual results:
ceph-mon repeatedly restarts

Expected results:
ceph-mon repeatedly restarts for a bit, and then remains dead

Additional info:
Fix will be in non-RHEL Ceph v0.80.8.5
Verified; the fix works fine.

1. sudo pkill -9 -f 'ceph -i 0' (kills osd.0)
2. Wait for 30 seconds.
3. Check whether upstart restarts the daemon.

Repeat the above steps 2 more times, and upstart will then stop restarting the daemon. To bring osd.0 back up later, use "sudo start ceph-osd id=0".

upstart should not restart a daemon that has been killed more than 3 times within a 30-minute time frame.
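The restart policy verified in this comment can be modelled with a small sketch. This is only a toy illustration of a "give up after N respawns within a time window" rule, not upstart's actual implementation. The class name, method names, the sliding-window bookkeeping, and the exact boundary (three respawns allowed, the next death left dead) are assumptions made for illustration; the values 3 and 1800 seconds are taken from the "3 times within 30 minute" wording above.

```python
from collections import deque


class RespawnLimiter:
    """Toy model of a 'respawn at most COUNT times per INTERVAL' policy.

    Hypothetical helper for illustration only; not part of upstart or Ceph.
    """

    def __init__(self, count=3, interval=1800.0):
        self.count = count            # respawns allowed inside the window
        self.interval = interval      # window length in seconds (30 minutes)
        self.respawns = deque()       # timestamps of recent respawns
        self.stopped = False          # True once the supervisor has given up

    def on_exit(self, now):
        """Return True if the daemon that died at time `now` is respawned."""
        if self.stopped:
            return False
        # Forget respawns that fell out of the time window.
        while self.respawns and now - self.respawns[0] > self.interval:
            self.respawns.popleft()
        if len(self.respawns) >= self.count:
            # Too many respawns in the window: leave the daemon dead.
            self.stopped = True
            return False
        self.respawns.append(now)
        return True

    def manual_start(self):
        """Model an operator restart, e.g. 'sudo start ceph-osd id=0'."""
        self.respawns.clear()
        self.stopped = False


limiter = RespawnLimiter()
# Kill the daemon every 30 seconds, as in the verification steps.
results = [limiter.on_exit(t) for t in (0, 30, 60, 90)]
```

In this model the daemon stays down after the limit is hit until `manual_start()` is called, which mirrors the manual "sudo start ceph-osd id=0" step described above.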
If the fix doesn't work after upgrading from RH Ceph 1.2.3 to 1.2.3-2, or from 1.2.3 to 1.2.3-1 to 1.2.3-2, reboot the cluster once and retry.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2015:1572