Upstream's firefly patch is https://github.com/ceph/ceph/pull/5406, so we'll use that.
Steps to reproduce:

Set logging in ceph.conf:

    debug ms = 20
    debug osd = 20
    debug filestore = 20

Create an image and start the write workload:

    sudo rbd create image --size 1000000000
    sudo rbd bench-write image --io-threads 256 --io-size 4096 --io-total 1000000000000 2>&1 >/dev/n

While the workload is in progress [in the background]:

    sudo vi /root/.gdbinit

        set pagination off
        set target-async on
        set non-stop on

    ps -ef | grep ceph-osd    - look for pid of osd.0
    sudo gdb attach <pid of osd.0>

        b Log.cc:117
        c -a

Check for sighup and where it breaks.

    sudo ceph osd dump
    sudo ceph pg dump
    sudo ceph pg scrub <pg.id>

Watch for the objects to corrupt.

[ubuntu@magna016 ~]$ sudo ceph -s
    cluster 8c89dca4-2ad2-46f9-b38f-d8450a2c6e0a
     health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean; recovery 3920/34408 objects degraded (11
     monmap e1: 1 mons at {magna016=10.8.128.16:6789/0}, election epoch 2, quorum 0 magna016
     osdmap e37: 3 osds: 2 up, 2 in
      pgmap v1723: 193 pgs, 4 pools, 60233 MB data, 15733 objects
            106 GB used, 1745 GB / 1852 GB avail
            3920/34408 objects degraded (11.393%)
                 192 active+degraded
                   1 active+clean
  client io 2336 kB/s wr, 1168 op/s
Degraded objects aren't what you are looking for. What happened here is that the osd died. That might actually have been due to the bug corrupting something the osd then read back, or it might just be that the thread stopped by the gdb session eventually tripped a timeout that killed the osd. You'll have to try it again and keep the osd log.
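For the retry, something along these lines might help with keeping the osd log and with filtering status output for scrub errors rather than degraded counts. This is only a sketch: the helper names save_osd_log and inconsistent_lines are made up for illustration, the default log path /var/log/ceph/ceph-osd.0.log assumes a stock logging configuration, and /root/osd-logs is an arbitrary destination. It also assumes that scrub errors show up as "inconsistent" in the PG state, which is worth confirming on the running version.

```shell
#!/bin/sh
# Hypothetical helper: copy an OSD log to a timestamped file so it survives
# a daemon restart or log rotation.
#   $1 = log file to preserve (assumed default: /var/log/ceph/ceph-osd.0.log)
#   $2 = destination directory (assumed default: /root/osd-logs)
# Prints the path of the saved copy on success.
save_osd_log() {
    src="${1:-/var/log/ceph/ceph-osd.0.log}"
    dest_dir="${2:-/root/osd-logs}"
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Hypothetical helper: read status text on stdin and print only the lines
# that mention "inconsistent", ignoring degraded/unclean noise.
inconsistent_lines() {
    grep -i 'inconsistent'
}
```

Possible usage after the osd dies or after a scrub: run save_osd_log with no arguments to archive the osd.0 log, and pipe "sudo ceph pg dump" or "sudo ceph health detail" through inconsistent_lines to see whether any PG was flagged.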
For non-RHEL, the fix will be in the Ceph v0.80.8.4 packages.
works fine.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2015:1527