Bug 1223532 - PG::find_best_info exclude incomplete for max_last_epoch_started_found
Summary: PG::find_best_info exclude incomplete for max_last_epoch_started_found
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 1.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 1.3.1
Assignee: Samuel Just
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Duplicates: 1267129
Depends On:
Blocks:
 
Reported: 2015-05-20 19:12 UTC by Samuel Just
Modified: 2022-07-09 07:34 UTC
CC List: 9 users

Fixed In Version: ceph-0.94.2-3.el7cp
Doc Type: Bug Fix
Doc Text:
In certain scenarios, when all acting set OSDs for a placement group are restarted during backfill, the OSDs can fail to peer and the placement group will go incomplete. With this update, the OSDs now peer as expected.
Clone Of:
Environment:
Last Closed: 2015-11-23 20:21:04 UTC
Embargoed:


Attachments
osd.0 logs (6.85 MB, application/x-gzip)
2015-11-02 16:27 UTC, shylesh
osd.0 logs with osd_min_pg_log_entries set to 10 and osd_max_pg_log_entries set to 20 (2.80 MB, application/x-gzip)
2015-11-02 17:06 UTC, shylesh
osd.0 logs latest (8.83 MB, application/x-gzip)
2015-11-03 07:27 UTC, shylesh


Links
Ceph Project Bug Tracker 11687
Red Hat Issue Tracker RHCEPH-3497 (last updated 2022-02-21 18:37:25 UTC)
Red Hat Knowledge Base (Solution) 1982473
Red Hat Product Errata RHSA-2015:2066 (SHIPPED_LIVE) Moderate: Red Hat Ceph Storage 1.3.1 security, bug fix, and enhancement update (last updated 2015-11-24 02:34:55 UTC)
Red Hat Product Errata RHSA-2015:2512 (SHIPPED_LIVE) Moderate: Red Hat Ceph Storage 1.3.1 security, bug fix, and enhancement update (last updated 2016-02-03 03:15:52 UTC)

Description Samuel Just 2015-05-20 19:12:28 UTC
Description of problem:

PG::find_best_info minor bug

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

I do not think this is a particularly serious bug, and it should not hold up 1.3.0.  I do think it should go into an async release once it has marinated in master for a while.

Comment 2 Samuel Just 2015-07-14 18:13:56 UTC
This has not yet been merged to master (though the PR is tested and ready for review).

Comment 5 Ken Dreyer (Red Hat) 2015-07-15 23:51:30 UTC
I know QE is going to ask for this, so I'll ask it anyway :) Would you please fill in the "Steps to reproduce", "Actual results", and "Expected results"  bits from comment #1, so that Harish & team feel good about their ability to verify the fix? I'm pretty sure that's why they haven't ack'd the bug yet.

Comment 6 Harish NV Rao 2015-07-16 11:54:44 UTC
:-) Thanks Ken :-)

Comment 7 Samuel Just 2015-07-16 16:56:35 UTC
This is going to be nightmarish to reproduce.  You need
1) A pg which is going active with 1 backfill peer
2) The primary osd must be restarted just after it sends the activation message to its peers (replicas and backfill peer), but before it writes the activation to its journal.
3) Any acting set replicas (not the backfill peer) must also die before writing the activation information to their disks.
4) After revival, the backfill peer will be the only osd with the updated info.last_epoch_started value, which will cause peering to fail.

The only realistic way to do this would be to use gdb to insert breakpoints at the appropriate places in the code to stop the processes at the right place on the primaries and the replicas.  If you think this is worth doing, I'll dig up where you need to place breakpoints on the replicas and the primary.

I observed this failure once or twice in rados thrashing tests, and then ran my integration branch with this fix for a while without seeing it recur.  I also saw it happen to a customer earlier this week (though I think they were using size=2 and min_size=1).
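
The symptom, for anyone trying to spot this in the field, is the pg staying incomplete after the osds come back. A minimal check from the shell (a sketch; 1.0 is just an example pgid, it happens to be the one from the repro notes later in this bug, so substitute the affected pg):

# look for the pg stuck incomplete after the osds are revived
sudo ceph health detail | grep -i incomplete
# the recovery_state section of the query output shows the peering history for that pg
sudo ceph pg 1.0 query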

Comment 8 Harish NV Rao 2015-07-20 11:44:42 UTC
>>The only realistic way to do this would be to use gdb to insert breakpoints at the appropriate places in the code to stop the processes at the right place on the primaries and the replicas.  If you think this is worth doing, I'll dig up where you need to place breakpoints on the replicas and the primary

Will there be a 'private binary' to test with the above approach? Can you please share all the details required for running the test with gdb breakpoints?

Comment 9 Samuel Just 2015-07-22 22:15:14 UTC
No private binary should be necessary.  The plan should be:
1) Create a 3-osd cluster with exactly one pool (size 2, min_size 1) containing exactly 1 pg.  Create a crush rule so that the pg will map to 0, 1, and then 2*.  That way, the pg will map to [0,1].  Then, when 1 is set down, it will switch to [0,2].
2) Ensure that the pg log is kept very short by setting osd_min_pg_log_entries and osd_max_pg_log_entries on the osds to 5 and 10.
3) Write 1 byte objects to the pool for 100 seconds.
4) Wait 100 seconds.
5) Set a breakpoint on osd.0 at the end of PG::activate (in PG.cc) using gdb.  You'll probably have to use git to check out the exact version of the code used to build the package you are using and get the line number for the last line of that method.
6) Kill osd.1
7) Wait for osd.0 to hit the break point.
8) Kill osd.0
9) Wait 1 minute.
10) Revive osd.0

Without the fix, the pg should fail to go active.  With the fix, it should resume backfilling and then go clean.

* You can accomplish this by creating a special crush rule like:

step take 0
emit
step take 1
emit
step take 2
emit
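
To actually use a rule like that, the same crushtool workflow as in the notes further down (comment 13) applies; a sketch, with arbitrary file names:

# dump and decompile the current crush map
sudo ceph osd getcrushmap -o /tmp/cmap
crushtool -d /tmp/cmap -o /tmp/cmap.crush
# edit /tmp/cmap.crush so the pool's rule contains the step take / emit sequence
# above, then recompile and inject the new map
crushtool -c /tmp/cmap.crush -o /tmp/cmap.new
sudo ceph osd setcrushmap -i /tmp/cmap.new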

Comment 10 Samuel Just 2015-07-22 22:22:57 UTC
You'll have to use gdb non-stop mode in step 7 to allow the message sending thread to actually send the message while the thread is stopped:

http://sourceware.org/gdb/current/onlinedocs/gdb/Non_002dStop-Mode.html

Comment 11 Samuel Just 2015-07-22 22:44:49 UTC
Ugh, killing osd.1 isn't quite enough.  You'll have to inject a new crush map as well, removing the "step take 1; emit" piece, to get it to remap to 2.

Comment 12 Samuel Just 2015-07-22 23:18:43 UTC
I needed

set pagination off
set target-async on
set non-stop on

to make gdb non-stop mode work.  Note that after attaching to the process you need to do continue -a to make *all* of the threads restart (after setting the breakpoint of course).  The crush rule I used was:

rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take osd.0
        step emit
        step take osd.1
        step emit
        step take osd.2
        step emit
        step take osd.3
        step emit
}
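
Putting the gdb pieces from comments 10 and 12 together, the settings can also be passed on the command line instead of .gdbinit; a rough sketch (untested here, and it assumes a gdb new enough for -iex, that the osd.0 command line contains "ceph-osd -i 0", and that you have already looked up <line_number> as described in comment 9):

# non-stop mode has to be enabled before attaching, hence -iex rather than -ex
sudo gdb -iex 'set pagination off' \
         -iex 'set target-async on' \
         -iex 'set non-stop on' \
         -p "$(pgrep -f 'ceph-osd -i 0')" \
         -ex 'break PG.cc:<line_number>' \
         -ex 'continue -a'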

Comment 13 Samuel Just 2015-07-23 00:28:37 UTC
I was able to reproduce on current hammer (which does not yet have the fix), and not on current master (which does).  Here are more notes.  Note: this is *not* a script.  At the least, the pids are from this particular run.  Also, it's pretty finicky; it took me a couple of tries for each of them.

# create pool
sudo ceph osd pool create test 1 1
sudo ceph osd pool set test min_size 1
sudo ceph osd pool set test size 2 

# create /tmp/cmap2 (maps to [0,1]) and /tmp/cmap3 (maps to [0,2])
sudo ceph osd getcrushmap -o /tmp/cmap
crushtool -d /tmp/cmap -o /tmp/cmap.crush
vim /tmp/cmap.crush
crushtool -c /tmp/cmap.crush -o /tmp/cmap2
vim /tmp/cmap.crush
crushtool -c /tmp/cmap.crush -o /tmp/cmap2
vim /tmp/cmap.crush
crushtool -c /tmp/cmap.crush -o /tmp/cmap3
sudo ceph osd setcrushmap -i /tmp/cmap2

# populate pool
sudo rados -p test bench 10 write

# at this point, in another window, start a gdb session with 
# set pagination off
# set target-async on
# set non-stop on
# in your .gdbinit
# and attach to the osd.0 process and create a breakpoint at
# the end of PG.cc::activate().  You should do ceph-osd -v
# and use git to checkout that exact version to get the
# correct line number.  
#
# You can set the breakpoint with
# break PG.cc:<line_number>
# 
# and then resume the process with
# continue -a

# kill osd 1
ps aux | grep ceph-osd
sudo kill -9 10083
sudo ceph osd down 1 
sudo ceph -s

# remap to the new mapping
sudo ceph osd setcrushmap -i /tmp/cmap3

# wait for osd.0 to trigger the break point, then wait 10s longer
sudo kill -9 7188
sudo ceph osd down 0

# wait for the pg to go down now that 0,1 are both dead
sudo ceph -s
sudo ceph -s
sudo ceph -s

# restart osd 0
sudo ceph-osd -i 0

# wait to go clean
sudo ceph -w

If you run 

ubuntu@test1:/var/log/ceph$ grep 'pg\[1.0(.*calc_acting' ceph-osd.0.log

you can confirm that the situation was created correctly if you see something
like the following towards the end:

--- interval 1
2015-07-23 00:10:05.015982 7fa814977700 10 osd.0 pg_epoch: 37 pg[1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/37/37) [0,2] r=0 lpr=37 pi=29-36/4 crt=11'547 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/37/37)
2015-07-23 00:10:05.016026 7fa814977700 10 osd.0 pg_epoch: 37 pg[1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/37/37) [0,2] r=0 lpr=37 pi=29-36/4 crt=11'547 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 1.0( v 11'550 (11'545,11'550] lb MIN local-les=33 n=0 ec=7 les/c 30/30 37/37/37)
2015-07-23 00:10:05.016079 7fa814977700 10 osd.0 pg_epoch: 37 pg[1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/37/37) [0,2] r=0 lpr=37 pi=29-36/4 crt=11'547 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/37/37)
--- interval 2
2015-07-23 00:10:06.019639 7fa814977700 10 osd.0 pg_epoch: 38 pg[1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/38/37) [0,2]/[0] r=0 lpr=38 pi=29-37/5 crt=11'547 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting osd.0 1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/38/37)
2015-07-23 00:10:06.019648 7fa814977700 10 osd.0 pg_epoch: 38 pg[1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/38/37) [0,2]/[0] r=0 lpr=38 pi=29-37/5 crt=11'547 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting osd.2 1.0( v 11'550 (11'545,11'550] lb MIN local-les=33 n=0 ec=7 les/c 30/30 37/38/37)
2015-07-23 00:10:06.019670 7fa814977700 10 osd.0 pg_epoch: 38 pg[1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/38/37) [0,2]/[0] r=0 lpr=38 pi=29-37/5 crt=11'547 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting newest update on osd.0 with 1.0( v 11'550 (11'480,11'550] local-les=30 n=0 ec=7 les/c 30/30 37/38/37)
2015-07-23 00:10:07.045110 7fa814176700 10 osd.0 pg_epoch: 39 pg[1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/30 37/38/37) [0,2]/[0] r=0 lpr=38 pi=29-37/5 bft=2 crt=11'547 lcod 0'0 mlcod 0'0 active+undersized+remapped] calc_acting osd.0 1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/30 37/38/37)
2015-07-23 00:10:07.045141 7fa814176700 10 osd.0 pg_epoch: 39 pg[1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/30 37/38/37) [0,2]/[0] r=0 lpr=38 pi=29-37/5 bft=2 crt=11'547 lcod 0'0 mlcod 0'0 active+undersized+remapped] calc_acting osd.2 1.0( v 11'550 (11'545,11'550] local-les=39 n=0 ec=7 les/c 39/30 37/38/37)
2015-07-23 00:10:07.045167 7fa814176700 10 osd.0 pg_epoch: 39 pg[1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/30 37/38/37) [0,2]/[0] r=0 lpr=38 pi=29-37/5 bft=2 crt=11'547 lcod 0'0 mlcod 0'0 active+undersized+remapped] calc_acting newest update on osd.0 with 1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/30 37/38/37)
--- interval 3
2015-07-23 00:10:08.048369 7fa814176700 10 osd.0 pg_epoch: 40 pg[1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/39 37/40/37) [0,2] r=0 lpr=40 pi=38-39/1 crt=11'547 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/39 37/40/37)
2015-07-23 00:10:08.048382 7fa814176700 10 osd.0 pg_epoch: 40 pg[1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/39 37/40/37) [0,2] r=0 lpr=40 pi=38-39/1 crt=11'547 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 1.0( v 11'550 (11'545,11'550] local-les=33 n=0 ec=7 les/c 39/39 37/40/37)
2015-07-23 00:10:08.048405 7fa814176700 10 osd.0 pg_epoch: 40 pg[1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/39 37/40/37) [0,2] r=0 lpr=40 pi=38-39/1 crt=11'547 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 1.0( v 11'550 (11'480,11'550] local-les=39 n=0 ec=7 les/c 39/39 37/40/37)

interval 1 is the first interval after we restarted osd.0.  Note that local-les
for osd.2 is 33 while it is 30 for osd.0.  If you are running a version without
interval 1 is the first interval after we restarted osd.0.  Note that local-les
master where it is fixed, so we see interval 2 with a pg_temp of [0] to allow
backfill (two calls to calc_acting in that one), then a third interval after
backfill has completed.  After that, it goes active+clean later in the log.

Comment 14 Samuel Just 2015-07-23 00:29:48 UTC
Oops, last paragraph was not pasted correctly.

interval 1 is the first interval after we restarted osd.0.  Note that local-les
for osd.2 is 33 while it is 30 for osd.0.  If you are running a version without
the fix, peering will then fail and the pg will go down.  This output was from
master where it is fixed, so we see interval 2 with a pg_temp of [0] to allow
backfill (two calls to calc_acting in that one), then a third interval after
backfill has completed.  After that, it goes active+clean later in the log.

Comment 15 Ken Dreyer (Red Hat) 2015-08-04 21:16:37 UTC
Commit to backport: https://github.com/ceph/ceph/pull/5269/commits

Comment 18 Vikhyat Umrao 2015-10-07 15:24:02 UTC
*** Bug 1267129 has been marked as a duplicate of this bug. ***

Comment 19 shylesh 2015-11-02 07:04:12 UTC
Hi Sam,

I followed comment 13 and verified this bug.

Here is the output from the osd logs during peering
============================================

2015-10-31 13:37:20.565331 7f1d0654b700 10 osd.0 pg_epoch: 465 pg[4.0( v 444'856 (0'0,444'856] local-les=461 n=0 ec=361 les/c 464/464 465/465/465) [0,2] r=0 lpr=465 pi=460-464/2 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 444'856 (0'0,444'856] local-les=461 n=0 ec=361 les/c 464/464 465/465/465)
2015-10-31 13:37:20.565356 7f1d0654b700 10 osd.0 pg_epoch: 465 pg[4.0( v 444'856 (0'0,444'856] local-les=461 n=0 ec=361 les/c 464/464 465/465/465) [0,2] r=0 lpr=465 pi=460-464/2 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 4.0( v 444'856 (0'0,444'856] local-les=464 n=0 ec=361 les/c 464/464 465/465/465)
2015-10-31 13:37:20.565397 7f1d0654b700 10 osd.0 pg_epoch: 465 pg[4.0( v 444'856 (0'0,444'856] local-les=461 n=0 ec=361 les/c 464/464 465/465/465) [0,2] r=0 lpr=465 pi=460-464/2 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.2 with 4.0( v 444'856 (0'0,444'856] local-les=464 n=0 ec=361 les/c 464/464 465/465/465)
2015-10-31 13:37:43.645127 7f1d0654b700 10 osd.0 pg_epoch: 467 pg[4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465) [0,1] r=0 lpr=467 pi=465-466/1 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465)
2015-10-31 13:37:43.645145 7f1d0654b700 10 osd.0 pg_epoch: 467 pg[4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465) [0,1] r=0 lpr=467 pi=465-466/1 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting osd.1 4.0( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0)
2015-10-31 13:37:43.645159 7f1d0654b700 10 osd.0 pg_epoch: 467 pg[4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465) [0,1] r=0 lpr=467 pi=465-466/1 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465)
2015-10-31 13:37:43.645194 7f1d0654b700 10 osd.0 pg_epoch: 467 pg[4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465) [0,1] r=0 lpr=467 pi=465-466/1 crt=444'853 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 444'856 (0'0,444'856] local-les=466 n=0 ec=361 les/c 466/466 467/467/465)
2015-10-31 13:42:27.381326 7f1d06d4c700 10 osd.0 pg_epoch: 484 pg[4.0( v 444'856 (0'0,444'856] local-les=468 n=0 ec=361 les/c 479/479 482/482/482) [0,2] r=0 lpr=484 pi=467-481/10 crt=444'853 lcod 0'0 mlcod 0'0 down+peering] calc_acting osd.0 4.0( v 444'856 (0'0,444'856] local-les=468 n=0 ec=361 les/c 479/479 482/482/482)
2015-10-31 13:42:27.381343 7f1d06d4c700 10 osd.0 pg_epoch: 484 pg[4.0( v 444'856 (0'0,444'856] local-les=468 n=0 ec=361 les/c 479/479 482/482/482) [0,2] r=0 lpr=484 pi=467-481/10 crt=444'853 lcod 0'0 mlcod 0'0 down+peering] calc_acting osd.1 4.0( v 470'1214 (0'0,470'1214] local-les=479 n=0 ec=361 les/c 479/479 482/482/482)
2015-10-31 13:42:27.381359 7f1d06d4c700 10 osd.0 pg_epoch: 484 pg[4.0( v 444'856 (0'0,444'856] local-les=468 n=0 ec=361 les/c 479/479 482/482/482) [0,2] r=0 lpr=484 pi=467-481/10 crt=444'853 lcod 0'0 mlcod 0'0 down+peering] calc_acting osd.2 4.0( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0)
2015-10-31 13:42:27.381392 7f1d06d4c700 10 osd.0 pg_epoch: 484 pg[4.0( v 444'856 (0'0,444'856] local-les=468 n=0 ec=361 les/c 479/479 482/482/482) [0,2] r=0 lpr=484 pi=467-481/10 crt=444'853 lcod 0'0 mlcod 0'0 down+peering] calc_acting newest update on osd.1 with 4.0( v 470'1214 (0'0,470'1214] local-les=479 n=0 ec=361 les/c 479/479 482/482/482)
2015-10-31 13:42:36.980522 7f1d06d4c700 10 osd.0 pg_epoch: 485 pg[4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482) [0,1] r=0 lpr=485 pi=482-484/1 crt=470'1214 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482)
2015-10-31 13:42:36.980541 7f1d06d4c700 10 osd.0 pg_epoch: 485 pg[4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482) [0,1] r=0 lpr=485 pi=482-484/1 crt=470'1214 lcod 0'0 mlcod 0'0 peering] calc_acting osd.1 4.0( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0)
2015-10-31 13:42:36.980556 7f1d06d4c700 10 osd.0 pg_epoch: 485 pg[4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482) [0,1] r=0 lpr=485 pi=482-484/1 crt=470'1214 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482)
2015-10-31 13:42:36.980590 7f1d06d4c700 10 osd.0 pg_epoch: 485 pg[4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482) [0,1] r=0 lpr=485 pi=482-484/1 crt=470'1214 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 470'1214 (0'0,470'1214] local-les=484 n=0 ec=361 les/c 484/484 485/485/482)
2015-10-31 13:45:20.691977 7f1d0654b700 10 osd.0 pg_epoch: 489 pg[4.0( v 486'1498 (0'0,486'1498] local-les=486 n=0 ec=361 les/c 488/488 489/489/489) [0,1] r=0 lpr=489 pi=485-488/2 crt=486'1487 lcod 486'1497 mlcod 0'0 peering] calc_acting osd.0 4.0( v 486'1498 (0'0,486'1498] local-les=486 n=0 ec=361 les/c 488/488 489/489/489)
2015-10-31 13:45:20.691995 7f1d0654b700 10 osd.0 pg_epoch: 489 pg[4.0( v 486'1498 (0'0,486'1498] local-les=486 n=0 ec=361 les/c 488/488 489/489/489) [0,1] r=0 lpr=489 pi=485-488/2 crt=486'1487 lcod 486'1497 mlcod 0'0 peering] calc_acting osd.1 4.0( v 488'1534 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 489/489/489)
2015-10-31 13:45:20.692029 7f1d0654b700 10 osd.0 pg_epoch: 489 pg[4.0( v 486'1498 (0'0,486'1498] local-les=486 n=0 ec=361 les/c 488/488 489/489/489) [0,1] r=0 lpr=489 pi=485-488/2 crt=486'1487 lcod 486'1497 mlcod 0'0 peering] calc_acting newest update on osd.1 with 4.0( v 488'1534 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 489/489/489)
2015-10-31 13:45:24.269829 7f1e8b4cb700 10 osd.0 pg_epoch: 492 pg[4.0( v 488'1534 lc 486'1498 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 492/492/492) [0,1] r=0 lpr=492 pi=485-491/4 crt=488'1534 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 488'1534 lc 486'1498 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 492/492/492)
2015-10-31 13:45:24.269847 7f1e8b4cb700 10 osd.0 pg_epoch: 492 pg[4.0( v 488'1534 lc 486'1498 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 492/492/492) [0,1] r=0 lpr=492 pi=485-491/4 crt=488'1534 lcod 0'0 mlcod 0'0 peering] calc_acting osd.1 4.0( v 488'1534 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 492/492/492)
2015-10-31 13:45:24.269888 7f1e8b4cb700 10 osd.0 pg_epoch: 492 pg[4.0( v 488'1534 lc 486'1498 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 492/492/492) [0,1] r=0 lpr=492 pi=485-491/4 crt=488'1534 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 488'1534 lc 486'1498 (0'0,488'1534] local-les=488 n=0 ec=361 les/c 488/488 492/492/492)
2015-10-31 13:46:31.218983 7f1e8b4cb700 10 osd.0 pg_epoch: 494 pg[4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 494/494/492) [0] r=0 lpr=494 pi=492-493/1 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 494/494/492)
2015-10-31 13:46:31.219014 7f1e8b4cb700 10 osd.0 pg_epoch: 494 pg[4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 494/494/492) [0] r=0 lpr=494 pi=492-493/1 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 494/494/492)
2015-10-31 13:46:32.304289 7f1e8acca700 10 osd.0 pg_epoch: 495 pg[4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 495/495/492) [0,1] r=0 lpr=495 pi=492-494/2 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 495/495/492)
2015-10-31 13:46:32.304312 7f1e8acca700 10 osd.0 pg_epoch: 495 pg[4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 495/495/492) [0,1] r=0 lpr=495 pi=492-494/2 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.1 4.0( v 493'1930 (0'0,493'1930] local-les=493 n=396 ec=361 les/c 493/493 495/495/492)
2015-10-31 13:46:32.304349 7f1e8acca700 10 osd.0 pg_epoch: 495 pg[4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 495/495/492) [0,1] r=0 lpr=495 pi=492-494/2 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=493 n=411 ec=361 les/c 493/493 495/495/492)
2015-10-31 13:48:38.825532 7f1e8acca700 10 osd.0 pg_epoch: 498 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 498/498/492) [0,1] r=0 lpr=498 pi=492-497/4 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 498/498/492)
2015-10-31 13:48:38.825563 7f1e8acca700 10 osd.0 pg_epoch: 498 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 498/498/492) [0,1] r=0 lpr=498 pi=492-497/4 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.1 4.0( v 493'1930 (0'0,493'1930] local-les=493 n=396 ec=361 les/c 493/493 498/498/492)
2015-10-31 13:48:38.825620 7f1e8acca700 10 osd.0 pg_epoch: 498 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 498/498/492) [0,1] r=0 lpr=498 pi=492-497/4 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 498/498/492)
2015-10-31 13:48:38.828566 7f1e8b4cb700 10 osd.0 pg_epoch: 499 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 499/499/492) [0] r=0 lpr=499 pi=492-498/5 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 499/499/492)
2015-10-31 13:48:38.828595 7f1e8b4cb700 10 osd.0 pg_epoch: 499 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 499/499/492) [0] r=0 lpr=499 pi=492-498/5 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 499/499/492)
2015-10-31 13:49:30.015364 7f1e8acca700 10 osd.0 pg_epoch: 503 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 501/501/492) [0,2] r=0 lpr=503 pi=492-500/7 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 501/501/492)
2015-10-31 13:49:30.015383 7f1e8acca700 10 osd.0 pg_epoch: 503 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 501/501/492) [0,2] r=0 lpr=503 pi=492-500/7 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting osd.2 4.0( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0)
2015-10-31 13:49:30.015427 7f1e8acca700 10 osd.0 pg_epoch: 503 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 501/501/492) [0,2] r=0 lpr=503 pi=492-500/7 crt=493'1928 lcod 493'1928 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 501/501/492)
2015-10-31 13:49:58.684334 7f11b1156700 10 osd.0 pg_epoch: 505 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 505/505/505) [0,2] r=0 lpr=505 pi=492-504/9 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 505/505/505)
2015-10-31 13:49:58.684356 7f11b1156700 10 osd.0 pg_epoch: 505 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 505/505/505) [0,2] r=0 lpr=505 pi=492-504/9 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 4.0( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0)
2015-10-31 13:49:58.684396 7f11b1156700 10 osd.0 pg_epoch: 505 pg[4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 505/505/505) [0,2] r=0 lpr=505 pi=492-504/9 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=496 n=411 ec=361 les/c 493/493 505/505/505)
2015-10-31 13:50:06.108160 7fa00a7f3700 10 osd.0 pg_epoch: 508 pg[4.0( v 493'1945 (0'0,493'1945] local-les=506 n=411 ec=361 les/c 506/493 508/508/508) [0,2] r=0 lpr=508 pi=492-507/11 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=506 n=411 ec=361 les/c 506/493 508/508/508)
2015-10-31 13:50:06.108183 7fa00a7f3700 10 osd.0 pg_epoch: 508 pg[4.0( v 493'1945 (0'0,493'1945] local-les=506 n=411 ec=361 les/c 506/493 508/508/508) [0,2] r=0 lpr=508 pi=492-507/11 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 4.0( v 493'1945 lc 493'1615 (0'0,493'1945] local-les=506 n=411 ec=361 les/c 506/493 508/508/508)
2015-10-31 13:50:06.108226 7fa00a7f3700 10 osd.0 pg_epoch: 508 pg[4.0( v 493'1945 (0'0,493'1945] local-les=506 n=411 ec=361 les/c 506/493 508/508/508) [0,2] r=0 lpr=508 pi=492-507/11 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 4.0( v 493'1945 (0'0,493'1945] local-les=506 n=411 ec=361 les/c 506/493 508/508/508)
2015-10-31 13:51:39.126852 7fd7bc58c700 10 osd.0 pg_epoch: 512 pg[4.0( v 493'1945 (0'0,493'1945] local-les=509 n=411 ec=361 les/c 511/493 512/512/512) [0,2] r=0 lpr=512 pi=492-511/13 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting osd.0 4.0( v 493'1945 (0'0,493'1945] local-les=509 n=411 ec=361 les/c 511/493 512/512/512)
2015-10-31 13:51:39.126871 7fd7bc58c700 10 osd.0 pg_epoch: 512 pg[4.0( v 493'1945 (0'0,493'1945] local-les=509 n=411 ec=361 les/c 511/493 512/512/512) [0,2] r=0 lpr=512 pi=492-511/13 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 4.0( v 493'1945 lc 493'1677 (0'0,493'1945] local-les=511 n=411 ec=361 les/c 511/493 512/512/512)
2015-10-31 13:51:39.126912 7fd7bc58c700 10 osd.0 pg_epoch: 512 pg[4.0( v 493'1945 (0'0,493'1945] local-les=509 n=411 ec=361 les/c 511/493 512/512/512) [0,2] r=0 lpr=512 pi=492-511/13 crt=493'1928 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.2 with 4.0( v 493'1945 lc 493'1677 (0'0,493'1945] local-les=511 n=411 ec=361 les/c 511/493 512/512/512)



If you can confirm that this is the expected output then I will move this bug to verified.

Comment 20 Samuel Just 2015-11-02 16:00:39 UTC
Can you just attach the log from osd.0?

Comment 21 Samuel Just 2015-11-02 16:26:44 UTC
Ok, you need to do it again with

osd_min_pg_log_entries set to 10 and
osd_max_pg_log_entries set to 20

in ceph.conf with a new cluster before you start.  Sorry, I missed that in the notes.
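
Concretely, that means something like the following has to be in ceph.conf before the osds of the fresh cluster are started for the first time (a sketch; it assumes the default /etc/ceph/ceph.conf path and that the options belong in the [osd] section):

# shorten the pg log so a restarted/remapped osd has to backfill rather than log-recover
cat <<'EOF' | sudo tee -a /etc/ceph/ceph.conf
[osd]
osd_min_pg_log_entries = 10
osd_max_pg_log_entries = 20
EOF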

Comment 22 shylesh 2015-11-02 16:27:09 UTC
Created attachment 1088629 [details]
osd.0 logs

Comment 23 shylesh 2015-11-02 17:06:53 UTC
Created attachment 1088647 [details]
osd.0 logs with osd_min_pg_log_entries set to 10 and osd_max_pg_log_entries set to 20

Comment 24 shylesh 2015-11-02 17:12:05 UTC
Sam, I have attached the second log file, run with osd_min_pg_log_entries set to 10 and osd_max_pg_log_entries set to 20.
Please run grep 'pg\[5.0(.*calc_acting' ceph-osd.0.log

Comment 25 Samuel Just 2015-11-02 19:00:48 UTC
It looks like osd.2 finished backfill, which suggests that you didn't create the breakpoint correctly on osd.0.  At what line did you place the breakpoint?

Comment 26 shylesh 2015-11-03 06:13:21 UTC
I put the break point at PG.cc:1459


1459 void PG::activate(ObjectStore::Transaction& t,
1460                   epoch_t activation_epoch,



ubuntu@magna109:~/ceph$ git branch
* (detached from v0.94.3.2)
  master
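
For reference, one way to pin the breakpoint line against the exact build (a sketch; it assumes a clone of the upstream ceph repository and that the tag matches what ceph-osd -v reports):

# check out the exact version the running binary was built from
git checkout v0.94.3.2
# find the PG::activate definition, then open the file there and note the line of
# the function's closing brace -- Sam's instructions ask for the end of the method,
# not the declaration -- and use that line in "break PG.cc:<line_number>"
grep -n 'void PG::activate' src/osd/PG.cc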

Comment 27 shylesh 2015-11-03 07:27:30 UTC
Created attachment 1088816 [details]
osd.0 logs latest

Comment 28 shylesh 2015-11-03 07:28:32 UTC
Hi Sam, 

I have attached the new logs for osd.0 
please use grep 'pg\[7.0(.*calc_acting' ceph-osd.0.log

Comment 29 shylesh 2015-11-03 19:44:00 UTC
Verified this bug with the help of Sam.

1. Extra parameters to be added to ceph.conf:
   osd_min_pg_log_entries = 10
   osd_max_pg_log_entries = 20

2. There has to be exactly one PG in the pool.
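
A quick way to double-check both preconditions on the running cluster before starting the run (a sketch; "test" is the pool name from comment 13, substitute your own):

# confirm the shortened pg log settings took effect on the osd
sudo ceph daemon osd.0 config show | grep pg_log_entries
# confirm the pool really has exactly one placement group
sudo ceph osd pool get test pg_num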

Comment 31 errata-xmlrpc 2015-11-23 20:21:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2512

Comment 32 Siddharth Sharma 2015-11-23 21:53:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2066

