Bug 1363807

Summary: Upgrade crashes OSD in void FileStore::init_temp_collections()
Product: Red Hat Ceph Storage Reporter: David Zafman <dzafman>
Component: RADOS Assignee: David Zafman <dzafman>
Status: CLOSED ERRATA QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 2.0 CC: ceph-eng-bugs, ceph-qe-bugs, dzafman, hnallurv, kchai, kdreyer, rgowdege, tserlin
Target Milestone: rc   
Target Release: 2.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: RHEL: ceph-10.2.2-35.el7cp Ubuntu: ceph_10.2.2-27redhat1xenial Doc Type: If docs needed, set a value
Last Closed: 2016-08-23 19:45:36 UTC Type: Bug

Description David Zafman 2016-08-03 15:39:27 UTC
Description of problem:

When OSDs restart after an upgrade from 1.3.x to 2.0, they can crash almost immediately.  This can happen any number of times, but eventually the OSD starts normally.

How reproducible:

It should happen upon upgrade, but I haven't determined why we aren't seeing this in our testing.

Comment 5 David Zafman 2016-08-05 15:53:04 UTC
To test the fix:

1. Bring up a Jewel cluster using filestore
2. Create a pool with size 1
3. Write a bunch of data to the pool
4. Stop the cluster
5. Find an OSD */current directory which doesn't include a pg_head for a valid PG
6. Create a pg_TEMP for that PG (e.g. mkdir ....current/1.5_TEMP)
7. Find any other OSD current directory with a pg_head without a pg_TEMP
8. Rename the pg_head to pg_TEMP (e.g. mv ....current/1.0_head ...current/1.0_TEMP)
9. Start the cluster

The 2 OSDs that were manipulated should NOT crash
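
Steps 5 through 8 can be sketched in Python as below. This is only an illustration, not part of the fix: the helper names are hypothetical, and it assumes PG directories directly under an OSD's current directory are named <pgid>_head and <pgid>_TEMP as in the examples above. Use it only on a stopped test cluster.

```python
import os


def heads_without_temp(current_dir):
    """Step 7: PG ids that have a pg_head but no pg_TEMP in current_dir."""
    entries = set(os.listdir(current_dir))
    return sorted(
        name[: -len("_head")]
        for name in entries
        if name.endswith("_head")
        and name[: -len("_head")] + "_TEMP" not in entries
    )


def make_stray_temp(current_dir, pgid):
    """Steps 5-6: create an empty pg_TEMP for a PG with no pg_head here."""
    if os.path.exists(os.path.join(current_dir, pgid + "_head")):
        raise ValueError("pg_head already exists for " + pgid)
    os.mkdir(os.path.join(current_dir, pgid + "_TEMP"))


def head_to_temp(current_dir, pgid):
    """Step 8: rename pg_head to pg_TEMP (assumes no pg_TEMP exists yet)."""
    os.rename(
        os.path.join(current_dir, pgid + "_head"),
        os.path.join(current_dir, pgid + "_TEMP"),
    )
```

The caller supplies the path to each OSD's current directory and picks the PG ids; nothing here discovers which PGs are valid in the cluster.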

Comment 6 David Zafman 2016-08-05 18:09:32 UTC
We now fully understand the bug in Hammer which triggers this bug in later releases.

An OSD is affected if it meets all of the following criteria:

1. The OSD has been marked out and then back in, or has had PG(s) pushed to it due to a different map change.  This creates some pg_TEMP dirs
2. The pg_num/pgp_num is increased, causing PGs to split, including one with a pg_TEMP
3. The OSD is NOT restarted prior to upgrade

Workarounds:

1. Restart all OSDs before installing the upgrade
2. With the old OSD stopped and before starting the upgraded OSD, manually search for and rmdir all pg_TEMP directories without a corresponding pg_head directory (scripting this would be helpful)
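
A minimal sketch of such a script follows. It is an illustration under assumptions, not a tested tool: the function names are hypothetical, and it assumes PG directories sit directly under the OSD's current directory and are named <pgid>_TEMP and <pgid>_head. It defaults to a dry run, and os.rmdir only removes empty directories, so a non-empty pg_TEMP is reported and left alone.

```python
import os


def find_orphan_temp_dirs(current_dir):
    """Return pg_TEMP dir names in current_dir with no matching pg_head."""
    entries = set(os.listdir(current_dir))
    orphans = []
    for name in sorted(entries):
        if not name.endswith("_TEMP"):
            continue
        if not os.path.isdir(os.path.join(current_dir, name)):
            continue
        pgid = name[: -len("_TEMP")]
        if pgid + "_head" not in entries:
            orphans.append(name)
    return orphans


def remove_orphan_temp_dirs(current_dir, dry_run=True):
    """rmdir each orphan pg_TEMP; return the names actually removed."""
    removed = []
    for name in find_orphan_temp_dirs(current_dir):
        path = os.path.join(current_dir, name)
        if dry_run:
            print("would rmdir", path)
            continue
        try:
            os.rmdir(path)  # fails on non-empty directories
            removed.append(name)
        except OSError as exc:
            print("skipped", path, exc)
    return removed
```

Run it once per OSD, with that OSD stopped, passing the path to its current directory; review the dry-run output before calling it again with dry_run=False.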

I would say that this isn't a blocker.  The upstream fix is ready to merge, having passed Rados suite testing.  To avoid customers running into this, I advise including it in 2.0 if it isn't too late.

Comment 7 David Zafman 2016-08-05 19:52:37 UTC
Fix merged to Upstream Jewel branch: https://github.com/ceph/ceph/pull/10561

Comment 17 rakesh 2016-08-10 12:15:12 UTC
Accidentally moved to VERIFIED; changing it back to ON_QA.

Comment 18 rakesh 2016-08-11 14:17:35 UTC
I followed the steps mentioned in comment 5. After restarting the modified OSDs, they did not crash. Here are the steps that I followed:

1. I selected PG 5.0, which exists on OSDs [3,1,2], and created a pg_head on osd.8, where it does not exist ( /current/5.0_head ).

2. Then I selected another OSD, osd.7, and searched for a PG without a pg_TEMP. There were none; every pg_head had an associated pg_TEMP.

3. So I selected pg 6.0_head and deleted its 6.0_TEMP on that OSD, then renamed 6.0_head to 6.0_TEMP. The original 6.0_TEMP was empty before I deleted it.

4. I restarted the cluster and did some I/O, and none of the OSDs crashed.

I am moving this bug to the VERIFIED state.

Comment 20 errata-xmlrpc 2016-08-23 19:45:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.