Bug 1341859 - setting sortbitwise after jewel upgrade can cause unfound objects
Summary: setting sortbitwise after jewel upgrade can cause unfound objects
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RADOS
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Target Release: 2.0
Assignee: Samuel Just
QA Contact: Vasu Kulkarni
Depends On:
Blocks: 1343229
Reported: 2016-06-01 22:17 UTC by Samuel Just
Modified: 2017-07-30 15:15 UTC

Fixed In Version: ceph-10.2.2-1.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2016-08-23 19:40:38 UTC
Target Upstream Version:

System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 16113 | 0 | None | None | None | 2016-06-01 22:17:47 UTC
Red Hat Product Errata | RHBA-2016:1755 | 0 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 2.0 bug fix and enhancement update | 2016-08-23 23:23:52 UTC

Description Samuel Just 2016-06-01 22:17:47 UTC
Description of problem:

A check in PG.cc relies on the comparison last_backfill == hobject_t::get_max().  Older versions could produce last_backfill values that have the max bit set but do not compare equal to get_max().  As a result, recent writes can appear unrecoverable even though they are recoverable.
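
To illustrate the failure mode, here is a minimal C++ sketch. It does not use the real hobject_t: ObjectId, its fields, backfill_complete_fragile(), backfill_complete_robust() and legacy_last_backfill() are all hypothetical names, and the leftover hash in the legacy value is only an assumption about how such a value might differ from get_max(). It shows why an equality test against the canonical maximum can reject a value that a flag-only test accepts.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical, heavily simplified stand-in for Ceph's hobject_t.
// Field names are illustrative, not the real class layout.
struct ObjectId {
    std::string name;        // object name
    std::uint32_t hash = 0;  // placement hash
    bool max = false;        // "max bit": sorts after every real object

    // Canonical maximum: max bit set, every other field left at its default.
    static ObjectId get_max() {
        ObjectId o;
        o.max = true;
        return o;
    }

    // Flag-only test: true for any value that carries the max bit.
    bool is_max() const { return max; }

    // Field-by-field equality, the kind of comparison that trips the check.
    friend bool operator==(const ObjectId& a, const ObjectId& b) {
        return std::tie(a.max, a.hash, a.name) ==
               std::tie(b.max, b.hash, b.name);
    }
};

// Fragile check: only the canonical maximum counts as "backfill complete".
bool backfill_complete_fragile(const ObjectId& last_backfill) {
    return last_backfill == ObjectId::get_max();
}

// Tolerant check: any value carrying the max bit counts as complete.
bool backfill_complete_robust(const ObjectId& last_backfill) {
    return last_backfill.is_max();
}

// Assumed shape of a last_backfill written by an older release:
// max bit set, but a non-default hash left behind.
ObjectId legacy_last_backfill() {
    ObjectId o = ObjectId::get_max();
    o.hash = 0xffffffffu;
    return o;
}

// Small self-check that exercises both variants on the legacy value.
void self_check() {
    assert(!backfill_complete_fragile(legacy_last_backfill()));
    assert(backfill_complete_robust(legacy_last_backfill()));
}
```

In this model the legacy value fails the equality check, so a fully backfilled replica would be treated as incomplete. The flag-only test is just one way to make the check tolerant of older encodings; it is not necessarily what the merged patch does.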

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 2 Samuel Just 2016-06-01 22:18:59 UTC
This should be considered a blocker for release of 2.0.  It's not a blocker for the beta as long as we advise users upgrading a cluster to the beta to *not set sortbitwise*.

Comment 3 Samuel Just 2016-06-01 22:54:39 UTC
I *think* this should be reproducible by:

1) create a 1.3.2 cluster (3 osds, 1 mon should be fine) with a pool with a bunch of pgs (1000) and size=2
2) create a bunch of objects in that pool (rados bench -t 128 -b 1 for a while)
3) upgrade to 2.0 with noout set (we need to avoid backfill during this part)
4) start running rados bench again
5) while rados bench is running, set sortbitwise

Comment 4 Samuel Just 2016-06-01 23:29:41 UTC
This bug also implies that we aren't setting the sortbitwise flag in our upgrade testing.  That suggests that it isn't covered in the docs either.  We need to address those two things.

Comment 5 Samuel Just 2016-06-06 19:35:22 UTC
Merged into master, pending merge into jewel next.


Comment 6 Samuel Just 2016-06-09 22:48:05 UTC
Merged into jewel. https://github.com/ceph/ceph/pull/9427

Comment 7 Samuel Just 2016-06-09 22:48:56 UTC
Oops, merged into jewel: https://github.com/ceph/ceph/pull/9674

Comment 8 Ken Dreyer (Red Hat) 2016-06-14 16:15:46 UTC
We will take this change in as part of the rebase to ceph 10.2.2.

Comment 10 Vasu Kulkarni 2016-08-12 16:22:35 UTC
Verified in ceph version 10.2.2-38.el7cp (119a68752a5671253f9daae3f894a90313a6b8e4)

With a high number of PGs and too many objects, we have to re-enable the OSD targets and start them again after the chown of /var/lib/ceph. I am not sure why this is required, but I had to do this on one of the OSD nodes to bring its OSDs back up.

The failed run without re-enabling twice: http://pulpito.ceph.redhat.com/vasu-2016-08-11_22:03:24-upgrades-ds-jewel---basic-magna/

Comment 11 Samuel Just 2016-08-12 16:36:05 UTC
2016-08-11T22:32:30.843 INFO:tasks.ceph_deploy:Initiate auto relabel after reboot
2016-08-11T22:32:30.844 INFO:teuthology.orchestra.run.magna007:Running: 'sudo touch /.autorelabel'
2016-08-11T22:32:30.912 INFO:tasks.ceph_deploy:Rebooting node magna007
2016-08-11T22:32:30.913 INFO:teuthology.orchestra.run.magna007:Running: 'sudo reboot'
2016-08-11T22:33:30.919 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'magna007.ceph.redhat.com', 'timeout': 60}
2016-08-11T22:33:48.961 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on
2016-08-11T22:34:18.962 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'magna007.ceph.redhat.com', 'timeout': 60}
2016-08-11T22:35:18.969 DEBUG:teuthology.orchestra.remote:timed out
2016-08-11T22:35:48.971 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'magna007.ceph.redhat.com', 'timeout': 60}
2016-08-11T22:36:48.978 DEBUG:teuthology.orchestra.remote:timed out
2016-08-11T22:37:18.981 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'magna007.ceph.redhat.com', 'timeout': 60}
2016-08-11T22:37:19.318 INFO:teuthology.orchestra.run.magna007:Running: 'true'
2016-08-11T22:38:19.787 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'magna007.ceph.redhat.com', 'timeout': 60}
2016-08-11T22:38:20.102 INFO:teuthology.orchestra.run.magna007:Running: 'true'
2016-08-11T22:38:20.266 INFO:tasks.ceph_deploy:Enable systemd files
2016-08-11T22:38:20.267 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl stop firewalld'
2016-08-11T22:38:20.472 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl enable ceph-mon.target'
2016-08-11T22:38:20.624 INFO:teuthology.orchestra.run.magna007.stderr:Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-mon.target to /usr/lib/systemd/system/ceph-mon.target.
2016-08-11T22:38:20.625 INFO:teuthology.orchestra.run.magna007.stderr:Created symlink from /etc/systemd/system/ceph.target.wants/ceph-mon.target to /usr/lib/systemd/system/ceph-mon.target.
2016-08-11T22:38:20.625 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl enable ceph-osd.target'
2016-08-11T22:38:20.742 INFO:teuthology.orchestra.run.magna007.stderr:Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-osd.target to /usr/lib/systemd/system/ceph-osd.target.
2016-08-11T22:38:20.743 INFO:teuthology.orchestra.run.magna007.stderr:Created symlink from /etc/systemd/system/ceph.target.wants/ceph-osd.target to /usr/lib/systemd/system/ceph-osd.target.
2016-08-11T22:38:20.744 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl enable ceph-radosgw.target'
2016-08-11T22:38:20.855 INFO:teuthology.orchestra.run.magna007.stderr:Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-radosgw.target to /usr/lib/systemd/system/ceph-radosgw.target.
2016-08-11T22:38:20.855 INFO:teuthology.orchestra.run.magna007.stderr:Created symlink from /etc/systemd/system/ceph.target.wants/ceph-radosgw.target to /usr/lib/systemd/system/ceph-radosgw.target.
2016-08-11T22:38:20.856 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl enable ceph.target'
2016-08-11T22:38:20.959 INFO:teuthology.orchestra.run.magna007:Running: 'sudo chown -R ceph:ceph /var/lib/ceph'
2016-08-11T22:51:12.792 INFO:teuthology.orchestra.run.magna007:Running: 'sudo chown -R ceph:ceph /var/log/ceph'
2016-08-11T22:51:17.865 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl stop ceph.target'
2016-08-11T22:51:22.927 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl start ceph.target'
2016-08-11T22:51:27.984 INFO:teuthology.orchestra.run.magna007:Running: 'sudo chown -R ceph:ceph /var/lib/ceph'
2016-08-11T22:51:28.822 INFO:teuthology.orchestra.run.magna007:Running: 'sudo chown -R ceph:ceph /var/log/ceph'
2016-08-11T22:51:33.877 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl stop ceph.target'
2016-08-11T22:51:38.936 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl start ceph.target'
2016-08-11T22:51:43.997 INFO:teuthology.orchestra.run.magna007:Running: 'sudo chown -R ceph:ceph /var/lib/ceph'
2016-08-11T22:51:44.816 INFO:teuthology.orchestra.run.magna007:Running: 'sudo chown -R ceph:ceph /var/log/ceph'
2016-08-11T22:51:49.871 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl stop ceph.target'
2016-08-11T22:51:54.930 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl start ceph.target'
2016-08-11T22:51:59.988 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl start ceph-mon@`hostname`.service'
2016-08-11T22:52:05.066 INFO:teuthology.orchestra.run.magna007:Running: 'sudo systemctl status ceph-mon@`hostname`.service'

I don't immediately see code that does this in the upstream ceph-qa-suite.  I think this is a bug in the way you are doing the upgrade.  You need to stop the daemon *before* doing the chown.

Comment 13 errata-xmlrpc 2016-08-23 19:40:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

