Bug 1471939 - pre-jewel "osd rm" incrementals are misinterpreted
Summary: pre-jewel "osd rm" incrementals are misinterpreted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 1.3.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 2.4
Assignee: Josh Durgin
QA Contact: ceph-qe-bugs
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks: 1473436 1479701
Reported: 2017-07-17 17:47 UTC by Vikhyat Umrao
Modified: 2021-09-09 12:28 UTC (History)

Fixed In Version: RHEL: ceph-2:10.2.7-39.el7cp Ubuntu: ceph_10.2.7-38redhat1
Doc Type: Bug Fix
Doc Text:
.CRUSH calculations for removed OSDs match on kernel clients and the cluster
When an OSD was removed with the `ceph osd rm` command but was still present in the CRUSH map, the CRUSH calculations for that OSD on kernel clients and on the cluster did not match. Consequently, kernel clients returned I/O errors. The mismatch between client and server behavior has been fixed, and kernel clients no longer return I/O errors in this situation.
Clone Of:
Environment:
Last Closed: 2017-10-17 18:12:51 UTC
Embargoed:


Links
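======================== ============== ======= ======== ============ ======================================================= =======================
System                   ID             Private Priority Status       Summary                                                 Last Updated
======================== ============== ======= ======== ============ ======================================================= =======================
Ceph Project Bug Tracker 19119          0       None     None         None                                                    2017-07-17 18:23:18 UTC
Ceph Project Bug Tracker 19210          0       None     None         None                                                    2017-07-17 18:22:56 UTC
Red Hat Issue Tracker    RHCEPH-1525    0       None     None         None                                                    2021-09-09 12:28:35 UTC
Red Hat Product Errata   RHBA-2017:2903 0       normal   SHIPPED_LIVE Red Hat Ceph Storage 2.4 enhancement and bug fix update 2017-10-17 22:12:30 UTC
======================== ============== ======= ======== ============ ======================================================= =======================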

Description Vikhyat Umrao 2017-07-17 17:47:46 UTC
Description of problem:
pre-jewel "osd rm" incrementals are misinterpreted
http://tracker.ceph.com/issues/19119

Upstream PR: https://github.com/ceph/ceph/pull/13730
Release notes: https://github.com/ceph/ceph/pull/13731/files


* There was a bug introduced in Jewel (#19119) that broke the mapping behavior
  when an "out" OSD that still existed in the CRUSH map was removed with
  'osd rm'.  This could result in 'misdirected op' and other errors.  The bug
  is now fixed, but the fix itself introduces the same risk because the
  behavior may vary between clients and OSDs.

To avoid problems, please ensure that all OSDs are removed from the CRUSH map before deleting them.  That is, be sure to do::

   ceph osd crush rm osd.123

before::

   ceph osd rm osd.123
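
The ordering above can be wrapped in a small helper so that the second step never runs unless the first one succeeded. This is only a minimal sketch: the function name ``remove_osd`` and the ``CEPH_BIN`` variable are hypothetical names introduced for illustration, and the only Ceph commands used are the two shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): enforce "crush rm" before "osd rm".
# CEPH_BIN is an assumed override hook; it defaults to the real ceph CLI.
CEPH_BIN="${CEPH_BIN:-ceph}"

remove_osd() {
    osd_id="$1"

    # Accept only a plain numeric OSD id such as 123.
    case "$osd_id" in
        ''|*[!0-9]*)
            echo "usage: remove_osd <numeric-osd-id>" >&2
            return 2
            ;;
    esac

    # Step 1: remove the OSD from the CRUSH map. Stop here on failure so
    # the OSD is never deleted while it is still present in CRUSH.
    "$CEPH_BIN" osd crush rm "osd.${osd_id}" || return 1

    # Step 2: only now delete the OSD itself.
    "$CEPH_BIN" osd rm "osd.${osd_id}"
}
```

After sourcing the snippet, ``remove_osd 123`` runs both commands in the safe order. Setting ``CEPH_BIN=echo`` beforehand prints the commands instead of running them, which may be useful as a dry run.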


A RHEL 7.4 kernel RBD bug, https://bugzilla.redhat.com/show_bug.cgi?id=1427556, depends on this bug.


Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 1.3.3

Comment 4 Ian Colle 2017-08-16 21:49:36 UTC
Moving to 2.5 until 2.4 proper is released.

Comment 14 Josh Durgin 2017-10-12 17:46:04 UTC
Looks good, thanks Bara!

Comment 16 errata-xmlrpc 2017-10-17 18:12:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2903

