Bug 1505365 - compat osdmap encoding does not reencode crush map
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 3.0
Assignee: Sage Weil
QA Contact: shylesh
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-23 12:23 UTC by Sage Weil
Modified: 2018-01-08 16:07 UTC (History)

Fixed In Version: RHEL: ceph-12.2.1-25.el7cp Ubuntu: ceph_12.2.1-28redhat1xenial
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-05 23:49:04 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 21882 | 0 | None | None | None | 2017-10-23 17:26:33 UTC
Red Hat Product Errata | RHBA-2017:3387 | 0 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 3.0 bug fix and enhancement update | 2017-12-06 03:03:45 UTC

Description Sage Weil 2017-10-23 12:23:11 UTC
Description of problem:

When sending an MOSDMap message to clients with an older OSDMap, we do not reencode the CRUSH map in the incremental map. This is imprecise at best. At worst, if a weight-set (specifically a compat weight-set) is in use, it leads to incorrect weights, and client IOs can stall.
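
One way to picture why incorrect weights matter is a toy placement model in which the client and the OSDs compute object placement from different weights. The sketch below is not Ceph code and does not implement CRUSH: it uses a simple weighted rendezvous hash as a stand-in, and the names draw, place, client_view and osd_view are hypothetical. It assumes the client keeps weight 1.0 for osd.0 while the OSDs use the reweighted value 0.5.

```python
import hashlib
import math


def draw(obj, osd, weight):
    """Deterministic pseudo-random draw for (obj, osd), scaled by weight."""
    digest = hashlib.sha256(f"{obj}:{osd}".encode()).digest()
    # Map the first 8 bytes of the hash to a float in (0, 1].
    u = (int.from_bytes(digest[:8], "big") + 1) / (2**64 + 1)
    # log(u) <= 0, so a larger weight pulls the draw closer to zero.
    return math.log(u) / weight


def place(obj, weights):
    """Pick the OSD with the largest weighted draw (toy stand-in for CRUSH)."""
    return max(weights, key=lambda osd: draw(obj, osd, weights[osd]))


# Hypothetical views of the same three-OSD cluster.
osd_view = {"osd.0": 0.5, "osd.1": 1.0, "osd.2": 1.0}     # after the reweight
client_view = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 1.0}  # stale weights

objects = [f"obj-{i}" for i in range(1000)]
mismatched = [o for o in objects if place(o, client_view) != place(o, osd_view)]
print(f"{len(mismatched)} of {len(objects)} objects map to different OSDs")
```

In this toy model, every mismatched object is one that the stale view still sends to osd.0 while the up-to-date view places it elsewhere. Whether the real stall follows exactly this pattern is an assumption, not something this report states.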

How reproducible:

100%

Steps to Reproduce:
1. Mount from a non-mimic client (rbd, cephfs, whatever).
2. ceph osd crush weight-set create-compat
3. ceph osd crush weight-set reweight-compat osd.0 .5
4. Do a bunch of IO. Some IOs will stall from the client's perspective; the OSD will not report them as stalled, and ceph -s will not show slow requests.
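
Steps 2 and 3 above can be wrapped in a small script. This is a minimal sketch, assuming a test cluster with the ceph CLI on the PATH and an existing osd.0. The function name run_repro and the dry-run default are hypothetical conveniences. Only the two ceph commands come from this report; mounting the client and generating IO (steps 1 and 4) are left to the tester.

```python
import shlex
import subprocess

# The two commands from steps 2 and 3 of this report, unchanged.
REPRO_COMMANDS = [
    "ceph osd crush weight-set create-compat",
    "ceph osd crush weight-set reweight-compat osd.0 .5",
]


def run_repro(dry_run=True):
    """Return the argv lists for the repro; execute them only if dry_run is False."""
    planned = []
    for cmd in REPRO_COMMANDS:
        argv = shlex.split(cmd)
        planned.append(argv)
        if not dry_run:
            # check=True raises CalledProcessError if a ceph command fails.
            subprocess.run(argv, check=True)
    return planned


if __name__ == "__main__":
    # Dry run by default so the script never touches a cluster by accident.
    for argv in run_repro(dry_run=True):
        print(" ".join(argv))
```

Call run_repro(dry_run=False) only against a disposable cluster, since the commands change CRUSH weights.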

Comment 5 Christina Meno 2017-10-23 15:35:21 UTC
This is a blocker because it could cause "Data service unavailability when following a prescribed methodology in the documentation".

https://mojo.redhat.com/docs/DOC-1146159

Comment 7 Christina Meno 2017-10-23 15:38:06 UTC
Sage, would you please link the upstream tracker / PR here?
Once we have that, we can seek acks and potentially move to MODIFIED.

Comment 15 errata-xmlrpc 2017-12-05 23:49:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387

