Bug 1262985

Summary: Backport mon: add a cache layer over MonitorDBStore #5524
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Neil Levine <nlevine>
Component: RADOS
Assignee: Ken Dreyer (Red Hat) <kdreyer>
Status: CLOSED ERRATA
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 1.2.3
CC: ceph-eng-bugs, ceph-qe-bugs, dzafman, flucifre, hnallurv, kchai, kdreyer, ksquizza, nlevine, vakulkar
Target Milestone: rc
Target Release: 1.3.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: ceph-0.94.1-19.el7cp (RHEL) ceph v0.94.1.8 (Ubuntu)
Doc Type: Bug Fix
Doc Text:
Prior to this update, poor LevelDB performance in Ceph's monitors could cause spurious elections. This could lead to slow requests during re-balancing. With this update, Ceph now caches the osdmap to be sent to the monitor clients, and cluster performance is improved.
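
The caching idea described in the Doc Text can be illustrated with a minimal sketch. This is not the actual Ceph patch: the class names (CachedStore, DictBackend), the key layout, and the capacity value are all hypothetical, chosen only to show how a read-through LRU cache in front of a slow key-value store avoids repeated reads of the same osdmap epoch.

```python
from collections import OrderedDict


class DictBackend:
    """Hypothetical stand-in for the on-disk store; counts backend reads."""

    def __init__(self, data):
        self.data = data
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data[key]


class CachedStore:
    """Read-through LRU cache over a slower key-value store (illustrative only)."""

    def __init__(self, backend, capacity=2):
        self.backend = backend      # any object exposing get(key)
        self.capacity = capacity    # maximum number of cached entries
        self.cache = OrderedDict()  # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backend.get(key)    # slow path: read from the backend
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return value


# Assumed key layout: (prefix, epoch). Many clients asking for the same epoch
# should cost only one backend read.
backend = DictBackend({("osdmap", 1): b"epoch-1", ("osdmap", 2): b"epoch-2"})
store = CachedStore(backend, capacity=2)
for _ in range(3):
    store.get(("osdmap", 2))
```

In this sketch, three requests for the same epoch result in a single backend read; the remaining two are served from memory.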
Story Points: ---
Clone Of: 1262460
Environment:
Last Closed: 2015-10-08 18:59:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 2 Ken Dreyer (Red Hat) 2015-09-17 20:25:10 UTC
We are fixing this in 1.2.3.2 on Ubuntu (bz 1262460), so it needs to be in the 1.3.0 GA Ubuntu build to avoid regressions for customers.

Upstream's hammer change was https://github.com/ceph/ceph/pull/5697

Comment 3 Neil Levine 2015-09-17 20:28:08 UTC
Good spot. Are you suggesting we apply this as an async on 1.3?

Comment 4 Ken Dreyer (Red Hat) 2015-09-17 21:01:06 UTC
Right, we'll put it into the 1.3.0 Ubuntu GA release, and then fix it as a 1.3.0 RHEL ASYNC update.

Comment 6 Ken Dreyer (Red Hat) 2015-09-24 17:03:58 UTC
Harish wrote the following in email today:

> The plan to verify this bug is as follows:
> 
> 1) On a build without this fix, run the monstore-tool with the leveldb
> cache size set to 10 while IOs are running from clients.
>     Expected: IO failures
>
> 2) On a build with this fix, run the monstore-tool with the leveldb
> cache size set to 10 while IOs are running from clients.
>    Expected: no IO failures
>
> We need ceph-monstore-tool to run in both of the cases above.

Kefu, what exact versions of ceph-monstore-tool does QE need? In case #1 above, do we need to use ceph-test from v0.80.7? Or am I confusing things?

Comment 9 Kefu Chai 2015-09-29 07:46:44 UTC
For the usage of ceph-monstore-tool, see https://bugzilla.redhat.com/show_bug.cgi?id=1263608

Comment 12 errata-xmlrpc 2015-10-08 18:59:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:1882