Back to bug 2042417

Who When What Removed Added
Red Hat One Jira (issues.redhat.com) 2022-01-19 13:51:50 UTC Link ID Red Hat Issue Tracker RHCEPH-3001
Greg Farnum 2022-01-19 20:36:12 UTC Status NEW POST
Vikhyat Umrao 2022-01-20 00:52:48 UTC Flags needinfo?(gfarnum)
Veera Raghava Reddy 2022-01-20 21:04:43 UTC CC vereddy
Greg Farnum 2022-01-21 15:43:40 UTC Flags needinfo?(gfarnum)
Ken Dreyer (Red Hat) 2022-05-24 20:47:27 UTC CC skanta
Keywords Rebase
Status POST MODIFIED
Link ID Github ceph/ceph/pull/44664
CC kdreyer
Fixed In Version ceph-16.2.8-2.el8cp
errata-xmlrpc 2022-05-24 23:37:52 UTC Status MODIFIED ON_QA
Red Hat Bugzilla 2022-05-26 08:30:10 UTC CC ceph-qe-bugs
Pawan 2022-05-31 02:55:15 UTC Status ON_QA VERIFIED
Akash Raj 2022-07-28 10:31:40 UTC Blocks 2102272
Akash Raj 2022-07-28 10:33:15 UTC Blocks 2102272
CC akraj
Flags needinfo?(gfarnum)
Akash Raj 2022-07-28 10:33:58 UTC Blocks 2102272
Greg Farnum 2022-07-28 14:41:38 UTC Doc Type If docs needed, set a value Bug Fix
Flags needinfo?(gfarnum)
Doc Text Cause: Due to a logic error, when operating a cluster in stretch mode it was possible for some PGs to get permanently stuck remapped+peering under certain cluster conditions.

Consequence: Data was unavailable until OSDs were taken offline.

Fix: A logic error caused a peering PG to always choose OSDs that are *not* in the acting set to be the primary OSD, so the selected OSDs pushed responsibility back and forth indefinitely. This logic error has been fixed so that PGs correctly prefer stable OSD sets.

Result: PGs no longer get incorrectly stuck remapped+peering in stretch mode.
Akash Raj 2022-08-03 14:29:46 UTC Doc Text Cause: Due to a logic error, when operating a cluster in stretch mode it was possible for some PGs to get permanently stuck remapped+peering under certain cluster conditions.

Consequence: Data was unavailable until OSDs were taken offline.

Fix: A logic error caused a peering PG to always choose OSDs that are *not* in the acting set to be the primary OSD, so the selected OSDs pushed responsibility back and forth indefinitely. This logic error has been fixed so that PGs correctly prefer stable OSD sets.

Result: PGs no longer get incorrectly stuck remapped+peering in stretch mode.
.PGs no longer get incorrectly stuck in `remapped+peering` state in stretch mode

Previously, due to a logic error, placement groups (PGs) in a cluster operating in stretch mode could get permanently stuck in the `remapped+peering` state under certain cluster conditions, making data unavailable until the OSDs were taken offline.

With this fix, PGs choose stable OSD sets and no longer get incorrectly stuck in the `remapped+peering` state in stretch mode.
Flags needinfo?(gfarnum)
Docs Contact akraj
errata-xmlrpc 2022-08-09 09:56:24 UTC Status VERIFIED RELEASE_PENDING
errata-xmlrpc 2022-08-09 17:37:27 UTC Status RELEASE_PENDING CLOSED
Resolution --- ERRATA
Last Closed 2022-08-09 17:37:27 UTC
errata-xmlrpc 2022-08-09 17:38:03 UTC Link ID Red Hat Product Errata RHSA-2022:5997
Greg Farnum 2022-08-10 12:46:50 UTC Flags needinfo?(gfarnum) needinfo-