Back to bug 2040528

Who When What Removed Added
Vikhyat Umrao 2022-01-14 00:06:00 UTC Link ID Github ceph/ceph/pull/44015
Vikhyat Umrao 2022-01-14 00:06:30 UTC Link ID Ceph Project Bug Tracker 53876
Red Hat One Jira (issues.redhat.com) 2022-01-14 00:09:34 UTC Link ID Red Hat Issue Tracker RHCEPH-2969
Vikhyat Umrao 2022-01-14 00:14:08 UTC Link ID Github ceph/ceph/pull/44584
Vikhyat Umrao 2022-01-14 00:19:45 UTC CC tserlin
Flags needinfo?(tserlin)
Vikhyat Umrao 2022-01-20 00:44:09 UTC Flags needinfo?(tserlin)
Severity medium high
Status NEW ASSIGNED
Target Release 5.2 5.1
Vikhyat Umrao 2022-01-20 00:47:29 UTC Flags needinfo?(tserlin)
Vikhyat Umrao 2022-01-20 00:49:06 UTC Doc Type If docs needed, set a value Bug Fix
Veera Raghava Reddy 2022-01-26 16:46:10 UTC Flags needinfo?(tserlin)
CC vereddy
Vikhyat Umrao 2022-01-26 16:56:37 UTC Status ASSIGNED POST
Vikhyat Umrao 2022-01-26 16:57:39 UTC Flags needinfo?(tserlin)
Vikhyat Umrao 2022-01-26 18:21:59 UTC Flags needinfo?(vumrao)
Flags needinfo?(vumrao) needinfo?(tserlin)
errata-xmlrpc 2022-01-27 03:16:12 UTC Flags needinfo?(tserlin)
Flags needinfo?(tserlin)
Status POST MODIFIED
Fixed In Version ceph-16.2.7-37.el8cp
Status MODIFIED ON_QA
Aron Gunn 2022-01-27 20:23:45 UTC CC agunn
Doc Type Bug Fix If docs needed, set a value
Pawan 2022-01-28 12:22:35 UTC Flags needinfo?(vumrao)
Vikhyat Umrao 2022-01-28 18:27:32 UTC CC stephen.blinick
Vikhyat Umrao 2022-01-28 21:52:54 UTC Flags needinfo?(vumrao)
Pawan 2022-01-29 07:49:41 UTC Status ON_QA VERIFIED
Orit Wasserman 2022-02-08 14:35:35 UTC CC owasserm
Vikhyat Umrao 2022-02-28 19:06:46 UTC Blocks 2059329
Ranjini M N 2022-03-29 15:04:55 UTC Doc Type If docs needed, set a value Bug Fix
Flags needinfo?(nojha)
CC rmandyam
Neha Ojha 2022-03-31 23:09:35 UTC Flags needinfo?(nojha)
Doc Text Cause: In the existing read lease implementation[0], the primary OSD clears history prior_readable_until_ub (upper bound on how long prior interval is readable) early in the peering stage and this information is also propagated to its peers.

Consequence: In circumstances, when the primary OSD restarts, knowledge about prior interval in no longer available, which may make PGs go into WAIT state (a state where the PG is waiting for prior intervals' readable period to expire). Any OSD requests will block during that period.

Fix: Clear prior_readable_until_ub at the end of the peering cycle, just before activating, after we have already talked to the peer OSDs and they know that the prior interval has finished.

Result: PGs no longer go into WAIT state after OSD restarts.

[0] https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/
Ranjini M N 2022-04-01 12:13:03 UTC Flags needinfo?(nojha)
Doc Text Cause: In the existing read lease implementation[0], the primary OSD clears history prior_readable_until_ub (upper bound on how long prior interval is readable) early in the peering stage and this information is also propagated to its peers.

Consequence: In circumstances, when the primary OSD restarts, knowledge about prior interval in no longer available, which may make PGs go into WAIT state (a state where the PG is waiting for prior intervals' readable period to expire). Any OSD requests will block during that period.

Fix: Clear prior_readable_until_ub at the end of the peering cycle, just before activating, after we have already talked to the peer OSDs and they know that the prior interval has finished.

Result: PGs no longer go into WAIT state after OSD restarts.

[0] https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/
.The `prrio_readable_until_ub` parameter is cleared at the end of the peering cycle

Previously, under circumstances when the primary OSD restarted, the knowledge about the prior interval was unavailable as the `prior_readable_until_ub` parameter was peered early in the peering stage which was propagated to its peers. This would cause the placement groups (PGs) to go into a WAIT state and this would block any OSD request during that period.

With this release, the `prior_readable_until_ub` parameter is cleared at the end of the peering cycle, just before activating, after communicating to the peer OSDs and the PGs no longer go into WAIT state after the OSD is restarted.
Blocks 2031073
Neha Ojha 2022-04-01 17:25:00 UTC Flags needinfo?(nojha)
Ranjini M N 2022-04-04 06:33:09 UTC Doc Text .The `prrio_readable_until_ub` parameter is cleared at the end of the peering cycle

Previously, under circumstances when the primary OSD restarted, the knowledge about the prior interval was unavailable as the `prior_readable_until_ub` parameter was peered early in the peering stage which was propagated to its peers. This would cause the placement groups (PGs) to go into a WAIT state and this would block any OSD request during that period.

With this release, the `prior_readable_until_ub` parameter is cleared at the end of the peering cycle, just before activating, after communicating to the peer OSDs and the PGs no longer go into WAIT state after the OSD is restarted.
.The `prior_readable_until_ub` parameter is cleared at the end of the peering cycle

Previously, under circumstances when the primary OSD restarted, the knowledge about the prior interval was unavailable as the `prior_readable_until_ub` parameter was cleared early in the peering stage which was propagated to its peers. This would cause the placement groups (PGs) to go into a WAIT state and this would block any OSD request during that period.

With this release, the `prior_readable_until_ub` parameter is cleared at the end of the peering cycle, just before activating, after communicating to the peer OSDs and the PGs no longer go into WAIT state after the OSD is restarted.
Ranjini M N 2022-04-04 06:36:27 UTC Doc Text .The `prior_readable_until_ub` parameter is cleared at the end of the peering cycle

Previously, under circumstances when the primary OSD restarted, the knowledge about the prior interval was unavailable as the `prior_readable_until_ub` parameter was cleared early in the peering stage which was propagated to its peers. This would cause the placement groups (PGs) to go into a WAIT state and this would block any OSD request during that period.

With this release, the `prior_readable_until_ub` parameter is cleared at the end of the peering cycle, just before activating, after communicating to the peer OSDs and the PGs no longer go into WAIT state after the OSD is restarted.
.The `prior_readable_until_ub` parameter is cleared at the end of the peering cycle

Previously, under circumstances when the primary OSD restarted, the knowledge about the prior interval was unavailable as the `prior_readable_until_ub` parameter, which stands for the upper bound on how long prior interval is readable for a PG, was cleared early in the peering stage which was propagated to its peers.
This would cause the placement groups (PGs) to go into a WAIT state and this would block any OSD request during that period.

With this release, the `prior_readable_until_ub` parameter is cleared at the end of the peering cycle, just before activating, after communicating to the peer OSDs and the PGs no longer go into WAIT state after the OSD is restarted.
errata-xmlrpc 2022-04-04 08:04:31 UTC Status VERIFIED RELEASE_PENDING
errata-xmlrpc 2022-04-04 10:23:35 UTC Resolution --- ERRATA
Status RELEASE_PENDING CLOSED
Last Closed 2022-04-04 10:23:35 UTC
errata-xmlrpc 2022-04-04 10:24:02 UTC Link ID Red Hat Product Errata RHSA-2022:1174
