Bug 2240839 - [5.3 backport][RADOS] "currently delayed" slow ops does not provide details on why op has been delayed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 5.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 5.3z6
Assignee: Prashant Dhange
QA Contact: Pawan
Docs Contact: Ranjini M N
URL:
Whiteboard:
Depends On: 2240832
Blocks: 2240838 2258797
Reported: 2023-09-26 21:00 UTC by Vikhyat Umrao
Modified: 2024-02-08 16:56 UTC (History)

Fixed In Version: ceph-16.2.10-238.el8cp
Doc Type: Enhancement
Doc Text:
.New reports available for sub-events for delayed operations
Previously, slow operations were marked as delayed but without a detailed description. With this enhancement, you can view the detailed descriptions of delayed sub-events for operations.
Clone Of: 2240832
Environment:
Last Closed: 2024-02-08 16:55:56 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 62996 | 0 | None | None | None | 2023-09-26 23:22:50 UTC
Github | ceph ceph pull 53693 | 0 | None | open | pacific: osd/OpRequest: Add detail description for delayed op in osd log file | 2023-09-27 06:32:22 UTC
Red Hat Issue Tracker | RHCEPH-7555 | 0 | None | None | None | 2023-09-26 21:02:15 UTC
Red Hat Product Errata | RHSA-2024:0745 | 0 | None | None | None | 2024-02-08 16:56:06 UTC

Description Vikhyat Umrao 2023-09-26 21:00:31 UTC
+++ This bug was initially created as a clone of Bug #2240832 +++

Description of problem:
With reference to BZ#2240819, osd.0 reported slow ops, most of which were marked as delayed, but with no detail on why each op was delayed, e.g. whether it was "waiting for rw locks", "waiting for missing objects", "waiting for peered", etc.

An op can be marked as delayed for several different reasons; it could be any of the following:
  op->mark_delayed("waiting for missing object");
  op->mark_delayed("waiting for degraded object");
  op->mark_delayed("waiting for cache not full");
  op->mark_delayed("waiting for clean to repair");
  op->mark_delayed("waiting for blocked object");
  op->mark_delayed("waiting for readable");
  op->mark_delayed("waiting for readable");
  op->mark_delayed("waiting for scrub");
  op->mark_delayed("waiting for readable");
  op->mark_delayed("waiting_for_map not empty");
  op->mark_delayed("waiting for peered");
  op->mark_delayed("waiting for flush");
  op->mark_delayed("waiting for active");
  op->mark_delayed("waiting for scrub");
  op->mark_delayed("waiting for ondisk");
  op->mark_delayed("waiting for rw locks");
  op->mark_delayed("waiting for scrub");
  op->mark_delayed("waiting for scrub");
  op->mark_delayed("waiting for missing object");
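
For illustration only, the gap can be modeled with a toy class. This is a minimal sketch and not Ceph's actual OpRequest code: `ToyOp`, `state_without_detail()` and `state_with_detail()` are hypothetical names introduced here, and the "delayed (reason)" output format is an assumption. Only the `mark_delayed("...")` call shape and the reason strings come from the list above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of a tracked op; NOT the real Ceph OpRequest class.
struct ToyOp {
  std::vector<std::string> events;  // delay reasons, in the order recorded
  bool delayed = false;

  // Same call shape as in the list above: flag the op and keep the reason.
  void mark_delayed(const std::string& reason) {
    delayed = true;
    events.push_back(reason);
  }

  // Reports only the flag, which is the behavior this bug complains about.
  std::string state_without_detail() const {
    return delayed ? "delayed" : "started";
  }

  // Reports the flag plus the most recent reason (assumed output format).
  std::string state_with_detail() const {
    if (!delayed) {
      return "started";
    }
    assert(!events.empty());  // mark_delayed always records a reason
    return "delayed (" + events.back() + ")";
  }
};
```

With this model, an op marked with "waiting for rw locks" reports just "delayed" from the first helper and "delayed (waiting for rw locks)" from the second, which is the kind of detail requested under Expected results.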

Version-Release number of selected component (if applicable):
RHCS 7

How reproducible:
Frequently

Steps to Reproduce:
1. Deploy ceph cluster
2. Run extensive client workload against the ceph cluster 
3. Observe "currently delayed" slow ops

Actual results:
The delayed ops do not provide details on the reason for the op being flagged as delayed

Expected results:
The delayed ops should provide details on the reason for the op being flagged as delayed

Additional info:

--- Additional comment from Vikhyat Umrao on 2023-09-26 20:44:44 UTC ---

Marking this one as a blocker because it is a kind of regression and is causing issues in troubleshooting slow requests!

--- Additional comment from Vikhyat Umrao on 2023-09-26 20:56:51 UTC ---

The issue was reported in ODF 4.10, which corresponds to 5.1.z2 (16.2.7-126), hence changing the reported version to 5.1!

Comment 8 errata-xmlrpc 2024-02-08 16:55:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 Security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:0745

