Bug 2189920 - osd already out. but still have slow request on it. [NEEDINFO]
Summary: osd already out. but still have slow request on it.
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 6.1z2
Assignee: Prashant Dhange
QA Contact: Pawan
URL:
Whiteboard:
Duplicates: 2189921
Depends On:
Blocks:
Reported: 2023-04-26 12:52 UTC by shiqi
Modified: 2023-07-12 12:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
pdhiran: needinfo? (pdhange)




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 50637 | 0 | None | None | None | 2023-06-09 21:50:00 UTC
Github | ceph ceph pull 50543 | 0 | None | open | mgr: Donot report slow ops warning if osd is down+out | 2023-06-09 21:50:00 UTC
Red Hat Issue Tracker | RHCEPH-6572 | 0 | None | None | None | 2023-04-26 12:53:19 UTC

Description shiqi 2023-04-26 12:52:29 UTC
Description of problem:
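The OSD was already marked out, but slow requests were still reported against it. In the monitor log below, osd.15 is marked out at 09:38:45 while it is still up, yet it stays in the list of implicated OSDs until it marks itself down at 09:56:30. The cluster reports healthy at 10:05:29.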

2023-04-18 09:36:58.985728 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 104 slow requests are blocked > 32 sec. Implicated osds 15,46 (REQUEST_SLOW)
2023-04-18 09:37:58.829200 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 114 slow requests are blocked > 32 sec. Implicated osds 15,46 (REQUEST_SLOW)
2023-04-18 09:38:03.829544 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 119 slow requests are blocked > 32 sec. Implicated osds 15,46,91 (REQUEST_SLOW)

2023-04-18 09:38:45.081818 7f4649923700  0 mon.N-PC-SRH310-187@0(leader) e1 handle_command mon_command({"prefix": "osd out", "ids": ["15"]} v 0) v1
2023-04-18 09:38:45.081867 7f4649923700  0 log_channel(audit) log [INF] : from='client.869908682 -' entity='client.admin' cmd=[{"prefix": "osd out", "ids": ["15"]}]: dispatch
2023-04-18 09:38:45.081988 7f4649923700  0 log_channel(cluster) log [INF] : Client client.admin marked osd.15 out, while it was still marked up
2023-04-18 09:38:46.123486 7f464511a700  1 mon.N-PC-SRH310-187@0(leader).osd e30192 e30192: 288 total, 283 up, 266 in
2023-04-18 09:38:46.143387 7f464511a700  0 log_channel(audit) log [INF] : from='client.869908682 -' entity='client.admin' cmd='[{"prefix": "osd out", "ids": ["15"]}]': finished

2023-04-18 09:39:06.392868 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 129 slow requests are blocked > 32 sec. Implicated osds 15,150,224 (REQUEST_SLOW)
2023-04-18 09:39:39.115526 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 176 slow requests are blocked > 32 sec. Implicated osds 15,60,91,99,125,129,146,150,157,224 (REQUEST_SLOW)
2023-04-18 09:40:06.778683 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 76 slow requests are blocked > 32 sec. Implicated osds 15,224 (REQUEST_SLOW)
2023-04-18 09:40:30.580214 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 82 slow requests are blocked > 32 sec. Implicated osds 15,129,224 (REQUEST_SLOW)
2023-04-18 09:40:37.028340 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 75 slow requests are blocked > 32 sec. Implicated osds 15,224 (REQUEST_SLOW)
2023-04-18 09:41:01.157178 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 11 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:41:11.173565 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 6 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:44:50.783392 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 19 slow requests are blocked > 32 sec. Implicated osds 15,110 (REQUEST_SLOW)
2023-04-18 09:44:56.580815 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 22 slow requests are blocked > 32 sec. Implicated osds 15,17,236,247 (REQUEST_SLOW)
2023-04-18 09:45:17.868877 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 26 slow requests are blocked > 32 sec. Implicated osds 15,236 (REQUEST_SLOW)
2023-04-18 09:45:26.620679 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 28 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:45:30.987311 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 29 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:45:40.603139 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 30 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:45:50.918547 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 35 slow requests are blocked > 32 sec. Implicated osds 15,110 (REQUEST_SLOW)
2023-04-18 09:46:00.993006 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 37 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:46:10.744300 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 41 slow requests are blocked > 32 sec. Implicated osds 15 (REQUEST_SLOW)
2023-04-18 09:46:19.179848 7f464c128700  0 log_channel(cluster) log [WRN] : Health check failed: 48 slow requests are blocked > 32 sec. Implicated osds 15,236,247 (REQUEST_SLOW)
2023-04-18 09:46:55.641033 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 70 slow requests are blocked > 32 sec. Implicated osds 6,15,16,19,56,60,70,91,109,110,125,211,213,236 (REQUEST_SLOW)
2023-04-18 09:56:22.814425 7f464c128700  0 log_channel(cluster) log [WRN] : Health check update: 118 slow requests are blocked > 32 sec. Implicated osds 15,213,236 (REQUEST_SLOW)

2023-04-18 09:56:30.503006 7f4649923700  0 log_channel(cluster) log [INF] : osd.15 marked itself down

2023-04-18 10:05:29.076344 7f464c128700  0 log_channel(cluster) log [INF] : Cluster is now healthy
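
To check a longer monitor log for the same pattern, the excerpt above can be scanned with a short script. The following is only a minimal sketch, assuming the log lines have exactly the format quoted in this report; the function name slow_after_out and the regular expressions are hypothetical helpers written for illustration and are not part of Ceph.

```python
import re
from datetime import datetime

# Patterns assume the monitor log format quoted in this report.
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
OUT_RE = re.compile(TS + r" .*marked osd\.(\d+) out")
SLOW_RE = re.compile(
    TS + r" .*: (\d+) slow requests are blocked.*Implicated osds ([\d,]+)"
)


def parse_ts(text):
    """Convert a log timestamp such as '2023-04-18 09:38:45.081988'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def slow_after_out(lines):
    """Return {osd_id: [(timestamp, slow_request_count), ...]} for every
    REQUEST_SLOW warning that implicates an OSD after it was marked out."""
    out_time = {}  # osd id -> time it was marked out
    hits = {}
    for line in lines:
        m = OUT_RE.match(line)
        if m:
            out_time[int(m.group(2))] = parse_ts(m.group(1))
            continue
        m = SLOW_RE.match(line)
        if not m:
            continue
        ts = parse_ts(m.group(1))
        count = int(m.group(2))
        for osd in (int(x) for x in m.group(3).split(",")):
            if osd in out_time and ts > out_time[osd]:
                hits.setdefault(osd, []).append((ts, count))
    return hits
```

Run against the excerpt above, this would flag osd.15 for every warning logged between 09:39:06 and 09:56:22, since each of those lines still lists it as implicated after the "marked osd.15 out" entry at 09:38:45.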

Version-Release number of selected component (if applicable):
RHCS3.1

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Prashant Dhange 2023-06-09 21:50:49 UTC
*** Bug 2189921 has been marked as a duplicate of this bug. ***

Comment 4 Scott Ostapovicz 2023-07-12 12:39:51 UTC
Missed the 6.1 z1 window.  Retargeting to 6.1 z2.

