Bug 1955782
| Summary: | Ceph Monitors incorrectly report slow operations | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Neha Ojha <nojha> |
| Component: | RADOS | Assignee: | Kefu Chai <kchai> |
| Status: | CLOSED ERRATA | QA Contact: | skanta |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.2 | CC: | akupczyk, bhubbard, ceph-eng-bugs, dzafman, gfarnum, gsitlani, jbasquil, kchai, kjosy, mmurthy, nojha, owasserm, pdhiran, r.martinez, rzarzyns, skanta, sseshasa, tserlin, vereddy, vumrao |
| Target Milestone: | --- | | |
| Target Release: | 4.2z2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-14.2.11-177.el8cp, ceph-14.2.11-177.el7cp | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1905339 | Environment: | |
| Last Closed: | 2021-06-15 17:14:17 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1905339 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Created attachment 1785220 [details]
Error snippet
It turns out the fix does not work in all cases; https://github.com/ceph/ceph/pull/41516 was created to address this issue.

Moving the bug to the verified state with the following successful steps:
1. [root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]# ceph osd lspools
1 rbd
[root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]#
2. [root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]# rados bench -p rbd 300 write -b 8192 --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 8192 bytes to objects of size 8192 for up to 300 seconds or 0 objects
Object prefix: benchmark_data_ceph-bharath-1622458349362-no_86661
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 1102 1086 8.48496 8.48438 0.0125168 0.014662
2 16 2065 2049 8.00364 7.52344 0.0121516 0.0155233
3 16 2862 2846 7.41102 6.22656 0.0150411 0.016823
4 16 3976 3960 7.73377 8.70312 0.0111872 0.0161329
5 16 4812 4796 7.49313 6.53125 0.0120251 0.0166651
6 16 5756 5740 7.4733 7.375 0.0139815 0.0167099
7 16 6828 6812 7.60198 8.375 0.00679112 0.0164136
.................................................................................
...............................................................................
297 16 320093 320077 8.41853 10.2734 0.0206678 0.0148473
298 16 321517 321501 8.42761 11.125 0.0100201 0.0148315
299 16 322814 322798 8.43331 10.1328 0.0157222 0.0148213
Total time run: 300.007
Total writes made: 324097
Write size: 8192
Object size: 8192
Bandwidth (MB/sec): 8.43983
Stddev Bandwidth: 1.36136
Max bandwidth (MB/sec): 12.1797
Min bandwidth (MB/sec): 3.26562
Average IOPS: 1080
Stddev IOPS: 174.254
Max IOPS: 1559
Min IOPS: 418
Average Latency(s): 0.0148102
Stddev Latency(s): 0.0133563
Max latency(s): 0.434727
Min latency(s): 0.00250338
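As a sanity check, the summary figures above are internally consistent: bandwidth is simply total bytes written divided by elapsed time, and average IOPS is total writes divided by elapsed time. A minimal sketch using the totals from this run:

```python
# Recompute the rados bench summary from its raw totals (values from the run above).
total_writes = 324097   # "Total writes made"
write_size = 8192       # bytes per write ("Write size")
total_time = 300.007    # "Total time run" in seconds

# Bandwidth in MB/sec (MiB, as rados bench reports it).
bandwidth_mb_s = total_writes * write_size / total_time / (1024 * 1024)
avg_iops = total_writes / total_time

print(f"Bandwidth (MB/sec): {bandwidth_mb_s:.5f}")  # ~8.43983, matching the report
print(f"Average IOPS: {avg_iops:.0f}")              # ~1080, matching the report
```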
3. [root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]# ceph -s
cluster:
id: b6dabf91-1c96-45ee-9635-92961f393f9c
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-bharath-1622458349362-node2-mon,ceph-bharath-1622458349362-node3-mon-osd,ceph-bharath-1622458349362-node1-mon-mgr-installer (age 19m)
mgr: ceph-bharath-1622458349362-node1-mon-mgr-installer(active, since 19m)
osd: 14 osds: 14 up (since 16m), 14 in (since 16m)
data:
pools: 1 pools, 64 pgs
objects: 324.10k objects, 2.5 GiB
usage: 101 GiB used, 339 GiB / 441 GiB avail
pgs: 64 active+clean
4. [root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]# ceph daemon mon.`hostname` ops
{
"ops": [],
"num_ops": 0
}
[root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]#
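The `ops` dump from the monitor admin socket is plain JSON, so an empty in-flight queue can be checked programmatically. A minimal sketch, parsing the literal output captured above (the `has_slow_ops` helper is hypothetical, not part of Ceph):

```python
import json

# Output of `ceph daemon mon.$(hostname) ops` as captured above; in a live
# check this string would come from the admin socket instead of a literal.
ops_dump = '{ "ops": [], "num_ops": 0 }'

def has_slow_ops(dump_json: str) -> bool:
    """Return True if the monitor reports any in-flight ops (hypothetical helper)."""
    data = json.loads(dump_json)
    return data["num_ops"] > 0

print(has_slow_ops(ops_dump))  # False: no ops stuck on the monitor after the fix
```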
Ceph version:
[root@ceph-bharath-1622458349362-node1-mon-mgr-installer cephuser]# ceph -v
ceph version 14.2.11-177.el8cp (0486420967ea3327d3ba01d3184f3ab96ddaa616) nautilus (stable)
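The installed build can be checked against the "Fixed In Version" (ceph-14.2.11-177) by comparing the upstream version plus the downstream release number. A minimal sketch; `build_tuple` is a hypothetical helper, and the el8cp/el7cp dist suffix is assumed to be stripped beforehand:

```python
def build_tuple(version: str) -> tuple:
    """Parse 'X.Y.Z-N' into a comparable tuple, e.g. '14.2.11-177' -> (14, 2, 11, 177)."""
    base, _, release = version.partition("-")
    return tuple(int(p) for p in base.split(".")) + (int(release),)

installed = "14.2.11-177"  # from `ceph -v` above, dist suffix dropped
fixed = "14.2.11-177"      # first fixed build for this bug

print(build_tuple(installed) >= build_tuple(fixed))  # True: the fix is present
```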
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2445

*** Bug 1890899 has been marked as a duplicate of this bug. ***
Facing the issue after performing the following steps:

[root@ceph-bharath-1621408854173-node1-mon-mgr-installer cephuser]# ceph osd lspools
1 rbd
[root@ceph-bharath-1621408854173-node1-mon-mgr-installer cephuser]# rados bench -p rbd 300 write -b 8192 --no-cleanup
.........................................
.........................................
296 16 273341 273325 7.21304 7.8125 0.0542765 0.0173262
297 16 274429 274413 7.21737 8.5 0.00938193 0.0173185
298 16 275714 275698 7.22684 10.0391 0.0118252 0.0172957
299 16 277085 277069 7.23849 10.7109 0.0120228 0.017268
Total time run: 300.007
Total writes made: 278447
Write size: 8192
Object size: 8192
Bandwidth (MB/sec): 7.25105
Stddev Bandwidth: 1.56969
Max bandwidth (MB/sec): 10.8359
Min bandwidth (MB/sec): 1.57812
Average IOPS: 928
Stddev IOPS: 200.921
Max IOPS: 1387
Min IOPS: 202
Average Latency(s): 0.0172383
Stddev Latency(s): 0.0188797
Max latency(s): 0.602089
Min latency(s): 0.0024007
[root@ceph-bharath-1621408854173-node1-mon-mgr-installer cephuser]# ceph daemon mon.`hostname` ops
{
"ops": [],
"num_ops": 0
}
[root@ceph-bharath-1621408854173-node1-mon-mgr-installer cephuser]# ceph -s
cluster:
id: 4fc966cd-df20-4772-8703-7fd99fd7355b
health: HEALTH_WARN
Long heartbeat ping times on back interface seen, longest is 63736.858 msec
Long heartbeat ping times on front interface seen, longest is 63736.353 msec
8 slow ops, oldest one blocked for 1277 sec, mon.ceph-bharath-1621408854173-node2-mon has slow ops
services:
mon: 3 daemons, quorum ceph-bharath-1621408854173-node2-mon,ceph-bharath-1621408854173-node3-mon-osd,ceph-bharath-1621408854173-node1-mon-mgr-installer (age 13m)
mgr: ceph-bharath-1621408854173-node1-mon-mgr-installer(active, since 11h)
osd: 11 osds: 11 up (since 13m), 11 in (since 13m)
data:
pools: 1 pools, 64 pgs
objects: 278.45k objects, 2.1 GiB
usage: 86 GiB used, 239 GiB / 325 GiB avail
pgs: 64 active+clean
[root@ceph-bharath-1621408854173-node1-mon-mgr-installer cephuser]# ceph -v
ceph version 14.2.11-170.el8cp (b49a031f4d70d49462afb70f730b6b346effdd14) nautilus (stable)

Error snippet is attached.
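The "8 slow ops, oldest one blocked for 1277 sec" warning fires once an op has been in flight on the monitor longer than its complaint threshold (the `mon_op_complaint_time` option; its default is assumed to be 30 seconds here). A minimal sketch with the value from the failing run above:

```python
# Threshold after which the monitor flags an op as slow
# (mon_op_complaint_time; 30 s default assumed).
COMPLAINT_TIME = 30.0

oldest_blocked = 1277.0  # "oldest one blocked for 1277 sec" from ceph -s above

def warns(blocked_secs: float, threshold: float = COMPLAINT_TIME) -> bool:
    """Return True if a blocked op would trigger the slow-ops health warning."""
    return blocked_secs > threshold

print(warns(oldest_blocked))  # True: far past the threshold, hence HEALTH_WARN
```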