Description of problem:

I am performing a benchmarking activity where I am supposed to run a workload through COSBench. As soon as I apply the workload, the OSDs in the Ceph cluster start to flap.

Version-Release number of selected component (if applicable):

Image: registry.redhat.io/rhceph/rhceph-4-rhel8:latest
ceph version 14.2.4-125.el8cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)

How reproducible:

Always, whenever I apply load.

Steps to Reproduce:
1. RHCS 4.0 containerized deployment
2. 12 x RGW instances
3. Run COSBench workload (file attached)

Actual results:
- As soon as I run the COSBench workload, OSDs start to flap and the entire cluster becomes unstable.

Expected results:
- The workload should complete without performance degradation.
- OSDs should not flap.

Additional info:

Ceph.conf in use
=================

[client.rgw.rgw-5.rgw0]
host = rgw-5
keyring = /var/lib/ceph/radosgw/ceph-rgw.rgw-5.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-rgw-5.rgw0.log
rgw frontends = beast endpoint=172.18.7.55:8080
rgw thread pool size = 1024

[client.rgw.rgw-5.rgw1]
host = rgw-5
keyring = /var/lib/ceph/radosgw/ceph-rgw.rgw-5.rgw1/keyring
log file = /var/log/ceph/ceph-rgw-rgw-5.rgw1.log
rgw frontends = beast endpoint=172.18.7.55:8081
rgw thread pool size = 1024

[global]
bluefs_buffered_io = False
cluster network = 172.18.7.0/24
fsid = ebe0aa4b-4fb5-4c68-84ab-cbf1118937a2
log_file = /dev/null
max_open_files = 500000
mon host = [v2:172.18.7.51:3300,v1:172.18.7.51:6789],[v2:172.18.7.52:3300,v1:172.18.7.52:6789],[v2:172.18.7.53:3300,v1:172.18.7.53:6789]
mon_allow_pool_delete = True
mon_max_pg_per_osd = 1000
ms_dispatch_throttle_bytes = 1048576000
objecter_inflight_op_bytes = 1048576000
objecter_inflight_ops = 5120
osd_enable_op_tracker = False
osd_max_pg_log_entries = 10
osd_min_pg_log_entries = 10
osd_pg_log_dups_tracked = 10
osd_pg_log_trim_min = 10
public network = 172.18.7.0/24
rgw_list_buckets_max_chunk = 999999
osd_op_thread_timeout = 900
osd_op_thread_suicide_timeout = 2000
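Not part of the original report, but for anyone trying to reproduce this: a minimal sketch of how the flapping can be watched while the COSBench load is running. These are standard Ceph CLI commands; osd.399 and the log paths are examples/assumptions, so adjust for your deployment (containerized daemons keep their logs inside the container or under the host's /var/log/ceph bind mount).

# Watch OSD up/down counts and any OSDs currently marked down (run on a mon/admin node)
watch -n 10 'ceph osd stat; ceph osd tree down'

# Look for mark-down events in the cluster log (path is an assumption)
grep -i "marked down" /var/log/ceph/ceph.log | tail -n 20

# Ask a flapping OSD whether it considered the mark-down wrong (heartbeat issue vs. crash)
grep -i "wrongly marked me down" /var/log/ceph/ceph-osd.399.log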
Created attachment 1686216 [details]
Logs from one of the OSDs

Attached are the logs from one of the OSDs.
Here are some Ceph command outputs:

[root@rgw-4 /]# ceph -s
  cluster:
    id:     ebe0aa4b-4fb5-4c68-84ab-cbf1118937a2
    health: HEALTH_WARN
            1 osds down
            Reduced data availability: 3 pgs inactive, 3 pgs incomplete
            Degraded data redundancy: 448392/146731677 objects degraded (0.306%), 345 pgs degraded
            24 slow ops, oldest one blocked for 4014 sec, daemons [mon,rgw-1,mon,rgw-2,mon,rgw-3] have slow ops.

  services:
    mon: 3 daemons, quorum rgw-1,rgw-2,rgw-3 (age 47m)
    mgr: rgw-5(active, since 38m), standbys: rgw-4, rgw-6
    osd: 318 osds: 317 up (since 3s), 318 in (since 47h); 21 remapped pgs
    rgw: 12 daemons active (rgw-1.rgw0, rgw-1.rgw1, rgw-2.rgw0, rgw-2.rgw1, rgw-3.rgw0, rgw-3.rgw1, rgw-4.rgw0, rgw-4.rgw1, rgw-5.rgw0, rgw-5.rgw1, rgw-6.rgw0, rgw-6.rgw1)

  task status:

  data:
    pools:   7 pools, 21760 pgs
    objects: 24.46M objects, 1.4 TiB
    usage:   260 TiB used, 4.5 PiB / 4.8 PiB avail
    pgs:     0.142% pgs not active
             448392/146731677 objects degraded (0.306%)
             21395 active+clean
             269   active+undersized+degraded+remapped+backfill_wait
             41    active+recovery_wait+undersized+degraded+remapped
             28    activating+undersized+degraded+remapped
             14    active+undersized
             4     active+undersized+degraded
             3     remapped+incomplete
             3     active+undersized+remapped+backfill_wait
             2     active+undersized+degraded+remapped+backfilling
             1     active+recovering+undersized+degraded+remapped

  io:
    client:   1.7 KiB/s rd, 1 op/s rd, 0 op/s wr
    recovery: 9.8 KiB/s, 0 objects/s

[root@rgw-4 /]# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       4.8 PiB     4.5 PiB     260 TiB     260 TiB           5.34
    TOTAL     4.8 PiB     4.5 PiB     260 TiB     260 TiB           5.34

POOLS:
    POOL                          ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    .rgw.root                     10     1.2 KiB           4     768 KiB         0       1.4 PiB
    default.rgw.control           11         0 B           8         0 B         0       1.5 PiB
    default.rgw.meta              12     302 KiB       1.25k     235 MiB         0       1.4 PiB
    default.rgw.log               13         0 B         208         0 B         0       1.4 PiB
    default.rgw.buckets.index     14         0 B         626         0 B         0       1.4 PiB
    default.rgw.data.root         15         0 B           0         0 B         0       1.4 PiB
    default.rgw.buckets.data      16     1.5 TiB      24.45M     8.7 TiB      0.20       2.9 PiB

[root@rgw-4 /]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 4878.71826 root default -7 813.11523 host rgw-1 0 hdd 15.34180 osd.0 up 1.00000 1.00000 1 hdd 15.34180 osd.1 up 1.00000 1.00000 2 hdd 15.34180 osd.2 up 1.00000 1.00000 3 hdd 15.34180 osd.3 up 1.00000 1.00000 4 hdd 15.34180 osd.4 up 1.00000 1.00000 5 hdd 15.34180 osd.5 up 1.00000 1.00000 6 hdd 15.34180 osd.6 up 1.00000 1.00000 7 hdd 15.34180 osd.7 up 1.00000 1.00000 8 hdd 15.34180 osd.8 up 1.00000 1.00000 9 hdd 15.34180 osd.9 up 1.00000 1.00000 10 hdd 15.34180 osd.10 up 1.00000 1.00000 11 hdd 15.34180 osd.11 up 1.00000 1.00000 14 hdd 15.34180 osd.14 up 1.00000 1.00000 15 hdd 15.34180 osd.15 up 1.00000 1.00000 18 hdd 15.34180 osd.18 up 1.00000 1.00000 19 hdd 15.34180 osd.19 up 1.00000 1.00000 20 hdd 15.34180 osd.20 up 1.00000 1.00000 23 hdd 15.34180 osd.23 up 1.00000 1.00000 24 hdd 15.34180 osd.24 up 1.00000 1.00000 26 hdd 15.34180 osd.26 up 1.00000 1.00000 28 hdd 15.34180 osd.28 up 1.00000 1.00000 29 hdd 15.34180 osd.29 up 1.00000 1.00000 32 hdd 15.34180 osd.32 up 1.00000 1.00000 33 hdd 15.34180 osd.33 up 1.00000 1.00000 35 hdd 15.34180 osd.35 up 1.00000 1.00000 37 hdd 15.34180 osd.37 up 1.00000 1.00000 38 hdd 15.34180 osd.38 up 1.00000 1.00000 41 hdd 15.34180 osd.41 up 1.00000 1.00000 42 hdd 15.34180 osd.42 up 1.00000 1.00000 45 hdd 15.34180 osd.45 up 1.00000 1.00000 46 hdd 15.34180 osd.46 up 1.00000 1.00000 47 hdd 15.34180 osd.47 up 1.00000 1.00000 50 hdd
15.34180 osd.50 up 1.00000 1.00000 51 hdd 15.34180 osd.51 up 1.00000 1.00000 54 hdd 15.34180 osd.54 up 1.00000 1.00000 55 hdd 15.34180 osd.55 up 1.00000 1.00000 58 hdd 15.34180 osd.58 up 1.00000 1.00000 59 hdd 15.34180 osd.59 up 1.00000 1.00000 61 hdd 15.34180 osd.61 up 1.00000 1.00000 63 hdd 15.34180 osd.63 up 1.00000 1.00000 65 hdd 15.34180 osd.65 up 1.00000 1.00000 67 hdd 15.34180 osd.67 up 1.00000 1.00000 68 hdd 15.34180 osd.68 up 1.00000 1.00000 71 hdd 15.34180 osd.71 up 1.00000 1.00000 72 hdd 15.34180 osd.72 up 1.00000 1.00000 75 hdd 15.34180 osd.75 up 1.00000 1.00000 76 hdd 15.34180 osd.76 up 1.00000 1.00000 79 hdd 15.34180 osd.79 up 1.00000 1.00000 80 hdd 15.34180 osd.80 up 1.00000 1.00000 83 hdd 15.34180 osd.83 up 1.00000 1.00000 84 hdd 15.34180 osd.84 up 1.00000 1.00000 87 hdd 15.34180 osd.87 up 1.00000 1.00000 88 hdd 15.34180 osd.88 up 1.00000 1.00000 -5 813.11523 host rgw-2 13 hdd 15.34180 osd.13 up 1.00000 1.00000 17 hdd 15.34180 osd.17 up 1.00000 1.00000 22 hdd 15.34180 osd.22 up 1.00000 1.00000 27 hdd 15.34180 osd.27 up 1.00000 1.00000 31 hdd 15.34180 osd.31 up 1.00000 1.00000 36 hdd 15.34180 osd.36 up 1.00000 1.00000 40 hdd 15.34180 osd.40 up 1.00000 1.00000 44 hdd 15.34180 osd.44 up 1.00000 1.00000 49 hdd 15.34180 osd.49 up 1.00000 1.00000 53 hdd 15.34180 osd.53 up 1.00000 1.00000 57 hdd 15.34180 osd.57 up 1.00000 1.00000 62 hdd 15.34180 osd.62 up 1.00000 1.00000 66 hdd 15.34180 osd.66 up 1.00000 1.00000 70 hdd 15.34180 osd.70 up 1.00000 1.00000 74 hdd 15.34180 osd.74 up 1.00000 1.00000 78 hdd 15.34180 osd.78 up 1.00000 1.00000 82 hdd 15.34180 osd.82 up 1.00000 1.00000 86 hdd 15.34180 osd.86 up 1.00000 1.00000 90 hdd 15.34180 osd.90 up 1.00000 1.00000 92 hdd 15.34180 osd.92 up 1.00000 1.00000 94 hdd 15.34180 osd.94 up 1.00000 1.00000 96 hdd 15.34180 osd.96 up 1.00000 1.00000 98 hdd 15.34180 osd.98 up 1.00000 1.00000 100 hdd 15.34180 osd.100 up 1.00000 1.00000 102 hdd 15.34180 osd.102 up 1.00000 1.00000 104 hdd 15.34180 osd.104 up 1.00000 1.00000 106 hdd 15.34180 osd.106 up 1.00000 1.00000 108 hdd 15.34180 osd.108 up 1.00000 1.00000 110 hdd 15.34180 osd.110 up 1.00000 1.00000 112 hdd 15.34180 osd.112 up 1.00000 1.00000 114 hdd 15.34180 osd.114 up 1.00000 1.00000 116 hdd 15.34180 osd.116 up 1.00000 1.00000 118 hdd 15.34180 osd.118 up 1.00000 1.00000 120 hdd 15.34180 osd.120 up 1.00000 1.00000 122 hdd 15.34180 osd.122 up 1.00000 1.00000 124 hdd 15.34180 osd.124 up 1.00000 1.00000 126 hdd 15.34180 osd.126 up 1.00000 1.00000 128 hdd 15.34180 osd.128 up 1.00000 1.00000 130 hdd 15.34180 osd.130 up 1.00000 1.00000 132 hdd 15.34180 osd.132 up 1.00000 1.00000 134 hdd 15.34180 osd.134 up 1.00000 1.00000 136 hdd 15.34180 osd.136 up 1.00000 1.00000 138 hdd 15.34180 osd.138 up 1.00000 1.00000 140 hdd 15.34180 osd.140 up 1.00000 1.00000 142 hdd 15.34180 osd.142 up 1.00000 1.00000 144 hdd 15.34180 osd.144 up 1.00000 1.00000 146 hdd 15.34180 osd.146 up 1.00000 1.00000 148 hdd 15.34180 osd.148 up 1.00000 1.00000 150 hdd 15.34180 osd.150 up 1.00000 1.00000 152 hdd 15.34180 osd.152 up 1.00000 1.00000 154 hdd 15.34180 osd.154 up 1.00000 1.00000 156 hdd 15.34180 osd.156 up 1.00000 1.00000 158 hdd 15.34180 osd.158 up 1.00000 1.00000 -3 813.11523 host rgw-3 43 hdd 15.34180 osd.43 up 1.00000 1.00000 52 hdd 15.34180 osd.52 up 1.00000 1.00000 107 hdd 15.34180 osd.107 up 1.00000 1.00000 109 hdd 15.34180 osd.109 up 1.00000 1.00000 111 hdd 15.34180 osd.111 up 1.00000 1.00000 113 hdd 15.34180 osd.113 up 1.00000 1.00000 115 hdd 15.34180 osd.115 up 1.00000 1.00000 117 hdd 15.34180 osd.117 up 1.00000 
1.00000 119 hdd 15.34180 osd.119 up 1.00000 1.00000 121 hdd 15.34180 osd.121 up 1.00000 1.00000 123 hdd 15.34180 osd.123 up 1.00000 1.00000 125 hdd 15.34180 osd.125 up 1.00000 1.00000 127 hdd 15.34180 osd.127 up 1.00000 1.00000 129 hdd 15.34180 osd.129 up 1.00000 1.00000 131 hdd 15.34180 osd.131 up 1.00000 1.00000 133 hdd 15.34180 osd.133 up 1.00000 1.00000 135 hdd 15.34180 osd.135 up 1.00000 1.00000 137 hdd 15.34180 osd.137 up 1.00000 1.00000 139 hdd 15.34180 osd.139 up 1.00000 1.00000 141 hdd 15.34180 osd.141 up 1.00000 1.00000 143 hdd 15.34180 osd.143 up 1.00000 1.00000 145 hdd 15.34180 osd.145 up 1.00000 1.00000 147 hdd 15.34180 osd.147 up 1.00000 1.00000 149 hdd 15.34180 osd.149 up 1.00000 1.00000 151 hdd 15.34180 osd.151 up 1.00000 1.00000 153 hdd 15.34180 osd.153 up 1.00000 1.00000 155 hdd 15.34180 osd.155 up 1.00000 1.00000 157 hdd 15.34180 osd.157 up 1.00000 1.00000 187 hdd 15.34180 osd.187 up 1.00000 1.00000 188 hdd 15.34180 osd.188 up 1.00000 1.00000 190 hdd 15.34180 osd.190 up 1.00000 1.00000 191 hdd 15.34180 osd.191 up 1.00000 1.00000 192 hdd 15.34180 osd.192 up 1.00000 1.00000 193 hdd 15.34180 osd.193 up 1.00000 1.00000 194 hdd 15.34180 osd.194 up 1.00000 1.00000 196 hdd 15.34180 osd.196 up 1.00000 1.00000 197 hdd 15.34180 osd.197 up 1.00000 1.00000 198 hdd 15.34180 osd.198 up 1.00000 1.00000 199 hdd 15.34180 osd.199 up 1.00000 1.00000 200 hdd 15.34180 osd.200 up 1.00000 1.00000 202 hdd 15.34180 osd.202 up 1.00000 1.00000 203 hdd 15.34180 osd.203 up 1.00000 1.00000 204 hdd 15.34180 osd.204 up 1.00000 1.00000 205 hdd 15.34180 osd.205 up 1.00000 1.00000 206 hdd 15.34180 osd.206 up 1.00000 1.00000 208 hdd 15.34180 osd.208 up 1.00000 1.00000 209 hdd 15.34180 osd.209 up 1.00000 1.00000 210 hdd 15.34180 osd.210 up 1.00000 1.00000 211 hdd 15.34180 osd.211 up 1.00000 1.00000 212 hdd 15.34180 osd.212 up 1.00000 1.00000 214 hdd 15.34180 osd.214 up 1.00000 1.00000 215 hdd 15.34180 osd.215 up 1.00000 1.00000 216 hdd 15.34180 osd.216 up 1.00000 1.00000 -9 813.11523 host rgw-4 265 hdd 15.34180 osd.265 up 1.00000 1.00000 268 hdd 15.34180 osd.268 up 1.00000 1.00000 271 hdd 15.34180 osd.271 up 1.00000 1.00000 273 hdd 15.34180 osd.273 up 1.00000 1.00000 276 hdd 15.34180 osd.276 up 1.00000 1.00000 279 hdd 15.34180 osd.279 up 1.00000 1.00000 281 hdd 15.34180 osd.281 up 1.00000 1.00000 284 hdd 15.34180 osd.284 up 1.00000 1.00000 287 hdd 15.34180 osd.287 up 1.00000 1.00000 290 hdd 15.34180 osd.290 up 1.00000 1.00000 293 hdd 15.34180 osd.293 up 1.00000 1.00000 296 hdd 15.34180 osd.296 up 1.00000 1.00000 299 hdd 15.34180 osd.299 up 1.00000 1.00000 301 hdd 15.34180 osd.301 up 1.00000 1.00000 304 hdd 15.34180 osd.304 up 1.00000 1.00000 307 hdd 15.34180 osd.307 up 1.00000 1.00000 310 hdd 15.34180 osd.310 up 1.00000 1.00000 313 hdd 15.34180 osd.313 up 1.00000 1.00000 316 hdd 15.34180 osd.316 up 1.00000 1.00000 319 hdd 15.34180 osd.319 up 1.00000 1.00000 321 hdd 15.34180 osd.321 up 1.00000 1.00000 324 hdd 15.34180 osd.324 up 1.00000 1.00000 327 hdd 15.34180 osd.327 up 1.00000 1.00000 329 hdd 15.34180 osd.329 up 1.00000 1.00000 332 hdd 15.34180 osd.332 up 1.00000 1.00000 335 hdd 15.34180 osd.335 up 1.00000 1.00000 338 hdd 15.34180 osd.338 up 1.00000 1.00000 341 hdd 15.34180 osd.341 up 1.00000 1.00000 344 hdd 15.34180 osd.344 up 1.00000 1.00000 347 hdd 15.34180 osd.347 up 1.00000 1.00000 349 hdd 15.34180 osd.349 up 1.00000 1.00000 352 hdd 15.34180 osd.352 up 1.00000 1.00000 354 hdd 15.34180 osd.354 up 1.00000 1.00000 357 hdd 15.34180 osd.357 up 1.00000 1.00000 360 hdd 15.34180 osd.360 up 1.00000 1.00000 
363 hdd 15.34180 osd.363 up 1.00000 1.00000 366 hdd 15.34180 osd.366 up 1.00000 1.00000 369 hdd 15.34180 osd.369 up 1.00000 1.00000 372 hdd 15.34180 osd.372 up 1.00000 1.00000 375 hdd 15.34180 osd.375 up 1.00000 1.00000 378 hdd 15.34180 osd.378 up 1.00000 1.00000 380 hdd 15.34180 osd.380 up 1.00000 1.00000 382 hdd 15.34180 osd.382 up 1.00000 1.00000 385 hdd 15.34180 osd.385 up 1.00000 1.00000 388 hdd 15.34180 osd.388 up 1.00000 1.00000 391 hdd 15.34180 osd.391 up 1.00000 1.00000 394 hdd 15.34180 osd.394 up 1.00000 1.00000 397 hdd 15.34180 osd.397 up 1.00000 1.00000 399 hdd 15.34180 osd.399 down 1.00000 1.00000 402 hdd 15.34180 osd.402 up 1.00000 1.00000 404 hdd 15.34180 osd.404 up 1.00000 1.00000 407 hdd 15.34180 osd.407 up 1.00000 1.00000 410 hdd 15.34180 osd.410 up 1.00000 1.00000 -13 813.14203 host rgw-5 189 hdd 15.34279 osd.189 up 1.00000 1.00000 195 hdd 15.34279 osd.195 up 1.00000 1.00000 201 hdd 15.34180 osd.201 up 1.00000 1.00000 207 hdd 15.34180 osd.207 up 1.00000 1.00000 213 hdd 15.34180 osd.213 up 1.00000 1.00000 217 hdd 15.34180 osd.217 up 1.00000 1.00000 218 hdd 15.34279 osd.218 up 1.00000 1.00000 219 hdd 15.34180 osd.219 up 1.00000 1.00000 220 hdd 15.34279 osd.220 up 1.00000 1.00000 221 hdd 15.34180 osd.221 up 1.00000 1.00000 222 hdd 15.34279 osd.222 up 1.00000 1.00000 223 hdd 15.34180 osd.223 up 1.00000 1.00000 224 hdd 15.34279 osd.224 up 1.00000 1.00000 225 hdd 15.34180 osd.225 up 1.00000 1.00000 226 hdd 15.34279 osd.226 up 1.00000 1.00000 227 hdd 15.34180 osd.227 up 1.00000 1.00000 228 hdd 15.34279 osd.228 up 1.00000 1.00000 229 hdd 15.34180 osd.229 up 1.00000 1.00000 230 hdd 15.34279 osd.230 up 1.00000 1.00000 231 hdd 15.34180 osd.231 up 1.00000 1.00000 232 hdd 15.34279 osd.232 up 1.00000 1.00000 233 hdd 15.34180 osd.233 up 1.00000 1.00000 234 hdd 15.34180 osd.234 up 1.00000 1.00000 235 hdd 15.34279 osd.235 up 1.00000 1.00000 236 hdd 15.34180 osd.236 up 1.00000 1.00000 237 hdd 15.34279 osd.237 up 1.00000 1.00000 238 hdd 15.34180 osd.238 up 1.00000 1.00000 239 hdd 15.34279 osd.239 up 1.00000 1.00000 240 hdd 15.34180 osd.240 up 1.00000 1.00000 241 hdd 15.34279 osd.241 up 1.00000 1.00000 242 hdd 15.34180 osd.242 up 1.00000 1.00000 243 hdd 15.34279 osd.243 up 1.00000 1.00000 244 hdd 15.34180 osd.244 up 1.00000 1.00000 245 hdd 15.34279 osd.245 up 1.00000 1.00000 246 hdd 15.34180 osd.246 up 1.00000 1.00000 247 hdd 15.34279 osd.247 up 1.00000 1.00000 248 hdd 15.34180 osd.248 up 1.00000 1.00000 249 hdd 15.34279 osd.249 up 1.00000 1.00000 250 hdd 15.34180 osd.250 up 1.00000 1.00000 251 hdd 15.34279 osd.251 up 1.00000 1.00000 252 hdd 15.34180 osd.252 up 1.00000 1.00000 253 hdd 15.34279 osd.253 up 1.00000 1.00000 254 hdd 15.34180 osd.254 up 1.00000 1.00000 255 hdd 15.34279 osd.255 up 1.00000 1.00000 256 hdd 15.34180 osd.256 up 1.00000 1.00000 257 hdd 15.34279 osd.257 up 1.00000 1.00000 258 hdd 15.34180 osd.258 up 1.00000 1.00000 259 hdd 15.34279 osd.259 up 1.00000 1.00000 260 hdd 15.34180 osd.260 up 1.00000 1.00000 261 hdd 15.34279 osd.261 up 1.00000 1.00000 262 hdd 15.34279 osd.262 up 1.00000 1.00000 263 hdd 15.34279 osd.263 up 1.00000 1.00000 264 hdd 15.34279 osd.264 up 1.00000 1.00000 -11 813.11523 host rgw-6 12 hdd 15.34180 osd.12 up 1.00000 1.00000 16 hdd 15.34180 osd.16 up 1.00000 1.00000 21 hdd 15.34180 osd.21 up 1.00000 1.00000 25 hdd 15.34180 osd.25 up 1.00000 1.00000 30 hdd 15.34180 osd.30 up 1.00000 1.00000 34 hdd 15.34180 osd.34 up 1.00000 1.00000 39 hdd 15.34180 osd.39 up 1.00000 1.00000 48 hdd 15.34180 osd.48 up 1.00000 1.00000 56 hdd 15.34180 osd.56 up 1.00000 1.00000 
60 hdd 15.34180 osd.60 up 1.00000 1.00000 64 hdd 15.34180 osd.64 up 1.00000 1.00000 69 hdd 15.34180 osd.69 up 1.00000 1.00000 73 hdd 15.34180 osd.73 up 1.00000 1.00000 77 hdd 15.34180 osd.77 up 1.00000 1.00000 81 hdd 15.34180 osd.81 up 1.00000 1.00000 85 hdd 15.34180 osd.85 up 1.00000 1.00000 89 hdd 15.34180 osd.89 up 1.00000 1.00000 91 hdd 15.34180 osd.91 up 1.00000 1.00000 93 hdd 15.34180 osd.93 up 1.00000 1.00000 95 hdd 15.34180 osd.95 up 1.00000 1.00000 97 hdd 15.34180 osd.97 up 1.00000 1.00000 99 hdd 15.34180 osd.99 up 1.00000 1.00000 101 hdd 15.34180 osd.101 up 1.00000 1.00000 103 hdd 15.34180 osd.103 up 1.00000 1.00000 105 hdd 15.34180 osd.105 up 1.00000 1.00000 159 hdd 15.34180 osd.159 up 1.00000 1.00000 160 hdd 15.34180 osd.160 up 1.00000 1.00000 161 hdd 15.34180 osd.161 up 1.00000 1.00000 162 hdd 15.34180 osd.162 up 1.00000 1.00000 163 hdd 15.34180 osd.163 up 1.00000 1.00000 164 hdd 15.34180 osd.164 up 1.00000 1.00000 165 hdd 15.34180 osd.165 up 1.00000 1.00000 166 hdd 15.34180 osd.166 up 1.00000 1.00000 167 hdd 15.34180 osd.167 up 1.00000 1.00000 168 hdd 15.34180 osd.168 up 1.00000 1.00000 169 hdd 15.34180 osd.169 up 1.00000 1.00000 170 hdd 15.34180 osd.170 up 1.00000 1.00000 171 hdd 15.34180 osd.171 up 1.00000 1.00000 172 hdd 15.34180 osd.172 up 1.00000 1.00000 173 hdd 15.34180 osd.173 up 1.00000 1.00000 174 hdd 15.34180 osd.174 up 1.00000 1.00000 175 hdd 15.34180 osd.175 up 1.00000 1.00000 176 hdd 15.34180 osd.176 up 1.00000 1.00000 177 hdd 15.34180 osd.177 up 1.00000 1.00000 178 hdd 15.34180 osd.178 up 1.00000 1.00000 179 hdd 15.34180 osd.179 up 1.00000 1.00000 180 hdd 15.34180 osd.180 up 1.00000 1.00000 181 hdd 15.34180 osd.181 up 1.00000 1.00000 182 hdd 15.34180 osd.182 up 1.00000 1.00000 183 hdd 15.34180 osd.183 up 1.00000 1.00000 184 hdd 15.34180 osd.184 up 1.00000 1.00000 185 hdd 15.34180 osd.185 up 1.00000 1.00000 186 hdd 15.34180 osd.186 up 1.00000 1.00000

[root@rgw-4 /]# ceph osd dump | grep -i pool
pool 10 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 12075 lfor 0/0/12064 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 12075 lfor 0/0/12068 flags hashpspool stripe_width 0 application rgw
pool 12 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 12075 lfor 0/0/12066 flags hashpspool stripe_width 0 application rgw
pool 13 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 12075 lfor 0/0/12068 flags hashpspool stripe_width 0 application rgw
pool 14 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 4096 pgp_num 4096 autoscale_mode warn last_change 12172 lfor 0/0/12085 flags hashpspool stripe_width 0 application rgw
pool 15 'default.rgw.data.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 12173 flags hashpspool stripe_width 0 application rgw
pool 16 'default.rgw.buckets.data' erasure size 6 min_size 5 crush_rule 1 object_hash rjenkins pg_num 16384 pgp_num 16384 autoscale_mode warn last_change 12174 lfor 0/0/12162 flags hashpspool stripe_width 16384 application rgw
[root@rgw-4 /]#
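An aside on the pool layout above (a 16384-PG size-6 EC data pool plus a 4096-PG index pool across 318 OSDs, with mon_max_pg_per_osd raised to 1000 in ceph.conf): a quick, hedged sketch for sanity-checking the per-OSD PG load. These are standard commands; nothing here is taken from the original report.

# Per-OSD PG counts (PGS column) and per-host totals
ceph osd df tree

# What the mons think the PG-per-OSD limit is (set via ceph.conf here, so ask the daemon; run on the rgw-1 host)
ceph daemon mon.rgw-1 config get mon_max_pg_per_osd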
Created attachment 1686217 [details]
COSBench workload file

Attaching the COSBench workload file.
Clarification on the ceph -s output; here is the pattern:

- Multiple OSDs (5-10) fail/flap on one node.
- All 53 OSDs flap on one node (but come back up within, say, 30 seconds).

I am also seeing slow ops:

[root@rgw-4 /]# ceph -s
  cluster:
    id:     ebe0aa4b-4fb5-4c68-84ab-cbf1118937a2
    health: HEALTH_WARN
            1 osds down
            Reduced data availability: 2 pgs inactive, 19 pgs incomplete
            Degraded data redundancy: 491400/146999901 objects degraded (0.334%), 339 pgs degraded
            24 slow ops, oldest one blocked for 4335 sec, daemons [mon,rgw-1,mon,rgw-2,mon,rgw-3] have slow ops.
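Not in the original comment, but for reference, a rough sketch of how the slow ops reported above can be drilled into. These are standard admin-socket commands; the daemon names (mon.rgw-1, osd.399) are just examples from this cluster, and in a containerized deployment they need to be run inside (or exec'd into) the daemon's container.

# On a mon host: ops currently stuck on the monitor
ceph daemon mon.rgw-1 ops

# On an OSD host: in-flight and recently completed ops for one OSD
ceph daemon osd.399 dump_ops_in_flight
ceph daemon osd.399 dump_historic_ops

# Cluster-wide summary of which daemons report slow ops
ceph health detail | grep -i slow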
Logs from one of the OSDs:

2020-05-07 08:51:17.141 7fd8360af700 1 osd.399 pg_epoch: 15005 pg[16.3368s5( v 14993'1582 (14977'1572,14993'1582] lb MIN (bitwise) local-lis/les=14995/14996 n=0 ec=12150/12099 lis/c 15002/13313 les/c/f 15003/13317/0 15004/15005/14022) [35,56,203,74,254,399]/[35,56,203,74,254,2147483647]p35(0) r=-1 lpr=15005 pi=[13313,15005)/4 crt=14993'1582 lcod 0'0 remapped NOTIFY mbc={}] state<Start>: transitioning to Stray
2020-05-07 08:51:17.141 7fd8360af700 1 osd.399 pg_epoch: 15005 pg[16.35f7s5( v 14998'1548 (14929'1538,14998'1548] lb MIN (bitwise) local-lis/les=15000/15001 n=0 ec=12152/12099 lis/c 15002/13292 les/c/f 15003/13295/0 15004/15005/12251) [125,103,116,47,244,399]/[125,103,116,47,244,2147483647]p125(0) r=-1 lpr=15005 pi=[13292,15005)/3 crt=14998'1548 lcod 0'0 remapped NOTIFY mbc={}] start_peering_interval up [125,103,116,47,244,399] -> [125,103,116,47,244,399], acting [125,103,116,47,244,399] -> [125,103,116,47,244,2147483647], acting_primary 125(0) -> 125, up_primary 125(0) -> 125, role 5 -> -1, features acting 4611087854031667199 upacting 4611087854031667199
2020-05-07 08:51:17.141 7fd8360af700 1 osd.399 pg_epoch: 15005 pg[16.35f7s5( v 14998'1548 (14929'1538,14998'1548] lb MIN (bitwise) local-lis/les=15000/15001 n=0 ec=12152/12099 lis/c 15002/13292 les/c/f 15003/13295/0 15004/15005/12251) [125,103,116,47,244,399]/[125,103,116,47,244,2147483647]p125(0) r=-1 lpr=15005 pi=[13292,15005)/3 crt=14998'1548 lcod 0'0 remapped NOTIFY mbc={}] state<Start>: transitioning to Stray
2020-05-07 08:51:18.152 7fd8340ab700 0 log_channel(cluster) log [INF] : 16.3452s0 continuing backfill to osd.249(2) from (14980'1610,15001'1627] MIN to 15001'1627
2020-05-07 08:51:18.152 7fd8350ad700 0 log_channel(cluster) log [INF] : 16.1b78s0 continuing backfill to osd.65(5) from (14969'1510,15001'1524] MIN to 15001'1524
2020-05-07 08:51:18.155 7fd8340ab700 0 log_channel(cluster) log [INF] : 16.267es0 continuing backfill to osd.65(3) from (14969'1539,15001'1553] MIN to 15001'1553
/builddir/build/BUILD/ceph-14.2.4/src/osd/PGLog.h: In function 'void PGLog::IndexedLog::add(const pg_log_entry_t&, bool)' thread 7fd8360af700 time 2020-05-07 08:51:18.171087
/builddir/build/BUILD/ceph-14.2.4/src/osd/PGLog.h: 511: FAILED ceph_assert(head.version == 0 || e.version.version > head.version)
 ceph version 14.2.4-125.el8cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x561590c1234c]
 2: (()+0x51f566) [0x561590c12566]
 3: (bool PGLog::append_log_entries_update_missing<pg_missing_set<true> >(hobject_t const&, bool, std::__cxx11::list<pg_log_entry_t, mempool::pool_allocator<(mempool::pool_index_t)14, pg_log_entry_t> > const&, bool, PGLog::IndexedLog*, pg_missing_set<true>&, PGLog::LogEntryHandler*, DoutPrefixProvider const*)+0xaed) [0x561590e4442d]
 4: (PGLog::merge_log(pg_info_t&, pg_log_t&, pg_shard_t, pg_info_t&, PGLog::LogEntryHandler*, bool&, bool&)+0xf6b) [0x561590e37a6b]
 5: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, pg_shard_t)+0x68) [0x561590d8e278]
 6: (PG::RecoveryState::Stray::react(MLogRec const&)+0x23b) [0x561590dce9ab]
 7: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xa5) [0x561590e2aa35]
 8: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x5a) [0x561590df8bda]
 9: (PG::do_peering_event(std::shared_ptr<PGPeeringEvent>, PG::RecoveryCtx*)+0x2c2) [0x561590de98e2]
 10: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x2bc) [0x561590d2a36c]
 11: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x55) [0x561590fa84b5]
 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x1366) [0x561590d26c36]
 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x561591306134]
 14: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x561591308cf4]
 15: (()+0x82de) [0x7fd8575ac2de]
 16: (clone()+0x43) [0x7fd856356133]
*** Caught signal (Aborted) **
 in thread 7fd8360af700 thread_name:tp_osd_tp
2020-05-07 08:51:18.174 7fd8360af700 -1 /builddir/build/BUILD/ceph-14.2.4/src/osd/PGLog.h: In function 'void PGLog::IndexedLog::add(const pg_log_entry_t&, bool)' thread 7fd8360af700 time 2020-05-07 08:51:18.171087
/builddir/build/BUILD/ceph-14.2.4/src/osd/PGLog.h: 511: FAILED ceph_assert(head.version == 0 || e.version.version > head.version)

Suspected issue:
https://tracker.ceph.com/issues/44532
https://github.com/ceph/ceph/pull/33910
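Since the assert fires in PGLog.h and the ceph.conf in the description pins the pg_log options to very small values (osd_max_pg_log_entries, osd_min_pg_log_entries, osd_pg_log_dups_tracked and osd_pg_log_trim_min all set to 10), here is a quick sketch of how to confirm what the running OSDs actually have in effect; osd.399 is just the daemon from the log above.

# Via the admin socket on the OSD host / inside the OSD container
ceph daemon osd.399 config show | grep pg_log

# Or via the mgr's view of that daemon's running config
ceph config show osd.399 | grep pg_log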
Some more commentary per my discussion with Vikhyat:

The cluster was running fine; I ran several COSBench tests successfully. Then, all of a sudden, when I ran another test, I started to see this flapping issue. I stopped the workload and restarted the OSDs that had been down for a long time, and the cluster became healthy again. After that I re-applied the load from COSBench and again started to see flapping issues with OSDs.

When I run watch "ceph -s", I can see a random number of OSDs going down (1, 2, 10, or even 53 OSDs), and within a few seconds they come back up ... and this is a repeated pattern.
I received wonderful support from Vikhyat and Neha on this case, many thanks guys, you ROCK!!

Happy to report that the changes mentioned by Neha, i.e. switching back to the default values for the pg_log* tunables, have fixed the issue. I have not seen a similar assert since making those changes, so this is not a bug.

Going forward, my plan is to ingest 10 billion RADOS objects via 12 x RGW onto this cluster. If I encounter this again, I will probably re-open this. Until then, all good :)
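For completeness, a sketch of the change described above, assuming the overrides live in the [global] section of ceph.conf as shown in the description. Exact defaults differ between releases, so the simplest form is to drop the overrides and let the OSDs pick up the shipped defaults on restart; the commands are standard and the OSD id is only an example.

# Remove these four overrides from [global] in /etc/ceph/ceph.conf
# (or from the ceph-ansible ceph_conf_overrides), then restart the OSDs:
#   osd_max_pg_log_entries = 10
#   osd_min_pg_log_entries = 10
#   osd_pg_log_dups_tracked = 10
#   osd_pg_log_trim_min = 10

systemctl restart ceph-osd@399      # repeat per OSD id, rolling through the hosts

# Verify the value now in effect
ceph daemon osd.399 config get osd_max_pg_log_entries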
(In reply to karan singh from comment #14)
> I received wonderful support from Vikhyat and Neha in this case, many thanks
> guys, you ROCK !!
>
> Happy to report that, changes mentioned by Neha i.e switching to default
> values for pg_log* tuneable have fixed the issue. I have not seen a similar
> assert since doing the mentioned changes, So this is not a bug.

Well, if we allow our users to change from default settings, it is a bug.

>
> Going forward, my plan is to ingest 10 Billion RADOS objects via 12xRGW on
> to this cluster. So if I encounter this again, I probably will re-open this.
> Until them all good :)

Based on the above, closing. Please re-open if we need to fix for 4.x.