Bug 2300310 - [cee/sd] mclock_scheduler slow backfill [NEEDINFO]
Summary: [cee/sd] mclock_scheduler slow backfill
Keywords:
Status: CLOSED DUPLICATE of bug 2299482
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 6.2
Assignee: Sridhar Seshasayee
QA Contact: skanta
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-07-29 08:43 UTC by Tomas Petr
Modified: 2025-04-01 04:24 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-03-28 07:26:15 UTC
Embargoed:
sseshasa: needinfo? (tpetr)
sseshasa: needinfo?
sseshasa: needinfo? (tpetr)
kelwhite: needinfo? (tpetr)


Links
Ceph Project Bug Tracker 68224 (last updated 2024-11-07 05:28:43 UTC)
Red Hat Issue Tracker RHCEPH-9425 (last updated 2024-07-29 08:45:57 UTC)
Red Hat Knowledge Base (Solution) 7092973 (last updated 2024-10-25 16:52:45 UTC)

Description Tomas Petr 2024-07-29 08:43:00 UTC
Description of problem:
There is an issue with slow backfill when the mclock_scheduler is used. Setting osd_mclock_override_recovery_settings=true together with osd_max_backfills=8 does not seem to have any effect on the number of PGs in the backfilling state:
---
data:
    volumes: 1/1 healthy
    pools:   42 pools, 10488 pgs
    objects: 710.17M objects, 2.1 PiB
    usage:   5.4 PiB used, 16 PiB / 21 PiB avail
    pgs:     21327554/4573948158 objects misplaced (0.466%)
             10341 active+clean
             68    active+remapped+backfill_wait
             61    active+clean+scrubbing+deep
             14    active+clean+scrubbing
             4     active+remapped+backfilling
 
  io:
    client:   551 MiB/s rd, 23 MiB/s wr, 501 op/s rd, 611 op/s wr
    recovery: 21 MiB/s, 6 objects/s
---
ceph_config_dump:osd  advanced  osd_max_backfills                      8
ceph_config_dump:osd  advanced  osd_mclock_override_recovery_settings  true
---
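
For reference, the override and backfill values shown in the config dump above are normally applied cluster-wide with commands along the following lines. This is only a sketch of what was presumably run; the exact commands used in this environment are not recorded in the report:
---
# Allow osd_max_backfills to take effect while the mclock scheduler is active
ceph config set osd osd_mclock_override_recovery_settings true
ceph config set osd osd_max_backfills 8

# Alternatively, the built-in mclock profile can be switched to favour
# recovery/backfill over client I/O instead of overriding individual options
ceph config set osd osd_mclock_profile high_recovery_ops
---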

The data being moved is in a CephFS data pool with an EC profile of 8+3. The reason for the data movement is migrating data from the old HDD OSDs to new HDD_ECC OSDs (both with block.db on SSD).
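
One way to check whether the override actually reached the running OSD daemons, and not just the monitor config database, is to query an individual OSD directly (osd.0 below is only a placeholder for any affected OSD):
---
# Value stored in the central config database
ceph config get osd osd_max_backfills

# Values currently in effect inside a running OSD daemon
ceph tell osd.0 config get osd_max_backfills
ceph tell osd.0 config get osd_mclock_override_recovery_settings
ceph tell osd.0 config get osd_mclock_profile
---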

Version-Release number of selected component (if applicable):
RHCS 6.1z6
ceph version 17.2.6-216.el9cp

How reproducible:
Always in this environment

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Storage PM bot 2024-07-29 08:43:13 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 10 kelwhite 2024-08-21 22:10:35 UTC
Putting a needinfo on Tomas for c#9.

