Bug 2231437 - disk plug-in scalability issue
Summary: disk plug-in scalability issue
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: tuned
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
: ---
Assignee: Jaroslav Škarvada
QA Contact: Robin Hack
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-11 14:36 UTC by Jiří Mencák
Modified: 2023-08-11 14:49 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-165648 | 0 | None | None | None | 2023-08-11 14:38:00 UTC

Description Jiří Mencák 2023-08-11 14:36:41 UTC
Description of problem:
TuneD has a scalability issue when using the [disk] plug-in, which makes profile application take a very long time.  With 2000 block devices, it can take 1-2 minutes.  This is a problem especially in OpenShift, where the Node Tuning Operator (NTO) waits up to 1 minute for a profile to be applied, then kills TuneD and starts an exponential backoff, giving TuneD more and more time to apply the profile.  Customers can then see TuneD restarts, with profile application taking up to 5 minutes.

One of the problems is that TuneD unnecessarily checks for APM using the hdparm command even if APM is not used by any profile.  Executing hdparm on each block device takes quite a while, and for hundreds (let alone thousands) of block devices this simply takes too long.
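
To get a rough feel for the per-device cost described above, a sketch like the following runs one probe command per block device and prints how many devices it touched.  The helper name probe_all is hypothetical, and the hdparm flag in the example (-B, which queries the APM setting) is an assumption for illustration; this report does not state which hdparm invocation TuneD actually uses.

```shell
#!/bin/bash
# Run a probe command once for every entry in a sysfs-style block directory
# and print the number of devices probed.
# Usage: probe_all <sysfs-block-dir> <command> [args...]
probe_all() {
  local sysdir=$1; shift
  local count=0 entry
  for entry in "$sysdir"/*; do
    [ -e "$entry" ] || continue                       # directory may be empty
    # The device node name is the last path component of the sysfs entry.
    "$@" "/dev/${entry##*/}" >/dev/null 2>&1 || true  # ignore probe failures
    count=$((count + 1))
  done
  echo "$count"
}

# Example (as root; the flag is an assumption, see above):
#   time probe_all /sys/block hdparm -B
```

Wrapping the call in "time" shows the total wall-clock cost of spawning one process per device, which is the part that grows linearly with the number of block devices.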

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
1. Use a system with 2k block devices.  If you don't have one handy, use the following script to create them:

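# Create 2000 dummy device-mapper block devices (run as root).
for d in $(seq 1 2000)
do
  dmsetup create "dummy$d" --table '0 4092 zero'
  # To clean up afterwards, run this instead: dmsetup remove "dummy$d"
done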

2. Apply the throughput-performance profile, which uses the [disk] plug-in.
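
The steps above could be checked against the NTO's 1-minute timeout with a small wrapper such as the one below.  The helper name within_limit is hypothetical, and the sketch assumes that "tuned-adm profile" does not return until the daemon has finished applying the profile.  Timing uses whole seconds, so it is too coarse to verify a 1-second target precisely.

```shell
#!/bin/bash
# Run a command and fail if it takes longer than a limit (whole seconds).
# Usage: within_limit <limit-seconds> <command> [args...]
within_limit() {
  local limit=$1; shift
  local start end elapsed
  start=$(date +%s)
  "$@" >/dev/null 2>&1 || true       # only the duration matters here
  end=$(date +%s)
  elapsed=$((end - start))
  echo "elapsed: ${elapsed}s (limit: ${limit}s)"
  [ "$elapsed" -le "$limit" ]        # exit status reports pass/fail
}

# Example (as root):
#   within_limit 60 tuned-adm profile throughput-performance
```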

Actual results:
TuneD takes 1-2 minutes to apply the profile.

Expected results:
TuneD takes at most 1 second to apply the profile.

Additional info:
Problematic code: https://github.com/redhat-performance/tuned/blob/f4c976f2f5b0ddd541922ce54ed3ae7f4dbc0f84/tuned/plugins/plugin_disk.py#L107

Associated bugs: https://issues.redhat.com/browse/OCPBUGS-17531
Associated customer case: https://access.redhat.com/support/cases/#/case/03570928

