Bug 1956387 - tuned: add tuned-adm verify --verbose option to help validate active profile settings [NEEDINFO]
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: tuned
Version: 8.5
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: beta
: ---
Assignee: Jaroslav Škarvada
QA Contact: Robin Hack
URL:
Whiteboard:
Depends On:
Blocks: 1932086
 
Reported: 2021-05-03 15:23 UTC by Prasad J Pandit
Modified: 2021-05-13 08:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
ppandit: needinfo? (jskarvad)



Description Prasad J Pandit 2021-05-03 15:23:31 UTC
Description of problem:

* Verification and validation of an active tuned(8) profile is a two-fold issue:

1) $ tuned-adm(8) verify  - is being worked on at => BZ#1947858

This bug is for the second part below

2) $ tuned-adm(8) verify
    - Does not offer a convenient interface for users to see and confirm
      that tuned-adm(8) verify is working correctly.
    - It does not present verification results to the user directly.
      Instead, verification results get logged to the /var/log/tuned/tuned.log file.
===
# tuned-adm verify
Verification failed, current system settings differ from the preset profile.
You can mostly fix this by restarting the Tuned daemon, e.g.:
  systemctl restart tuned
or
  service tuned restart
Sometimes (if some plugins like bootloader are used) a reboot may be required.
See tuned log file ('/var/log/tuned/tuned.log') for details.
===

$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
$ rpm -q tuned
tuned-2.15.0-2.el8.noarch
$ 
$ tuned --version
tuned 2.15.0


How reproducible: 100%


Steps to Reproduce:
1. install tuned
2. activate realtime profile
3. run tuned-adm verify

Expected results:
  - Make it easy for users to validate active profile settings with certainty.

Comment 1 Prasad J Pandit 2021-05-03 15:25:26 UTC
This bug is to address part 2) -> https://bugzilla.redhat.com/show_bug.cgi?id=1947858#c4

Comment 2 Prasad J Pandit 2021-05-04 13:26:46 UTC
===
[root@virtlab500 tuned]# tuned-adm active
Current active profile: balanced
[root@virtlab500 tuned]# 

[root@virtlab500 tuned]# tuned-adm verify 
Verfication succeeded, current system settings match the preset profile.
See tuned log file ('/var/log/tuned/tuned.log') for details.
[root@virtlab500 tuned]# 

[root@virtlab500 tuned]# tuned-adm verify --verbose |less
[cpu]
 governor=conservative|powersave
 energy_perf_bias=normal
Verification: Pass

[sysctl]
 kernel.sched_min_granularity_ns=3000000
 kernel.sched_wakeup_granularity_ns=4000000
 vm.dirty_ratio=10
 vm.dirty_background_ratio=3
 vm.swappiness=10
 kernel.sched_migration_cost_ns=5000000
 net.core.busy_read=None, expected: 50
 net.core.busy_poll=None, expected: 50
 net.ipv4.tcp_fastopen=3
 kernel.numa_balancing=None, expected: 0
 kernel.hung_task_timeout_secs=600
 kernel.nmi_watchdog=0
 kernel.sched_rt_runtime_us=-1
 vm.stat_interval=10
 kernel.timer_migration=0
Verification: Fail

...
[sysfs]
 /sys/bus/workqueue/devices/writeback/cpumask=f55f
 /sys/devices/virtual/workqueue/cpumask=f55f
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
 ...
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
 /sys/kernel/mm/ksm/run=2
 /sys/kernel/ktimer_lockless_check=0, expected: 1
Verification: Fail

[modules]
 cpufreq_conservative: verify: failed: 'module 'cpufreq_conservative' is not loaded'
Verification: Fail

[audio]
 []
Verification: Pass

[video]
 radeon_powersave=dpm-balanced, auto
Verification: Pass

[disk]
 []
Verification: Pass

[scsi_host]
 alpm=medium_power
  host0=None
Verification: Fail

See tuned log file ('/var/log/tuned/tuned.log') for details.
===

@Jaroslav:
   The current patch enables 'tuned-adm verify' to display output as above.
   Before I raise a PR, I wanted to check whether you have any inputs/thoughts/suggestions.

Thank you.
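
For anyone who wants to post-process the verbose output shown above, here is a minimal Python sketch that splits it into per-plugin sections with a pass/fail flag. It assumes the layout pasted in this comment ("[plugin]" header, indented setting lines, then a "Verification: Pass|Fail" line), which comes from an unmerged patch and may change. The function name parse_sections is hypothetical and not part of tuned.

```python
import re

# Shortened sample copied from the output pasted in this comment.
SAMPLE = """\
[cpu]
 governor=conservative|powersave
 energy_perf_bias=normal
Verification: Pass

[sysctl]
 vm.swappiness=10
 net.core.busy_read=None, expected: 50
Verification: Fail
"""


def parse_sections(text):
    """Return {plugin: {"lines": [...], "passed": bool or None}}.

    Assumes the proposed (not final) --verbose layout from this bug.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        # Section headers start in column 0, e.g. "[sysctl]".
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            current = header.group(1)
            sections[current] = {"lines": [], "passed": None}
        elif current and line.startswith("Verification: "):
            sections[current]["passed"] = line.endswith("Pass")
            current = None
        elif current and line.strip():
            sections[current]["lines"].append(line.strip())
    return sections


if __name__ == "__main__":
    for plugin, info in parse_sections(SAMPLE).items():
        print(plugin, "PASS" if info["passed"] else "FAIL", len(info["lines"]))
```

An indented placeholder such as " []" (as in the [audio] and [disk] sections) is kept as an ordinary setting line, because only unindented bracketed names are treated as headers.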

Comment 3 Nitesh Narayan Lal 2021-05-04 16:10:32 UTC
Nice work, can you also share the output of tuned-adm verify without verbose?
Will it only report whether the verification has passed or failed?

Have you also looked into the scheduler verification?

Comment 4 Marcelo Tosatti 2021-05-04 16:58:02 UTC
(In reply to Prasad J Pandit from comment #2)
> ===
> [root@virtlab500 tuned]# tuned-adm active
> Current active profile: balanced
> [root@virtlab500 tuned]# 
> 
> [root@virtlab500 tuned]# tuned-adm verify 
> Verfication succeeded, current system settings match the preset profile.
> See tuned log file ('/var/log/tuned/tuned.log') for details.
> [root@virtlab500 tuned]# 
> 
> [root@virtlab500 tuned]# tuned-adm verify --verbose |less
> [cpu]
>  governor=conservative|powersave
>  energy_perf_bias=normal
> Verification: Pass
...

Looks good to me! Thanks.

Comment 5 Prasad J Pandit 2021-05-05 12:20:25 UTC
(In reply to Nitesh Narayan Lal from comment #3)
> Nice work, can you also share the output of tuned-adm verify without verbose?
> Will it only report whether the verification has passed or failed?

Yes, the comment above shows the output of the default 'tuned-adm verify' run.
There's no change to the default behaviour.

> Have you also looked into the scheduler verification?

Yes, currently it looks like this:

[root@virtlab500 tuned]# tuned-adm verify --verbose |less
...
[scheduler]
 ps_whitelist=None
 ps_blacklist=None, expected: ksoftirqd.*;rcuc.*;rcub.*;ktimersoftd.*;.*pmd.*;.*PMD.*;^DPDK;.*qemu-kvm.*
 default_irq_smp_affinity=None, expected: calc
 perf_process_fork=None, expected: false
 isolated_cores=None, expected: 5,7,9,11
Verification: Fail

Thank you.

Comment 6 Nitesh Narayan Lal 2021-05-05 13:55:24 UTC
(In reply to Prasad J Pandit from comment #5)
> (In reply to Nitesh Narayan Lal from comment #3)
> > Nice work, can you also share the output of tuned-adm verify without verbose?
> > Will it only report whether the verification has passed or failed?
> 
> Yes, above comment shows an output of the default 'tuned-adm verify' run.
> There's no change to default behaviour.
> 

Maybe we can improve this as well, for example, right now it shows:

# tuned-adm active
Current active profile: realtime-virtual-host

# tuned-adm verify
Verification failed, current system settings differ from the preset profile.
You can mostly fix this by restarting the Tuned daemon, e.g.:
  systemctl restart tuned
or
  service tuned restart
Sometimes (if some plugins like bootloader are used) a reboot may be required.
See tuned log file ('/var/log/tuned/tuned.log') for details.

The suggestion to restart tuned is in itself pretty vague IMHO.
Instead, it would be useful if we could print something like the number of errors
that were hit during the verification here.

What do you think?

Thanks
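
To make the "number of errors" idea concrete, a rough Python sketch follows. It is only an illustration: the (plugin, setting, ok) tuple shape, the function name failure_summary, and the message wording are all assumptions, not tuned's internal data model or its actual output.

```python
from collections import Counter


def failure_summary(results):
    """Build a one-line summary from (plugin, setting, ok) tuples.

    The tuple shape is hypothetical; tuned's verify code may track
    results differently.
    """
    failed = Counter(plugin for plugin, _setting, ok in results if not ok)
    total = sum(failed.values())
    if total == 0:
        return "Verification succeeded."
    per_plugin = ", ".join(
        f"{name}: {count}" for name, count in sorted(failed.items())
    )
    return f"Verification failed: {total} setting(s) differ ({per_plugin})."


# Illustrative input only; setting names are taken from outputs in this bug.
example = [
    ("cpu", "governor", True),
    ("sysctl", "net.core.busy_read", False),
    ("sysctl", "net.core.busy_poll", False),
    ("scheduler", "isolated_cores", False),
]

if __name__ == "__main__":
    print(failure_summary(example))
```

A per-plugin count like this would tell the user where to look before opening the log file.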

Comment 7 Prasad J Pandit 2021-05-06 16:41:09 UTC
(In reply to Nitesh Narayan Lal from comment #6)
> Maybe we can improve this as well, for example, right now it shows:
... 
> The suggestion to restart tuned in itself is pretty vague IMHO.
> Instead, it will be useful if we can print something like the number of
> errors that we triggered during the verification here.

* True, restarting tuned(8) does not seem to be an effective solution.
  Sometimes the current system settings differ because the system
  does not support a certain option, e.g.

  => tuned.plugins.plugin_vm: Option 'transparent_hugepages' is not supported on current hardware.

OR if an error/exception occurs while setting some parameter, e.g.

  ...
   File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_net.py", line 400, in _set_device_parameters
     if context == "channels" and int(dev_params[next(iter(d))]) == 0:
  => ValueError: invalid literal for int() with base 10: 'n/a'

OR if the kernel does not define a certain parameter, e.g.

  => tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'kernel.numa_balancing', the parameter does not exist


In such situations, restarting tuned(8) isn't going to be helpful.


* I guess an absent kernel parameter or a system support error could be the result of different kernel versions
  and/or system capabilities. These need to be addressed by tweaking the tuned(8) profiles for each supported
  RHEL version.

  I think that can be addressed as a separate bug, instead of combining it with this one,
  because tweaking each profile parameter for different RHEL versions may take more time/testing.


...wdyt?

Thank you.
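
As a rough illustration of the three cases above, the Python sketch below tags log lines for which a restart of tuned(8) is unlikto help. The patterns are copied from the three messages quoted in this comment; the category labels and the function name classify are made up for the example, and real log lines may carry prefixes or wording not covered here.

```python
import re

# Patterns derived from the three log excerpts quoted in this comment.
# The category labels on the left are hypothetical.
NOT_FIXED_BY_RESTART = [
    (
        "unsupported by hardware",
        re.compile(r"Option '(?P<name>[^']+)' is not supported on current hardware"),
    ),
    (
        "plugin exception",
        re.compile(r"^\s*(?:=>\s*)?(?P<name>\w+Error): "),
    ),
    (
        "missing kernel parameter",
        re.compile(
            r"Failed to read sysctl parameter '(?P<name>[^']+)', "
            r"the parameter does not exist"
        ),
    ),
]


def classify(line):
    """Return (reason, name) if the line matches a known case, else None."""
    for reason, pattern in NOT_FIXED_BY_RESTART:
        match = pattern.search(line)
        if match:
            return reason, match.group("name")
    return None


if __name__ == "__main__":
    samples = [
        "tuned.plugins.plugin_vm: Option 'transparent_hugepages' is not supported on current hardware.",
        "ValueError: invalid literal for int() with base 10: 'n/a'",
        "tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'kernel.numa_balancing', the parameter does not exist",
    ]
    for sample in samples:
        print(classify(sample))
```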

Comment 8 Nitesh Narayan Lal 2021-05-06 17:09:59 UTC
(In reply to Prasad J Pandit from comment #7)
> (In reply to Nitesh Narayan Lal from comment #6)
> > Maybe we can improve this as well, for example, right now it shows:
> ... 
> > The suggestion to restart tuned in itself is pretty vague IMHO.
> > Instead, it will be useful if we can print something like the number of
> > errors that we triggered during the verification here.
> 
> * True, restarting tuned(8) does not seem an effective solution.
>   Sometimes current system settings may differ because the system
>   does not support certain option ex.
> 
>   => tuned.plugins.plugin_vm: Option 'transparent_hugepages' is not
> supported on current hardware.

Exactly

> 
> OR if an error/exception occurs while setting some parameter ex.
> 
>   ...
>    File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_net.py", line
> 400, in _set_device_parameters
>      if context == "channels" and int(dev_params[next(iter(d))]) == 0:
>   => ValueError: invalid literal for int() with base 10: 'n/a'
> 


This is a bug reported in BZ 1943291 :)
The fix is upstream but I don't think it has been backported yet.


> OR if the kernel does not define certain parameter
> 
>   => tuned.plugins.plugin_sysctl: Failed to read sysctl parameter
> 'kernel.numa_balancing', the parameter does not exist
> 
> 
> In such situations, restarting tuned(8) isn't going to be helpful.
> 

Right.

> 
> * I guess absent kernel parameter or system support error could be result of
> different kernel versions
>   and/or system capabilities. They need to be addressed by tweaking tuned(8)
> profiles for each supported
>   RHEL versions.
> 
>   I think it can be addressed as another new bug fix, instead of combining
> with this one.
>   Because tweaking each profile parameter for different RHEL versions may
> take more time/testing.
> 
> 
> ...wdyt?
> 
> Thank you.

Yeah, I agree: to fix the individual plugins we should open a new BZ.
However, what we can do here (in your upstream PR) is improve the
message that we get when verification fails without the verbose option.

Thanks

Comment 9 Prasad J Pandit 2021-05-10 12:45:42 UTC
New output with and without the --verbose option:

===
[root@virtlab500 ~]# 
[root@virtlab500 ~]# tuned-adm verify
Active profile => realtime-virtual-host
Verification failed, current system settings differ from the preset profile.
 * Use 'tuned-adm verify --verbose' option to know differing system settings.
 * Sometimes (if some plugins like bootloader are used) a reboot may be required.
 * See tuned log file ('/var/log/tuned/tuned.log') for details.
===

[root@virtlab500 ~]# 
[root@virtlab500 ~]# tuned-adm verify -v |less
Active profile => realtime-virtual-host
[cpu]
 governor=performance
 energy_perf_bias=performance
Verification: Pass

[sysctl]
 kernel.sched_min_granularity_ns=3000000
 kernel.sched_wakeup_granularity_ns=4000000
 vm.dirty_ratio=10
 vm.dirty_background_ratio=3
 vm.swappiness=10
 kernel.sched_migration_cost_ns=5000000
 net.core.busy_read=None, [Expected: 50]
 net.core.busy_poll=None, [Expected: 50]
 net.ipv4.tcp_fastopen=3
 kernel.numa_balancing=None, [Expected: 0]
 kernel.hung_task_timeout_secs=600
 kernel.nmi_watchdog=0
 kernel.sched_rt_runtime_us=-1
 vm.stat_interval=10
 kernel.timer_migration=0
Verification: Fail

[sysfs]
 /sys/bus/workqueue/devices/writeback/cpumask=f55f
 /sys/devices/virtual/workqueue/cpumask=f55f
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
 /sys/devices/system/machinecheck/machinecheck*/ignore_ce=1
...
[irqbalance]
 banned_cpus=None, [Expected: 5,7,9,11]
Verification: Fail

[script]
 scripts: verify: failed: '['/usr/lib/tuned/realtime/script.sh', '/usr/lib/tuned/realtime-virtual-host/script.sh']'
Verification: Fail

[scheduler]
 ps_whitelist=None
 ps_blacklist=None, [Expected: ksoftirqd.*;rcuc.*;rcub.*;ktimersoftd.*;.*pmd.*;.*PMD.*;^DPDK;.*qemu-kvm.*]
 default_irq_smp_affinity=None, [Expected: calc]
 perf_process_fork=None, [Expected: false]
 isolated_cores=None, [Expected: 5,7,9,11]
Verification: Fail

Verification failed, current system settings differ from the preset profile.
 * Sometimes (if some plugins like bootloader are used) a reboot may be required.
 * See tuned log file ('/var/log/tuned/tuned.log') for details.
===
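
With the revised "name=actual, [Expected: value]" layout shown above, the differing settings can be pulled out mechanically. The Python sketch below is one possible way to do it, assuming that exact layout; the format belongs to a PR still under review, and the function name mismatches is hypothetical.

```python
import re

# Matches lines such as: " net.core.busy_read=None, [Expected: 50]"
# Assumes the revised layout from this comment; it is not a stable format.
MISMATCH = re.compile(
    r"^\s*(?P<name>[^=\s]+)=(?P<actual>.*?), \[Expected: (?P<expected>.*)\]\s*$"
)

# Shortened sample copied from the output in this comment.
SAMPLE = """\
[sysctl]
 vm.swappiness=10
 net.core.busy_read=None, [Expected: 50]
 kernel.numa_balancing=None, [Expected: 0]
Verification: Fail

[irqbalance]
 banned_cpus=None, [Expected: 5,7,9,11]
Verification: Fail
"""


def mismatches(text):
    """Return a list of (name, actual, expected) for differing settings."""
    found = []
    for line in text.splitlines():
        match = MISMATCH.match(line)
        if match:
            found.append(
                (match.group("name"), match.group("actual"), match.group("expected"))
            )
    return found


if __name__ == "__main__":
    for name, actual, expected in mismatches(SAMPLE):
        print(f"{name}: got {actual}, expected {expected}")
```

Lines without an "[Expected: ...]" suffix, including the "verify: failed" lines from the [script] section, are ignored by this sketch and would need separate handling.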

Comment 10 Nitesh Narayan Lal 2021-05-10 13:34:05 UTC
Looks good, thanks.

Comment 11 Prasad J Pandit 2021-05-10 19:12:58 UTC
PR#342 -> https://github.com/redhat-performance/tuned/pull/342

Comment 12 Prasad J Pandit 2021-05-13 08:26:37 UTC
Revised PR#346: RHBZ#1956387 tuned-adm: add --verbose option to verify command
  -> https://github.com/redhat-performance/tuned/pull/346

