Description of problem:
/etc/sysctl.d/vdsm.conf sets values that conflict with the virtual-host profile that ships in tuned:

[root@node-32465 ~]# grep dirty /etc/sysctl.d/vdsm.conf
# Set dirty page parameters
vm.dirty_ratio = 5
vm.dirty_background_ratio = 2
[root@node-32465 ~]# grep dirty /usr/lib/tuned/throughput-performance/tuned.conf
# The generator of dirty data starts writeback at this percentage (system default
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
[root@node-32465 ~]# grep dirty /usr/lib/tuned/virtual-host/tuned.conf
vm.dirty_background_ratio = 5
[root@node-32465 ~]#

This causes `tuned-adm verify` to fail (unless reapply_sysctl=0 is set in /etc/tuned/tuned-main.conf) with the following errors:

2018-04-08 06:01:31,946 ERROR    tuned.plugins.base: verify: failed: 'vm.dirty_ratio' = '5', expected '40'
2018-04-08 06:01:31,975 ERROR    tuned.plugins.base: verify: failed: 'vm.dirty_background_ratio' = '2', expected '5'
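For anyone who wants to reproduce the mismatch without running tuned, here is a minimal sketch that compares the vdsm sysctl drop-in against the values the profile expects. The helper names `parse_sysctl` and `find_conflicts` are made up for illustration and are not part of tuned or vdsm. The expected values assume that virtual-host inherits vm.dirty_ratio from throughput-performance, which is consistent with the errors above.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def find_conflicts(actual, expected):
    """Return {key: (actual, expected)} for keys both sides set differently."""
    return {
        key: (actual[key], expected[key])
        for key in sorted(actual.keys() & expected.keys())
        if actual[key] != expected[key]
    }


# Content of /etc/sysctl.d/vdsm.conf as shown by the grep output above.
VDSM_CONF = """
# Set dirty page parameters
vm.dirty_ratio = 5
vm.dirty_background_ratio = 2
"""

# Assumption: effective virtual-host values, with vm.dirty_ratio inherited
# from throughput-performance and vm.dirty_background_ratio overridden.
EXPECTED = {"vm.dirty_ratio": "40", "vm.dirty_background_ratio": "5"}

conflicts = find_conflicts(parse_sysctl(VDSM_CONF), EXPECTED)
for key, (got, want) in conflicts.items():
    print(f"{key}: sysctl.d sets {got}, profile expects {want}")
```

Both keys show up as conflicts, matching the two verify failures in the log.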
I'm not sure whether this belongs to Virt or SLA; please change it if I got it wrong.
(In reply to Yuval Turgeman from comment #0)
> Description of problem:
> /etc/sysctl.d/vdsm.conf sets conflicting values from the virtual-host
> profile that ships in tuned:
>
>
> [root@node-32465 ~]# grep dirty /etc/sysctl.d/vdsm.conf
> # Set dirty page parameters
> vm.dirty_ratio = 5
> vm.dirty_background_ratio = 2

It's time we remove these. They were relevant, perhaps, in RHEL 6 times.
Nir - thoughts?
They were added here:

commit f4c534d78d7b88f9f4e31cecdc13d15d392421ac
Author: Federico Simoncelli <fsimonce>
Date:   Mon Sep 26 15:28:36 2011 +0000

    BZ#740887 Tune cache dirty ratio

    Tuning the dirty_ratio and dirty_background_ratio kernel parameters
    increases I/O throughput from the guests, improves fairness between
    the guests and reduces the ability of a buffered writer to starve
    guests.

    Change-Id: Ibf5c8e4c0637c60092b89fba103b96b37bdafaa0
    Reviewed-on: http://gerrit.usersys.redhat.com/970
    Reviewed-by: Dan Kenigsberg <danken>
    Tested-by: Dan Kenigsberg <danken>
According to https://bugzilla.redhat.com/show_bug.cgi?id=740887#c6 this was suggested by the performance team. It is possible that this is not needed in RHEL 7 but we don't have any evidence for that. If we want to remove this we need ack from the performance team.
(In reply to Nir Soffer from comment #4)
> According to https://bugzilla.redhat.com/show_bug.cgi?id=740887#c6 this was
> suggested by the performance team. It is possible that this is not needed in
> RHEL 7 but we don't have any evidence for that.
>
> If we want to remove this we need ack from the performance team.

Do we have something newer for RHEL 7?
I am ok with the settings that are part of the Virtual Host profile. I was not aware that vdsmd was attempting to change them.

Lowering dirty_background_ratio to 2 is not necessarily a bad thing. It will start flushing dirty blocks in the background sooner. But lowering dirty_ratio to 5 is very aggressive and can cause hosts to freeze every time the limit is hit, until the data has been flushed. That's why we recommended 10 for dirty_ratio.

Was there a reason why vdsmd was attempting to tweak these values?
Sanjay, see https://bugzilla.redhat.com/show_bug.cgi?id=740887 for the reasons for these settings.
I think the reasoning makes sense because we want more free allocatable memory for VMs. So lowering dirty_background_ratio makes sense. In fact, on large-memory systems we advise customers to even use dirty_background_bytes.

For large-memory systems, dirty_ratio=5 might also make sense, but doing it across all implementations might cause the issue I mentioned in my comment above. We can consider changing dirty_background_ratio to 2 in the Virtual Host profile and leaving dirty_ratio at 10.

I also see that swappiness is set to 10 in the Virtual Host profile. That's also an important setting in reclaiming memory and putting it on the freelist so that new VMs can have memory when they are started. That value can be lowered to 5. That increases CPU consumption, but it prevents swapping on busy systems, which is very expensive, so it is worth it.
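To put numbers on the large-memory point, here is a rough back-of-the-envelope sketch of how much dirty data a percentage threshold allows on hosts of different sizes. It is a simplification: the kernel applies the ratios to available (free plus reclaimable) memory rather than total RAM, so these figures are upper bounds. The host sizes and the helper `ratio_to_bytes` are chosen for illustration only.

```python
GIB = 1024 ** 3


def ratio_to_bytes(mem_bytes, ratio_percent):
    """Approximate dirty-page threshold in bytes for a percentage setting."""
    return mem_bytes * ratio_percent // 100


# Hypothetical host sizes; ratios 2 and 10 are the values proposed above.
for mem_gib in (16, 256, 1024):
    mem = mem_gib * GIB
    background = ratio_to_bytes(mem, 2)   # vm.dirty_background_ratio = 2
    hard_limit = ratio_to_bytes(mem, 10)  # vm.dirty_ratio = 10
    print(
        f"{mem_gib:5d} GiB host: background writeback near "
        f"{background / GIB:.2f} GiB, writers throttled near "
        f"{hard_limit / GIB:.2f} GiB"
    )
```

Even a ratio of 2 lets many gigabytes of dirty data accumulate on a large host, which is why an absolute limit via dirty_background_bytes can be the better fit there. Note that the kernel accepts only one of the ratio or bytes form at a time; writing one makes the other read as 0.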
*** Bug 1565934 has been marked as a duplicate of this bug. ***
(In reply to Sanjay Rao from comment #8)
> I think the reasoning makes sense because we want more free allocatable
> memory for VMs. So lowering the dirty_background_ratio makes sense. In fact
> on large memory system, we advise customers to even use
> dirty_background_bytes.
>
> For large memory systems, dirty_ratio=5 might also make sense but doing it
> across all implementations might cause the issue I mentioned in my comment
> above. We can consider changing dirty_background_ratio to 2 in virtual host
> profile and leave dirty_ratio at 10.
>
> I also see that swappiness is set to 10 in Virtual Host profile. That's also
> an important setting in reclaiming memory and putting it on freelist so that
> new VMs can have memory when they are started. That value can be lowered to
> 5. That increases CPU consumption but it prevents swapping on busy systems
> which is very expensive so it is worth it.

I opened bug 1569375 on tuned to update the profile. Once that is done, we can remove the modifications in vdsm.
As said in comment#10 we should have the correct tuned profile for hosts. Currently Performance QE is working on a replacement/update on the virtualization-host profile.
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both
Since the platform changes were merged but the values still don't align, Martin suggested creating our own profile and shipping it.
Looking at the values currently reported by `tuned-adm verify`, we are now in the same place as at the beginning. But as I interpret https://bugzilla.redhat.com/show_bug.cgi?id=1569375#c5, the platform believes the virtual-host profile settings are right.

Insisting on our own values, based on 10-year-old measurements performed on NFS in RHEL 6, and making our own profile to resolve the discrepancy without further evidence doesn't sound very wise. That doesn't mean our settings aren't still helpful in typical RHV use cases. I just can't see convincing facts anywhere that would tell us whether to follow Comment 10 or Comment 17. Claims like "the settings are fine" or "we don't align so let's create our own profile" don't seem to provide enough ground for an informed decision.

Is there any background information that would help clarify what the right thing to do is, and *why*?
I think we should use the existing virtual-host profile and if it is not good enough the tuned folks should improve it. This will benefit the entire platform instead of only RHV.

If we have performance results with RHEL 8.4+ showing that current virtual-host profile is worse, and there is no way to improve it, it does make sense to create our own profile instead of modifying /etc/sysctl.d/ directly, which breaks tuned-adm.

The profile is basically:

$ cat /etc/tuned/ovirt-host/tuned.conf
[main]
summary=Optimize for running KVM guests on oVirt host
include=virtual-host

[sysctl]
# Providing balance of low response time and high throughput when using
# NFS storage.
# See https://bugzilla.redhat.com/740887#c6
vm.dirty_ratio = 5
vm.dirty_background_ratio = 2
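To illustrate how such a child profile would interact with its parents, here is a small sketch that resolves a single-parent include= chain, with child values overriding inherited ones. It is not tuned's own loader, which handles more cases (variables, multiple includes, other plugins). The profile texts are reduced to the dirty-page settings quoted in this bug, and the include= line for virtual-host is an assumption consistent with the verify output in the description.

```python
import configparser

# Excerpts only: just the settings discussed in this bug.
PROFILES = {
    "throughput-performance": """
[sysctl]
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
""",
    "virtual-host": """
[main]
include=throughput-performance

[sysctl]
vm.dirty_background_ratio = 5
""",
    "ovirt-host": """
[main]
include=virtual-host

[sysctl]
vm.dirty_ratio = 5
vm.dirty_background_ratio = 2
""",
}


def effective_sysctl(name, profiles):
    """Merge [sysctl] settings along the include chain; the child wins."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(profiles[name])

    merged = {}
    parent = parser.get("main", "include", fallback=None)
    if parent:
        merged.update(effective_sysctl(parent, profiles))
    if parser.has_section("sysctl"):
        merged.update(parser.items("sysctl"))
    return merged


for profile in ("virtual-host", "ovirt-host"):
    print(profile, effective_sysctl(profile, PROFILES))
```

With a dedicated profile, the values tuned applies and the values it verifies come from the same place, so `tuned-adm verify` has nothing to disagree with.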
(In reply to Nir Soffer from comment #19)
> I think we should use the existing virtual-host profile and if it is not
> good enough the tuned folks should improve it. This will benefit the entire
> platform instead of only RHV.
>
> If we have performance results with RHEL 8.4+ showing that current
> virtual-host
> profile is worse, and there is no way to improve it, it does make sense to
> create our own profile instead of modifying /etc/sysctl.d/ directly, which
> breaks tuned-adm.

This sounds like the best way to handle it. I just wonder about Comment 17. Martin, do you remember why we still wanted to push our own profile, as mentioned in Comment 17?
We are past 4.5.0 feature freeze, please re-target.
no updates for a long time, missed 4.5 GA, closing