Description of problem:
Hosted Engine has an unsuitable I/O scheduler. My observations show that 'mq-deadline' does not provide any benefits, while 'none' gives better performance.

Version-Release number of selected component (if applicable):
All

How reproducible:
Every time

Steps to Reproduce:
1. Install a Hosted Engine
2. Create a VM and check the time needed for adding a disk
3. Change the scheduler to 'none' and repeat step 2 (see the example below)

Actual results:
With 'mq-deadline' the engine feels slow; guest disk creation takes longer than with 'none'.

Expected results:
The default I/O scheduler is set to 'none'.
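For step 3, a minimal sketch of how the scheduler can be checked and temporarily switched at runtime inside the engine VM (the device name vda is an assumption; substitute the actual engine disk):

# show the available schedulers; the active one appears in brackets
cat /sys/block/vda/queue/scheduler
# switch to 'none' until the next reboot
echo none > /sys/block/vda/queue/scheduler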
Roy, any insight from scale team?
Just some info: Setting 'elevator=none' does not set the devices' scheduler to 'none'. According to https://lists.debian.org/debian-kernel/2018/11/msg00141.html there is no equivalent to 'elevator=noop'. Most probably a udev rule can do the trick.
The following udev rules are working:

ACTION=="add|change", KERNEL=="sd*", ATTR{queue/scheduler}="none"
ACTION=="add|change", KERNEL=="vd*", ATTR{queue/scheduler}="none"
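In case it is useful, a sketch of how these rules could be applied persistently (the file name 60-io-scheduler.rules is an assumption, not an existing oVirt file):

# place the two rules above in a rules file, e.g.:
#   /etc/udev/rules.d/60-io-scheduler.rules
# then reload the udev rules and re-trigger the block devices
udevadm control --reload-rules
udevadm trigger --type=devices --action=change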
Tuning the I/O scheduler is not how HE performance issues should be approached. I would not suggest changing it without fully understanding why it is beneficial.
Hi Michal,

So you would recommend delaying and queueing our I/O in the mq-deadline scheduler, just to queue it again in the host? This doesn't make sense.

I am pretty sure I know why we got better performance: noop & none do not reorder any I/O, they just stack it. The host then reorders and stacks I/O from all guests (which is an expected and wanted behaviour).

Best Regards,
Strahil Nikolov
(In reply to Strahil Nikolov from comment #6)
> I am pretty sure that I know why we got better performance -> noop & none
> do not reorder any I/O , just stacking it. And then the Host will reorder
> and stack I/O from all guests (which is an expected and wanted behaviour).

If this is the case, I guess a bug should be filed against the tuned package so all guests will be tuned correctly. I'd like to get input from the scale team anyway. Roy?
A suitable scheduler is a matter of the backing storage and the use case. Setting the noop scheduler makes sense for fast disks with heavy writes.

We just completed some testing with the throughput-performance tuned profile on the hosted engine, hosting 5000 VMs. It works well and shows improvements for some scenarios on our internal storage. (The tuned disk plugin sets the I/O scheduler to deadline.)

If your storage doesn't need reordering because you have almost no seek cost, good, then you need noop.

I suggest closing this bug and backing up all our info with a KB article with recommendations (a few general ones already exist).
I opened this bug because I saw benefits when running the engine & host with the default scheduler on top of Gluster (bricks on SATA3 consumer hardware), which is one of the slowest setups available. I doubt anyone will use IDE. Switching from mq-deadline to none brought better responsiveness, and now with a consumer SATA3 SSD it is even better.

In general, mq-deadline & deadline are recommended for VMs hosting databases, but my experience led me to the conclusion that no matter what kind of DB is running, performance is always better if we only stack I/O without reordering (noop/none).

If you see any benefits with mq-deadline, please share what kind of backend storage was used, so I can keep it in mind. The proposal for a KB article is nice.
It seems that my phone is correcting some of my words: the Gluster bricks were SATA3 rotational consumer disks, while now it is a consumer SATA3 SSD.
Results from scale testing showed no significant improvement for the RHV Engine in regards to application performance by using noop over deadline. While higher disk utilization was seen with noop, the overall resources used and overall RHV Engine performance did not improve significantly enough to recommend noop as the default I/O scheduler for the Hosted Engine at this time.

The scale environment tested both conditions, 'none' & 'deadline', while running engine actions in the background (Create VM, Copy/Move Disk, Migrate VM). Each test was run for 3 hours, and each server was monitored using nmon (3-second interval).

Env. description:
1 host with HE (including DWH)
2 hosts with 200 VMs
During the tests, only the HE was running on the 1st host. All the existing VMs and the new VMs were run on the 2nd & 3rd hosts.

Comparing the results of the actions (response time), there is no major gap (~1-2 seconds).

Regarding server results (CPU, Mem, Disk activity):

Host level:
CPU & Memory - results are almost the same
Disk activity - 'deadline' consumes ~10% more

HE level:
CPU & Memory - results are almost the same
Disk activity:
DiskRead (KB/s) - 'none' is higher than 'deadline' (3.33 vs 1)
DiskWrite (KB/s) - 'deadline' is higher than 'none' by ~10% (270 vs 240)
Disk utilization - the major gap is while using 'none' (80% vs 0.5%)

The extra disk utilization with noop was caused by postgres write activities. The HE engine disk came from an FC storage domain.
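For reference, a sketch of an nmon invocation matching a 3-second interval over a 3-hour run (the exact options used in the scale environment are not recorded here, so this is only an assumption):

# capture to a file: one snapshot every 3 seconds, 3600 snapshots = 3 hours
nmon -f -s 3 -c 3600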
Re-targeting to 4.3.6; not identified as a blocker for 4.3.5.
So it looks like 'none' is better since it causes less disk activity on the host. Let's adopt it.
Since there's no clear gain from using one scheduler over the others, we can document this in a Documentation / KB article. Moving accordingly.
What is the user trying to accomplish? What exactly do you want to tell our users? Is this something to adjust after installation?
In this particular case, the HostedEngine's I/O scheduler reorders I/O requests, and then the hypervisor that is hosting the HE does the same with its own I/O scheduler. This double reordering delays I/O requests to the storage layer, and performance is not optimal. The initial request was to use either the 'noop' (no multiqueue) or 'none' (when multiqueue is used) scheduler, which only merges I/O requests without any reordering (thus no delays). Based on the user's experience, adopting noop/none brings better performance on the engine, but I guess it depends on the infrastructure and the whole setup.
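A quick way to see the two scheduling layers in question (the device names vda inside the engine VM and sda on the hypervisor are assumptions; adjust to the actual devices):

# inside the HostedEngine VM
cat /sys/block/vda/queue/scheduler
# on the hypervisor hosting the HE
cat /sys/block/sda/queue/scheduler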
(In reply to Steve Goodman from comment #20)
> What is the user trying to accomplish?

Strahil Nikolov explained in comment #21.

> What exactly do you want to tell our users?

That depending on their data center they may gain performance by changing the I/O scheduler to noop/none.

> Is this something to adjust after installation?

Yes, after installation.
Is this how to change the I/O scheduler to noop/none?

Add elevator=noop to GRUB_CMDLINE_LINUX in /etc/default/grub as shown below:

# cat /etc/default/grub
[...]
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=vg00/lvroot rhgb quiet elevator=noop"
[...]

After the entry has been created/updated, rebuild the /boot/grub2/grub.cfg file to include the new configuration with the added parameter:

On BIOS-based machines:
~]# grub2-mkconfig -o /boot/grub2/grub.cfg

On UEFI-based machines:
~]# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

From https://access.redhat.com/solutions/5427
Based on https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html, it seems that the 'elevator' parameter no longer exists, so I prefer the udev rules approach.
Meital, Who on your team can help me with this? How do you change the I/O scheduler to noop/none?
Strahil is right, the elevator option has been deprecated, see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/8.2_release_notes/deprecated_functionality#BZ-1665295

Steve, I would recommend referencing the RHEL documentation here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance to avoid rewriting the same documentation within oVirt / RHV.
I'll stick the following note at the end of https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html-single/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_command_line/index#Deploying_the_Self-Hosted_Engine_Using_the_CLI_install_RHVM

[NOTE]
====
Both the {engine-name}'s I/O scheduler and the hypervisor that is hosting the {engine-name} reorder I/O requests. This double reordering delays I/O requests to the storage layer, impacting performance. Depending on your data center, you might gain performance by changing the I/O scheduler to `none`. For more information, see "Available disk schedulers" [1] in _Monitoring and managing system status and performance_ for RHEL.
====

[1] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/setting-the-disk-scheduler_monitoring-and-managing-system-status-and-performance
Sandro,

Please review the merge request: https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1980

Give feedback on:
- The location of the note
- The text of the note
See the note in context in these preview builds:

https://cee-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/CCS/job/ccs-mr-preview/36814/artifact/assembly-Installing_Red_Hat_Virtualization_as_a_self-hosted_engine_using_the_Cockpit_web_interface/preview/index.html#Deploying_the_Self-Hosted_Engine_Using_Cockpit_install_RHVM

https://cee-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/CCS/job/ccs-mr-preview/36814/artifact/assembly-Installing_Red_Hat_Virtualization_as_a_self-hosted_engine_using_the_command_line/preview/index.html#Deploying_the_Self-Hosted_Engine_Using_the_CLI_install_RHVM

The note is at the very end of these topics, right before 5.4.
Looks good to me, also the position of the note looks good.
Richard, Can you please do a peer review?
(In reply to Steve Goodman from comment #31) > Richard, > > Can you please do a peer review? Steve, done. Donna -- one small issue and a suggestion. Aside from that, LGTM.
Merged.