Bug 1182094
| Summary: | vdsm NUMA code not effective, slowing down statistics retrieval | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Michal Skrivanek <michal.skrivanek> |
| Component: | vdsm | Assignee: | Martin Sivák <msivak> |
| Status: | CLOSED ERRATA | QA Contact: | Artyom <alukiano> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 3.5.0 | CC: | bazulay, dfediuck, fromani, gklein, istein, lpeer, lsurette, melewis, mkalinin, rhodain, sherold, yeylon, ykaul |
| Target Milestone: | ovirt-3.6.0-rc | Keywords: | Triaged, ZStream |
| Target Release: | 3.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ovirt-3.6.0-alpha1.2 | Doc Type: | Bug Fix |
| Doc Text: | Previously, NUMA statistics were collected every time VDSM was queried for host statistics. This caused higher load and unnecessary delays, because the collection was time-consuming and ran an external process. Now NUMA statistics collection has been moved to the statistics threads, and the host statistics query reports the last collected result. (A minimal sketch of this pattern follows the table.) | | |
| Story Points: | --- | | |
| Clone Of: | | | |
| : | 1220113 (view as bug list) | Environment: | |
| Last Closed: | 2016-03-09 19:29:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1177634, 1185279, 1220113 | | |
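The Doc Text above describes the shape of the fix: the expensive NUMA collection (which runs an external process) happens on the periodic statistics thread, and the host statistics query only returns the last cached result. The following is a minimal Python sketch of that pattern, not the actual VDSM code; the names `NumaStatsSampler`, `collect_numa_stats` and `get_host_stats` are hypothetical, and the use of `numactl --hardware` as the external command is an assumption.

```python
# Illustrative sketch only -- these names are hypothetical, not the VDSM API.
import subprocess
import threading


def collect_numa_stats():
    """Expensive collection step: spawns an external process (e.g. numactl)."""
    out = subprocess.check_output(["numactl", "--hardware"])
    return out.decode("utf-8", errors="replace")


class NumaStatsSampler:
    """Collects NUMA statistics on a background thread and caches the last
    result, so host statistics queries never pay the collection cost."""

    def __init__(self, interval=15.0):
        self._interval = interval
        self._lock = threading.Lock()
        self._last_result = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()

    def _run(self):
        while not self._stop.is_set():
            stats = collect_numa_stats()        # slow: external process
            with self._lock:
                self._last_result = stats
            self._stop.wait(self._interval)

    def last(self):
        """Cheap lookup of whatever the sampling thread collected last."""
        with self._lock:
            return self._last_result


def get_host_stats(sampler):
    # Before the fix this path collected NUMA data inline on every query;
    # after the fix it only reads the cached value.
    return {"numaInfo": sampler.last()}
```

The point of the pattern is that the query path never blocks on spawning a process: the sampler is started once, and stats handlers only ever call `sampler.last()`, so staleness of the reported NUMA data is bounded by the sampling interval.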
Description
Michal Skrivanek
2015-01-14 12:39:21 UTC
In addition, see point 3 in https://bugzilla.redhat.com/show_bug.cgi?id=1185279#c1 for the NUMA issue in host monitoring.

3.5.1 is already full of bugs (over 80), and since none of these bugs were added as urgent for the 3.5.1 release in the tracker bug, moving to 3.5.2.

The patch is posted and the improvement was measured to be about 12 ms per VM per call. Two NUMA-enabled VMs caused the following difference in time for:

    for x in $(seq 100); do vdsClient -s 0 getAllVmStats >/dev/null; done

Old VDSM:

    real    0m21.093s
    user    0m11.998s
    sys     0m1.690s

Updated VDSM:

    real    0m18.485s
    user    0m12.009s
    sys     0m1.846s

And a control timing of two VMs without NUMA:

    real    0m18.298s
    user    0m11.878s
    sys     0m1.699s

As you can see, the time difference for 100 calls was 2.5 seconds (a quick arithmetic check of the per-VM figure is sketched at the end of this report).

But just to make everything clear, all NUMA-related code was introduced in 3.5. So it should not affect 3.4, and the issue there is something different.

The main issue is fixed.

Verified on vdsm-4.17.0-822.git9b11a18.el7.noarch. Ran two VMs with two CPUs, without NUMA:

    [root@alma06 ~]# time for x in $(seq 100); do vdsClient -s 0 getAllVmStats >/dev/null; done

    real    0m14.549s
    user    0m11.643s
    sys     0m2.316s

With NUMA:

    [root@alma06 ~]# time for x in $(seq 100); do vdsClient -s 0 getAllVmStats >/dev/null; done

    real    0m14.570s
    user    0m11.632s
    sys     0m2.370s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html
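For reference, here is the arithmetic check mentioned above of the "about 12 ms per VM per call" figure. It only reuses the timings quoted in the comments; no additional measurements are involved.

```python
# Back-of-the-envelope check of the reported improvement, using the quoted timings.
calls = 100
vms = 2

old_real = 21.093   # seconds, old VDSM, two NUMA-enabled VMs
new_real = 18.485   # seconds, patched VDSM, two NUMA-enabled VMs

delta = old_real - new_real        # ~2.6 s over 100 calls
per_call = delta / calls           # ~26 ms per getAllVmStats call
per_vm_per_call = per_call / vms   # ~13 ms per VM per call

print(f"{delta:.3f} s total, {per_call * 1000:.1f} ms/call, "
      f"{per_vm_per_call * 1000:.1f} ms per VM per call")
# -> roughly 2.6 s, 26 ms/call, 13 ms per VM per call, consistent with the
#    reported ~12 ms per VM per call and ~2.5 s saved per 100 calls.
```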