Bug 1141705
| Summary: | performance bug with PR_SET_TIMERSLACK in qemu-timer.c | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Xiaomei Gao <xigao> |
| Component: | qemu-kvm-rhev | Assignee: | Stefan Hajnoczi <stefanha> |
| Status: | CLOSED WONTFIX | QA Contact: | Yanhui Ma <yama> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.1 | CC: | chayang, famz, hhuang, huding, juzhang, michen, pbonzini, stefanha, virt-maint, wquan, xfu, yama |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Last Closed: | 2017-11-29 15:29:12 UTC | Type: | Bug |
| Bug Depends On: | 1104748 | | |
Comment 2
Quan Wenli
2014-09-15 10:05:19 UTC
Maybe 1 nanosecond of timer slack is too aggressive:

```diff
@@ -507,6 +511,9 @@ void init_clocks(void)
         vm_clock = qemu_clock_new(QEMU_CLOCK_VIRTUAL);
         host_clock = qemu_clock_new(QEMU_CLOCK_HOST);
     }
+#ifdef CONFIG_PRCTL_PR_SET_TIMERSLACK
+    prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0);
+#endif
 }

 uint64_t timer_expire_time_ns(QEMUTimer *ts)
```

---

Comment 4
Paolo Bonzini

It would help to know which timers are triggering the most. Also, please apply the patch to 1.5.3 and see whether it causes increased CPU usage there as well.

---

(In reply to Paolo Bonzini from comment #4)
> It would help to know which timers are triggering the most.

Try:

```
$ cat /proc/timer_list
```

Then search for qemu-kvm to see which kernel timers QEMU has active. Any timer with less than 10 milliseconds remaining is worth investigating. Xiaomei, can you please collect this information while a guest is running?

---

Comment 10
Stefan Hajnoczi

There is a high chance that this is related to bz#1104748.

Since you are using `-drive ...,aio=native`, the fix for bz#1104748 should reduce QEMU's hrtimer usage. As a result you should see lower host CPU utilization.

Please retest with the patch from bz#1104748.

---

(In reply to Stefan Hajnoczi from comment #10)
> Please retest with the patch from bz#1104748.

Okay, once bz#1104748 is fixed, we will repeat the tests to see whether the issue is gone.

---

Comment 12
Quan Wenli

(In reply to Stefan Hajnoczi from comment #10)
> Please retest with the patch from bz#1104748.

Hi Stefan, bz#1104748 is the fix in the qemu-kvm component; bz#966398 is the corresponding bug for qemu-kvm-rhev, and bz#966398 is still in NEW status.

---

(In reply to Quan Wenli from comment #12)
> Hi Stefan, bz#1104748 is the fix in the qemu-kvm component; bz#966398 is the
> corresponding bug for qemu-kvm-rhev, and bz#966398 is still in NEW status.

The kernel fix in bz#1161535 should improve performance. If you can test qemu-kvm-rhev on a host kernel with the bz#1161535 fix applied, we'll know whether the problem has been fully resolved.

An alternative is to rerun the original comparison with aio=threads instead of aio=native. The aio=threads code path avoids hrtimer usage and should therefore not be affected by timer slack. That would tell us whether there are additional places affected by timer slack besides aio=native (fixed in bz#1161535).

---

This issue will require additional investigation. Deferring to RHEL 7.3.

---

Bumping to the next release again. Low priority because I don't think a "fix" is possible. Keeping it open because investigation could lead to a better understanding of timers in QEMU and CPU overhead.

---

It's time to close this. QEMU needs to offer accurate high-precision timers, so we cannot allow the host kernel scheduler to apply a lot of timer slack.
This does incur higher overhead than scheduling timers at a coarse granularity, but users have not complained about it, so I think we're making the right trade-off.
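For readers unfamiliar with the knob at the center of this bug, below is a minimal standalone sketch (illustrative code only, not part of QEMU or any patch posted here) showing how a process shrinks its timer slack with prctl(2), the same call the patch in comment 2 adds to init_clocks(), and reads the value back. Per prctl(2), the default slack is 50,000 ns, inherited from the parent process.

```c
/*
 * Minimal standalone sketch (illustrative, not QEMU source): set the
 * calling thread's timer slack to 1 ns with prctl(2) and read it back.
 */
#include <stdio.h>
#include <sys/prctl.h>

int main(void)
{
    /* The default slack, inherited from the parent, is typically 50000 ns. */
    long before = prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);

    /*
     * 1 ns slack effectively tells the kernel not to coalesce this
     * thread's hrtimers, so expirations fire as precisely as possible.
     * (A value of 0 would reset the slack to the default instead.)
     */
    if (prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0) != 0) {
        perror("prctl(PR_SET_TIMERSLACK)");
        return 1;
    }
    long after = prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);

    printf("timer slack: %ld ns -> %ld ns\n", before, after);
    return 0;
}
```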
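And a hedged sketch of why the trade-off described in the closing comment exists: with the default slack the kernel may coalesce a short sleep's expiry with other timers and wake the thread late, while 1 ns slack makes the wakeup precise at the cost of more timer interrupts. The program below is illustrative only; absolute numbers depend on kernel configuration, hardware, and load.

```c
/*
 * Hedged sketch of the precision-vs-overhead trade-off: measure how far
 * past its deadline a 100 us nanosleep() wakes up, before and after
 * shrinking the timer slack.
 */
#include <stdio.h>
#include <sys/prctl.h>
#include <time.h>

static long long oversleep_ns(void)
{
    struct timespec req = { .tv_sec = 0, .tv_nsec = 100 * 1000 }; /* 100 us */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    nanosleep(&req, NULL);        /* slack widens this timer's expiry window */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long long elapsed = (t1.tv_sec - t0.tv_sec) * 1000000000LL
                        + (t1.tv_nsec - t0.tv_nsec);
    return elapsed - req.tv_nsec; /* nanoseconds past the requested deadline */
}

int main(void)
{
    printf("default slack: %lld ns late\n", oversleep_ns());

    prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0); /* 1 ns, as in the QEMU patch */
    printf("1 ns slack:    %lld ns late\n", oversleep_ns());
    return 0;
}
```

On an idle host the first measurement is typically tens of microseconds late and the second far less; that gap is the timer precision QEMU buys for guest timekeeping, paid for with more hrtimer wakeups on the host.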