Bug 1141705 - performance bug with PR_SET_TIMERSLACK in qemu-timer.c
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Stefan Hajnoczi
QA Contact: Yanhui Ma
URL:
Whiteboard:
Depends On: 1104748
Blocks:
 
Reported: 2014-09-15 09:44 UTC by Xiaomei Gao
Modified: 2017-11-29 15:29 UTC (History)
12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-29 15:29:12 UTC
Target Upstream Version:
Embargoed:


Attachments

Comment 2 Quan Wenli 2014-09-15 10:05:19 UTC
Hi Stefan,

Could you take a look at this bug? Thanks.

Comment 3 Fam Zheng 2014-09-15 10:47:58 UTC
Maybe a timer slack of 1 nanosecond is too aggressive:

@@ -507,6 +511,9 @@ void init_clocks(void)
         vm_clock = qemu_clock_new(QEMU_CLOCK_VIRTUAL);
         host_clock = qemu_clock_new(QEMU_CLOCK_HOST);
     }
+#ifdef CONFIG_PRCTL_PR_SET_TIMERSLACK
+    prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0);
+#endif
 }
 
 uint64_t timer_expire_time_ns(QEMUTimer *ts)

Comment 4 Paolo Bonzini 2014-09-15 16:30:10 UTC
It would help to know which timers are triggering the most.  Also, please add the patch to 1.5.3 and see if it is also causing increased CPU usage there.

Comment 5 Stefan Hajnoczi 2014-09-18 15:20:38 UTC
(In reply to Paolo Bonzini from comment #4)
> It would help to know which timers are triggering the most.

Try:

$ cat /proc/timer_list

Then search for qemu-kvm to see which kernel timers QEMU has active.  Any timers with <10 milliseconds remaining are worth investigating.

Xiaomei, can you please collect this information while a guest is running?

Comment 10 Stefan Hajnoczi 2014-11-07 12:45:45 UTC
There is a high chance that this is related to bz#1104748.

Since you are using -drive ...,aio=native the fix for bz#1104748 should reduce QEMU hrtimer usage.  As a result you should see lower host CPU utilization.

Please retest with the patch from bz#1104748.

Comment 11 Xiaomei Gao 2014-11-10 06:30:34 UTC
(In reply to Stefan Hajnoczi from comment #10)
> There is a high chance that this is related to bz#1104748.
> 
> Since you are using -drive ...,aio=native the fix for bz#1104748 should
> reduce QEMU hrtimer usage.  As a result you should see lower host CPU
> utilization.
> 
> Please retest with the patch from bz#1104748.

Okay, once bz#1104748 is fixed, we will repeat the tests to see whether the issue is gone.

Comment 12 Quan Wenli 2014-11-12 07:33:07 UTC
(In reply to Stefan Hajnoczi from comment #10)
> There is a high chance that this is related to bz#1104748.
> 
> Since you are using -drive ...,aio=native the fix for bz#1104748 should
> reduce QEMU hrtimer usage.  As a result you should see lower host CPU
> utilization.
> 
> Please retest with the patch from bz#1104748.

Hi Stefan, bz#1104748 is the fix for the qemu-kvm component; bz#966398 is the corresponding bug for qemu-kvm-rhev, and bz#966398 is still in NEW status.

Comment 13 Stefan Hajnoczi 2014-11-17 17:19:23 UTC
(In reply to Quan Wenli from comment #12)
> (In reply to Stefan Hajnoczi from comment #10)
> > There is a high chance that this is related to bz#1104748.
> > 
> > Since you are using -drive ...,aio=native the fix for bz#1104748 should
> > reduce QEMU hrtimer usage.  As a result you should see lower host CPU
> > utilization.
> > 
> > Please retest with the patch from bz#1104748.
> 
> Hi, Stefan, bz#1104748 fixs in qemu-kvm component. bz#966398 is the one for
> qemu-kvm-rhev. and bz#966398 is still on new status.

The kernel fix in bz#1161535 should improve performance.  If you can test qemu-kvm-rhev on a host kernel with the bz#1161535 fix applied, then we'll know whether the problem has been fully resolved.

An alternative is to rerun the original comparison again, but with aio=threads instead of aio=native.  The aio=threads code path avoids hrtimer usage and should therefore not be affected by timer slack.  This way we'll have an idea of whether there are additional places affected by timer slack besides aio=native (fixed in bz#1161535).
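For reference, the two code paths under comparison differ only in the `-drive` aio option (illustrative command lines; the image name and other options are placeholders, and `aio=native` requires `cache=none`):

```shell
# aio=native: Linux AIO submission path, which (before the bz#1161535 fix)
# armed an hrtimer and is therefore sensitive to timer slack.
qemu-kvm -drive file=disk.img,if=virtio,cache=none,aio=native ...

# aio=threads: worker thread pool, no hrtimers on the I/O path.
qemu-kvm -drive file=disk.img,if=virtio,cache=none,aio=threads ...
```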

Comment 16 Stefan Hajnoczi 2015-08-03 14:43:57 UTC
This issue will require additional investigation.  Deferring to RHEL 7.3.

Comment 19 Stefan Hajnoczi 2016-01-29 14:10:48 UTC
Bumping to next release again.

Comment 20 Stefan Hajnoczi 2017-01-17 14:05:45 UTC
Low priority because I don't think a "fix" is possible.  Keeping it open because investigation could lead to a better understanding of timers in QEMU and CPU overhead.

Comment 21 Stefan Hajnoczi 2017-11-29 15:29:12 UTC
It's time to close this.  QEMU needs to offer accurate high-precision timers, so we cannot allow the host kernel scheduler to use a lot of timer slack.  This does incur higher overhead than scheduling timers with a coarse granularity, but users have not complained about this issue so I think we're making the right trade-off.

