Bug 446944
| Field | Value |
|---|---|
| Summary | Max latency near 1ms encountered while running cyclictest |
| Product | Red Hat Enterprise MRG |
| Component | realtime-kernel |
| Version | beta |
| Hardware | All |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | urgent |
| Priority | urgent |
| Reporter | Clark Williams <williams> |
| Assignee | Steven Rostedt <srostedt> |
| CC | acme, bhu, jcm, lgoncalv, pzijlstr, srostedt |
| Target Milestone | 1.0 |
| Keywords | Reopened |
| Fixed In Version | 2.6.24.7-65.el5rt |
| Doc Type | Bug Fix |
| Last Closed | 2008-08-14 20:51:43 UTC |
| Bug Blocks | 447235 |
Description
Clark Williams
2008-05-16 18:17:01 UTC
Clark, try this:

    # sysctl kernel.sched_nr_migrate=2

And see if that solves your problem. With the new 2.6.24.7-rt7 coming out (with latency stealing), I was hitting that 900us+ latency all the time. Looking at ftrace, I found that the load balancing was taking up to 400us at a time. This is compounded on 4-way or larger boxes, where several CPUs can be running this balancing at the same time. My trace showed that the double_rq_lock took 424us!!! during this balance code, and that was because the CPU of the rq it was trying to grab was also trying to do balancing.

This needs to be noted (Lana ;-)

Never mind, I just hit the 900us+ latency with sched_nr_migrate=2. I'll look deeper into it.

What ftrace setup are you using to try and trap this thing? If you can post a setup here, I'll try and duplicate it and get a trace for you.

Elevating to blocker status.

Created attachment 306803: allow delay to preempt
It seems that the x86 delay uses the TSC, and Andrew Morton made the delay
non-preemptible to keep non-synced TSCs from causing problems on SMP. This
disabling of preemption caused us large latencies.
The patch breaks up this non-preemptible section and removes the 1ms latencies
that we are seeing.
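
To make the trade-off concrete, here is a toy Python simulation of a TSC-based delay that allows preemption between counter reads. It is only a sketch of the general idea, not the attached patch: `SimMachine`, `delay_tsc`, `delay_tsc_naive`, the per-CPU offsets, and the migration schedule are all hypothetical names and values invented for illustration. The assumption is that each CPU's TSC ticks at the same rate but starts from a different offset, so a delay that is migrated must re-read its baseline on the new CPU and carry over only the cycles it has already waited.

```python
class SimMachine:
    """Toy model (hypothetical): per-CPU TSCs tick at one rate but are unsynced."""

    def __init__(self, offsets, migrations):
        self.offsets = list(offsets)        # per-CPU TSC offset
        self.migrations = dict(migrations)  # true tick -> CPU the task moves to
        self.ticks = 0                      # true elapsed time, in cycles
        self.cpu = 0                        # CPU the task is currently on

    def read_tsc(self):
        """Read the current CPU's TSC; every read costs one cycle."""
        self.ticks += 1
        return self.ticks + self.offsets[self.cpu]

    def preempt_point(self):
        """Window with preemption enabled; the task may be migrated here."""
        self.cpu = self.migrations.get(self.ticks, self.cpu)


def delay_tsc(machine, loops):
    """Wait at least `loops` cycles, tolerating migration between CPUs."""
    cpu = machine.cpu
    start = machine.read_tsc()
    while True:
        now = machine.read_tsc()
        if now - start >= loops:
            break
        machine.preempt_point()          # higher-priority tasks could run here
        if machine.cpu != cpu:
            loops -= now - start         # keep credit for cycles already waited
            cpu = machine.cpu
            start = machine.read_tsc()   # new baseline from the new CPU's TSC
    return machine.ticks


def delay_tsc_naive(machine, loops):
    """Same loop without the re-baseline step; unsafe with unsynced TSCs."""
    start = machine.read_tsc()
    while True:
        now = machine.read_tsc()
        if now - start >= loops:
            break
        machine.preempt_point()
    return machine.ticks
```

With offsets such as `[0, 1_000_000, -500]` and migrations at ticks 10 and 20, `delay_tsc` still waits at least the requested 100 cycles, while `delay_tsc_naive` returns early as soon as it lands on the CPU whose counter is far ahead. That early return is the kind of problem that disabling preemption avoids, at the cost of the latency reported here.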

Changing status to MODIFIED to show that this bug is considered resolved, but needs testing. The -rt11 version will contain this patch.

kernel-rt-2.6.24.7-60.el5rt was built with -rt11 and is currently under test. Early signs show that the __delay fix has addressed this bug.

Fixed in GA.

Need to reopen and set state to Modified for 1.0.1 release.

Closing. Fixed in GA.