Bug 1737628
| Field | Value |
|---|---|
| Summary | [RHEL8] Tuned setting C-state0 instead of C-state1 |
| Product | Red Hat Enterprise Linux 8 |
| Component | tuned |
| Version | 8.0 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Type | Bug |
| Reporter | Joe Mario <jmario> |
| Assignee | Jaroslav Škarvada <jskarvad> |
| QA Contact | qe-baseos-daemons |
| CC | atheurer, djdumas, jeder, jmencak, jskarvad, knoel, krister, lcapitulino, olysonek, pezhang, psklenar, rhack, williams |
| Keywords | Patch, TestCaseProvided, Upstream |
| Flags | pm-rhel: mirror+ |
| Target Milestone | rc |
| Target Release | 8.0 |
| Fixed In Version | tuned-2.12.0-3.el8 |
| Last Closed | 2019-11-05 22:31:22 UTC |

Description
Joe Mario
2019-08-05 21:33:33 UTC
Created attachment 1600778
Exit latencies for given c-states on Intel cpus supported in RHEL-8

Problem statement:
Tuned lets users specify desired c-states via the "force_latency" knob in the tuned conf files. This happened to work on RHEL-7 and RHEL-6 when "force_latency=1" was used to select cstate1 (C1), but it no longer works on RHEL-8 due to a change in the kernel. That "force_latency=1" setting now causes C0 to be selected.

The RHEL-8 kernel change is intentional and desired (see https://bugzilla.redhat.com/show_bug.cgi?id=1737276). It will remain and will not be backported to RHEL-7.

We need to modify tuned and find a reliable way for users to specify desired cstate values. We could use the /sys area, for example from a Broadwell:

  /sys/devices/system/cpu/cpu0/cpuidle/state0/name:POLL
  /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
  /sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1
  /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
  /sys/devices/system/cpu/cpu0/cpuidle/state2/name:C1E
  /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
  /sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3
  /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:40
  /sys/devices/system/cpu/cpu0/cpuidle/state4/name:C6
  /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133

This should work on Intel and AMD cpus. I have not looked at other arches.

Any ideas for how this can best be handled?

We should also make fixing this a priority on RHEL-8. Right now users of various tuned profiles have their cpus running in C0, which means no turbo mode, higher power consumption, and hotter cpus (which get throttled down to cool them off).

Comment 3
Joe Mario

Hi Jaroslav:
Do you have any thoughts on moving this BZ forward?

Given we have customers who are already using "force_latency=<n>" in their private tuned profiles, how about the following?

a) We leave the existing "force_latency=<n>" in place.

b) We create a new cstate interface. For example: "cstate=<n>"
   Tuned then reads the appropriate /sys/devices/system/cpu/cpu0/cpuidle/state<n>/latency file to figure out what latency to specify.

c) We then switch all the Red Hat tuned profiles that use "force_latency=<n>" to use "cstate=<n>". It shouldn't be many.

d) This would only be for RHEL-8.

It is pretty important to resolve this quickly for RHEL-8.

Thoughts?

Thank you.
Joe

Comment 4
Jaroslav Škarvada

(In reply to Joe Mario from comment #3)

We have been thinking about it in the past. We could also extend the syntax to allow a C-state to be entered as a force_latency parameter, e.g.:

  force_latency=10   # 10 us
  ...
  force_latency=C1   # C1 or less

Comment 5
Jaroslav Škarvada

(In reply to Jaroslav Škarvada from comment #4)
> force_latency=C1 # C1 or less

  force_latency=C1       # for the state named C1
  force_latency=state1   # for what kernel thinks is state 1

Joe Mario

(In reply to Jaroslav Škarvada from comment #5)
<snip>
> force_latency=C1 # for the state named C1
> force_latency=state1 # for what kernel thinks is state 1

Hi Jaroslav:
I like that idea. If I understand correctly, there would be three options, and tuned would parse the value to determine which one is being used. E.g.:

  force_latency=10
  force_latency=C1
  force_latency=state1

This would be great. Thank you!

Jaroslav Škarvada

Upstream commit:
https://github.com/redhat-performance/tuned/commit/0ec40e036019c4c062d76a31d676565a09c615dd

Maximal latency can now be specified in multiple ways:
- directly in usec (this is the same as before), e.g. for 10 us:
    force_latency = 10
- as the ID of the maximal cstate allowed, e.g. for the kernel state1:
    force_latency = cstate.id:1
- as the name (case sensitive) of the maximal cstate allowed, e.g. for the state named C1:
    force_latency = cstate.name:C1

It is also possible to specify multiple fallback values separated by '|', e.g.:
    force_latency = cstate.name:C6|cstate.id:4|10

This will try to obtain the latency of the cstate named C6; if that fails (e.g. there is no such cstate), it will try kernel state4, and if that also fails it finally falls back to 10 us.
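Editorial sketch: the fallback resolution described above could look roughly like the following Python. This is not the code from the upstream commit; the function names `resolve_force_latency`, `resolve_one` and `_read_int` are hypothetical, and the sysfs directory is a parameter so the logic can be tried against a copy of the tree. It assumes the `state<n>/name` and `state<n>/latency` layout shown in the description.

```python
import os

# Assumption: cpuidle data is read from CPU0, using the sysfs layout
# quoted in the description.
DEFAULT_CPUIDLE_DIR = "/sys/devices/system/cpu/cpu0/cpuidle"


def _read_int(path):
    """Return the integer stored in a file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def resolve_one(spec, cpuidle_dir):
    """Resolve a single specifier to a latency in us, or None on failure."""
    spec = spec.strip()
    if spec.startswith("cstate.id:"):
        state_id = spec[len("cstate.id:"):]
        if not state_id.isdigit():
            return None
        return _read_int(os.path.join(cpuidle_dir, "state" + state_id, "latency"))
    if spec.startswith("cstate.name:"):
        wanted = spec[len("cstate.name:"):]
        try:
            entries = sorted(os.listdir(cpuidle_dir))
        except OSError:
            return None
        for entry in entries:
            try:
                with open(os.path.join(cpuidle_dir, entry, "name")) as f:
                    name = f.read().strip()
            except OSError:
                continue
            if name == wanted:  # case-sensitive match, as described above
                return _read_int(os.path.join(cpuidle_dir, entry, "latency"))
        return None
    try:
        return int(spec)  # plain number: latency given directly in us
    except ValueError:
        return None


def resolve_force_latency(value, cpuidle_dir=DEFAULT_CPUIDLE_DIR):
    """Try each '|'-separated alternative in order; return the first that resolves."""
    for spec in value.split("|"):
        latency = resolve_one(spec, cpuidle_dir)
        if latency is not None:
            return latency
    return None


if __name__ == "__main__":
    print(resolve_force_latency("cstate.name:C6|cstate.id:4|10"))
```

On the Broadwell listing from the description, "cstate.name:C6|cstate.id:4|10" would resolve through the first alternative (C6, 133 us). On a machine without a matching cpuidle entry, the literal 10 is the value that gets used.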
The upstream commit also changed the force_latency setting of the latency-performance profile to:
    force_latency=cstate.id:1|1

I.e. it tries kernel state1 and falls back to 1 us. We could use 'cstate.name:C1' to explicitly specify 'C1', but I think using the kernel ID 'state1' is more generic: it means the second C-state, regardless of its name. Tuned obtains the latency information from CPU0.

Comment 12
Joe Mario

Thanks Jaroslav:
This is great. Thank you for getting to it so quickly.

Clark and Luiz:
The realtime/tuned.conf file includes from network-latency, which Jaroslav is fixing as part of this BZ.

If anyone knows any other profiles that would need to be explicitly changed from "force_latency=1" to "force_latency=cstate.id:1", please holler.

Jaroslav:
Is there anything you need from me in order to get this into RHEL-8.1?

Joe

Comment 13
Jaroslav Škarvada

(In reply to Joe Mario from comment #12)
> If anyone knows any other profiles that would need to be explicitly changed
> from "force_latency=1" to "force_latency=cstate.id:1", please holler.

Regarding upstream Tuned profiles there are also:
  sap-hana - setting force_latency directly to 70 us
  virtual-host - setting force_latency directly to 70 us

As nobody complained and I cannot directly match the value with a specific C-state (it's probably C3 max and not C4 and higher states, so it should work as it is), I didn't touch them.

> Jaroslav:
> Is there anything you need from me in order to get this into RHEL-8.1 ?

I think we are set up. I am now writing a Beaker test. The errata will be created soon.

Comment 14
Joe Mario

Hi Jaroslav:
I remember when Dave Dumas (cc'd) and I worked with SAP, they said their testing showed better hana performance with C3 than with C1. They identified a force_latency value of 70 as getting them C3. If you can change that to cstate.id:3, that would be great.

I suspect virtual-host wanted cstate3 as well. I've cc'd Andrew Theurer to see if he can confirm.

Thank you.
Joe

Jaroslav Škarvada

(In reply to Joe Mario from comment #14)

NP, I changed both to:
    force_latency=cstate.id:3|70

Regarding virtual-host, the comment in the profile says:
    # Setting C3 state sleep mode/power savings

So it seems C3 was intended.

Comment 22
Joe Mario

Hi Jaroslav:
I wonder if you have a minor bug in the recent fix. It's not causing any problem, but it might.

Looking at a RHEL-8.1 system, I see:
  # grep force_latency /lib/tuned/*/tuned.conf
  /lib/tuned/latency-performance/tuned.conf:force_latency=cstate.id:1|1
  /lib/tuned/virtual-host/tuned.conf:force_latency=cstate.id:3|70

The force_latency values for virtual-host look fine.
But shouldn't the force_latency value for latency-performance be "cstate.id:1|2" instead of "cstate.id:1|1"?

Thank you.
Joe

Comment 23
Jaroslav Škarvada

(In reply to Joe Mario from comment #22)

I wanted to stay backward compatible. But looking at the table from comment 1, it seems the worst exit latency for C1 is 3 us, so shouldn't it be 3 us then? I.e.:
    cstate.id:1|3

When we come to a conclusion on this I can fix it upstream, and it will get to RHEL with the next rebase. I don't think it needs to be fixed immediately, e.g. by a respin.

Comment 24
Joe Mario

Hi Jaroslav:
I agree with you that setting the value to 3 is better than 2.
I also agree that a respin is not needed.

Thank you.
Joe

Jaroslav Škarvada

(In reply to Joe Mario from comment #24)

Upstream commit:
https://github.com/redhat-performance/tuned/commit/252bd91ed0deeec5caf1d2a01c379145833707b7

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3633
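
Editorial sketch: the discussion about the numeric fallback (1 vs. 2 vs. 3 us) can be illustrated with a simplified model. It assumes that a C-state is usable only when its exit latency does not exceed the requested limit, and it uses the Broadwell values from the description. The real kernel governor logic is more involved, and the table and the function name `deepest_allowed_state` are illustrative only.

```python
# (name, exit latency in us), taken from the Broadwell sysfs listing in the
# description; ordered from shallowest to deepest state.
BROADWELL_STATES = [("POLL", 0), ("C1", 2), ("C1E", 10), ("C3", 40), ("C6", 133)]


def deepest_allowed_state(limit_us, states=BROADWELL_STATES):
    """Return the deepest state whose exit latency is within limit_us.

    Simplified model (an assumption, not the kernel's actual algorithm):
    a state is eligible when its exit latency <= the latency limit.
    """
    allowed = [name for name, latency in states if latency <= limit_us]
    return allowed[-1] if allowed else None


for limit in (1, 2, 3, 70):
    print(limit, deepest_allowed_state(limit))
# 1 POLL
# 2 C1
# 3 C1
# 70 C3
```

Under this model a limit of 1 us leaves only POLL (C0) on this CPU, which matches the symptom in the problem statement, while 70 us admits C3 but not C6, in line with the sap-hana and virtual-host discussion.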