Bug 1013225 - CONFIG_SCHEDSTATS isn't enabled due to performance impact (systemd-bootchart generates no bootcharts)
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1026506 1046021
Depends On:
Blocks:
Reported: 2013-09-28 13:56 UTC by Simon Gerhards
Modified: 2015-09-04 03:23 UTC

Fixed In Version: 4.2.0-1.fc23
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-09-04 03:23:40 UTC
Type: Bug
Embargoed:


Attachments
systemd-bootchart.strace (11.87 KB, text/plain), 2013-10-09 17:30 UTC, Artur Szymczak
cs.c (786 bytes, text/plain), 2015-08-25 17:20 UTC, Josh Poimboeuf
cs.c (667 bytes, text/plain), 2015-08-25 17:22 UTC, Josh Poimboeuf

Description Simon Gerhards 2013-09-28 13:56:18 UTC
Description of problem:
systemd-bootchart generates no (empty) bootcharts. The journal does not contain a message with MESSAGE_ID=9f26aa562cf440c2b16c773d0479b518.

Version-Release number of selected component (if applicable):
207-4.fc20

Steps to Reproduce:
1. Boot with init=/usr/lib/systemd/systemd-bootchart
2. Check the size of the generated svg in /run/log

Actual results:
The svg file is empty (size: 0 bytes).

Expected results:
A bootchart should have been generated and stored in /run/log.

Additional info:
I have reproduced this on a minimal install done from the F20 alpha netinstall iso.

Comment 1 Artur Szymczak 2013-10-09 17:27:58 UTC
I checked my system, and I have the same problem. But when I run systemd-bootchart from command line, then I see:
open /proc/schedstat: No such file or directory

The same message is shown during boot.
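
The error above suggests a quick pre-flight check before booting with systemd-bootchart. The following is a minimal sketch only; the function name check_schedstat is hypothetical, and the path is a parameter purely so the check can be pointed at a test file.

```shell
# Report whether the scheduler statistics file is readable.
# check_schedstat is a hypothetical helper name, not part of systemd.
# Usage: check_schedstat [PATH]   (PATH defaults to /proc/schedstat)
check_schedstat() {
    path="${1:-/proc/schedstat}"
    if [ -r "$path" ]; then
        echo "present: $path"
    else
        echo "missing: $path (kernel likely built without CONFIG_SCHEDSTATS)"
        return 1
    fi
}
```

Running check_schedstat with no argument on an affected kernel should take the "missing" branch, which matches the open() failure reported in this comment.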

Comment 2 Artur Szymczak 2013-10-09 17:30:33 UTC
Created attachment 810072
systemd-bootchart.strace

Comment 3 Harald Hoyer 2013-10-28 09:02:58 UTC
# fgrep CONFIG_SCHEDSTATS /boot/config-*
/boot/config-3.12.0-0.rc3.git5.2.fc21.x86_64:# CONFIG_SCHEDSTATS is not set
/boot/config-3.12.0-0.rc4.git1.2.fc21.x86_64:# CONFIG_SCHEDSTATS is not set
/boot/config-3.12.0-0.rc6.git0.2.fc21.x86_64:# CONFIG_SCHEDSTATS is not set
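
The fgrep above scans every installed kernel config. To look only at one config file, something like the following sketch may help. The function name kconfig_state is hypothetical, and the default path /boot/config-$(uname -r) is an assumption based on the file naming shown in the output above.

```shell
# Print how a kernel config option is set in a given config file.
# kconfig_state is a hypothetical helper name.
# Usage: kconfig_state OPTION [CONFIG_FILE]
kconfig_state() {
    option="$1"
    # Assumed default: the config file of the running kernel.
    config="${2:-/boot/config-$(uname -r)}"
    if grep -q "^${option}=" "$config"; then
        # Option is set (=y, =m, or a value): print the line as-is.
        grep "^${option}=" "$config"
    elif grep -q "^# ${option} is not set" "$config"; then
        echo "${option} is not set"
    else
        echo "${option} not found in ${config}"
    fi
}
```

For example, kconfig_state CONFIG_SCHEDSTATS would be expected to print "CONFIG_SCHEDSTATS is not set" on the kernels listed in this comment.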

Comment 4 Josh Boyer 2013-10-28 13:15:07 UTC
We switched to enabling that only on debug builds a while ago.  It seems that was turned off entirely with the final 3.11.0 build and has remained off since.  Internal testing shows the option has a non-trivial performance impact for context switches.

We can turn this on in debug kernels again, but I'm not sure it's worthwhile.  Given that there are other debug options enabled which slow things down even more, all a bootchart would show there is slow things being slow.  It isn't typically what someone wants to measure.

Comment 6 Josh Boyer 2013-11-04 20:26:25 UTC
*** Bug 1026506 has been marked as a duplicate of this bug. ***

Comment 7 Josh Boyer 2013-11-04 20:27:51 UTC
Josh P., can you elaborate a bit on some of the scheduling performance impacts you saw during your measurements?

Comment 8 Josh Poimboeuf 2013-11-04 21:36:35 UTC
In my tests I did a lot of context switches under various CPU loads.  I saw a ~5-10% drop in average context switch speed when CONFIG_SCHEDSTATS was enabled.  It varied depending on number of CPUs, CPU load, kernel version, and other kernel config options.

The performance hit only seemed to happen on post-CFS kernels (>= 2.6.23).  The previous O(1) scheduler didn't seem to have this issue.

Comment 9 Andi Kleen 2013-11-04 22:35:46 UTC
That's odd. I'll take a look. It shouldn't be that expensive.

Comment 10 Josh Boyer 2013-12-24 14:19:19 UTC
*** Bug 1046021 has been marked as a duplicate of this bug. ***

Comment 11 Jean-François Fortin Tam 2014-01-07 03:25:15 UTC
So how do I, as a user, turn this feature on at runtime (or boot-time)?

I need this to try bootcharting the GNOME login as part of https://bugzilla.gnome.org/show_bug.cgi?id=645756

Comment 12 Josh Boyer 2014-01-07 12:45:33 UTC
I don't believe it is something that can be enabled at runtime.  You'd need to rebuild a kernel with the config option set.

Comment 13 snark2004-first 2014-01-11 09:56:49 UTC
Having just been forced to compile a custom kernel myself, I consider this seriously retrograde. I happened to want latencytop, which I had used previously on Fedora.

What is the target audience of Fedora these days? The build creates the perf tools, but we don't have a low-latency kernel or latencytop. Where's the logic?
Decisions by the developers are making it hard to stay committed to Fedora.

Comment 14 john.haxby@oracle.com 2014-01-11 12:41:34 UTC
It's turned off because of the performance impact and, according to comment #9, under investigation.

Comment 15 Josh Boyer 2014-04-15 18:58:16 UTC
Has anyone actually gotten around to investigating why the performance impact of this option is noticeable?  Josh, Andi?

Comment 16 Andi Kleen 2014-04-15 19:46:03 UTC
Tim took a look I believe. Unfortunately nothing conclusive. Maybe we ran the wrong workload.

It would be good to have some function traces from a workload that shows slowdowns with it on.

Comment 17 Jean-François Fortin Tam 2014-10-14 04:19:20 UTC
Can you guys reevaluate this for Fedora 21 workstation?

It's quite frustrating to not be able to provide the requested information to upstream GNOME because of this; GNOME users get to pay the price of eternally bad login performance because the upstream issue cannot get investigated without this profiling information.

Comment 18 Eduard Vopicka 2014-12-04 10:27:28 UTC
This is just to note that the bug appears to still be there, i.e. booting with init=/usr/lib/systemd/systemd-bootchart leaves an empty svg file in /run/log. And yes, the "open /proc/schedstat: No such file or directory" message can be seen during boot.

Is there any chance that this will be fixed? My system boots extremely slowly for a reason that is unknown at this time, so the bootchart graph would help greatly.

Thanks,

Ed

Comment 19 Josh Boyer 2014-12-04 12:35:25 UTC
It's not a bug, it's a choice that's been made to disable the config option.  You can build a kernel with it set fairly easily if you need to.

Comment 20 john.haxby@oracle.com 2014-12-04 13:26:51 UTC
This is a bug though:

$ latencytop
Failed to open /proc/latency_stats: No such file or directory
Please enable the CONFIG_LATENCYTOP configuration in your kernel.

That was bug 1046021 which was closed as a duplicate of this.

One can also argue that systemd-bootchart not working is also a bug (though I'm not inclined to do so).

Isn't it time that CONFIG_SCHEDSTATS was looked at again?  And CONFIG_LATENCYTOP?  (I notice that CONFIG_LATENCYTOP=y for RHEL7.)

Perhaps suggestions as to what to test for (and how) so that people can report back here about the impact (if any)?

Comment 21 Josh Boyer 2014-12-04 14:02:33 UTC
(In reply to john.haxby from comment #20)
> This is a bug though:
> 
> $ latencytop
> Failed to open /proc/latency_stats: No such file or directory
> Please enable the CONFIG_LATENCYTOP configuration in your kernel.
> 
> That was bug 1046021 which was closed as a duplicate of this.
> 
> One can also argue that systemd-bootchart not working is also a bug (though
> I'm not inclined to do so).
> 
> Isn't it time that CONFIG_SCHEDSTATS was looked at again?  And
> CONFIG_LATENCYTOP?  (I notice that CONFIG_LATENCYTOP=y for RHEL7.)

I'm not aware of CONFIG_LATENCYTOP=y being set in the RHEL7 kernel.  If it is, it would select SCHEDSTATS and that would be enabled as well.  That would contradict the entire reasoning behind it being disabled in Fedora, given that it was found to cause the issues in RHEL.

Can you point me to which RHEL7 kernel RPM has LATENCYTOP enabled?

Josh, are you aware of any change on the RHEL side of things here?

Comment 22 Josh Poimboeuf 2014-12-04 14:46:32 UTC
(In reply to Josh Boyer from comment #21)
> Josh, are you aware of any change on the RHEL side of things here?

No. LATENCYTOP and SCHEDSTATS are (and always have been) both disabled on the RHEL7 production kernel.

They are, however, both enabled (and always have been) on the RHEL7 debug kernel.

Comment 23 john.haxby@oracle.com 2014-12-04 14:53:44 UTC
Sorry, eyes not tracking properly, too many kernels installed:

CONFIG_HAVE_LATENCYTOP_SUPPORT=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_LATENCYTOP is not set

That's the current RHEL7 kernel.   So far as I can tell, though, RHEL7 also ships latencytop.   Does this mean that latencytop is only intended to work with the debug kernel?  (In my experience, running the debug kernel to test for performance is not a good move, it affects performance too much.)

Comment 24 Eduard Vopicka 2014-12-04 15:23:50 UTC
One question is whether CONFIG_SCHEDSTATS and CONFIG_LATENCYTOP should be enabled by default.

In my opinion, the bigger problem with CONFIG_SCHEDSTATS and CONFIG_LATENCYTOP being disabled is that the end result is an empty bootchart.svg. The end result should instead be some explanatory message in bootchart.svg, like "Please recompile and reinstall your kernel with THIS and THAT option enabled." Or this requirement should be documented in the manpage, wiki, etc.

Ed

Comment 25 Jan Schreiber 2015-08-20 18:50:59 UTC
Hello folks!

I would like to revive this discussion about the decision to disable CONFIG_SCHEDSTATS.

I am a performance consultant, working on all sorts of commercial application performance issues. The _complete_ metrics under /proc/<PID>/task/<TID>/sched are, in my opinion, invaluable. 

One can immediately check for the severity of a CPU bottleneck, estimate IO waiting or the likelihood of priority inversion.

Other distributions still do provide these metrics, like e.g. SLES 11/12.

Could you please explain where you saw the performance impact with CONFIG_SCHEDSTATS enabled?

Thanks,

Jan
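
The per-thread files mentioned in this comment can be collected with a small loop. This is only a sketch: the function name dump_thread_sched is hypothetical, the optional second argument exists only so the loop can be exercised against a fixture directory, and which fields each sched file contains depends on the kernel configuration.

```shell
# Print the sched file of every thread (task) of one process.
# dump_thread_sched is a hypothetical helper name.
# Usage: dump_thread_sched PID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
dump_thread_sched() {
    pid="$1"
    root="${2:-/proc}"
    for f in "$root/$pid"/task/*/sched; do
        # Skip unreadable entries and the unexpanded glob when nothing matches.
        [ -r "$f" ] || continue
        echo "== $f"
        cat "$f"
    done
}
```

For instance, dump_thread_sched $$ would print one section per thread of the current shell, if the kernel exposes these files.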

Comment 26 Josh Poimboeuf 2015-08-25 17:20:25 UTC
Created attachment 1066951
cs.c

Jan,

Try the attached (crude) microbenchmark and run like this:

  perf stat -e cs ./cs 5 100

That will spawn 100 threads which call sched_yield() in a tight loop for 5 seconds.  I think perf will generally report fewer context switches when CONFIG_SCHEDSTATS is enabled.

That said, I'm not really convinced that this microbenchmark corresponds to a sane real world usage scenario.

Also, given the number of people who have complained about latencytop being disabled and systemd-bootchart being broken, it might not be worth the tradeoff.
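
When comparing the context-switch counts that perf stat reports on two kernels (one with CONFIG_SCHEDSTATS, one without), a relative difference is easier to read than two raw counts. The helper below is a sketch for that arithmetic only; the name pct_change is hypothetical, and the counts have to be copied by hand from the two perf runs.

```shell
# Percentage change from a baseline count to a new count, one decimal place.
# pct_change is a hypothetical helper name.
# Usage: pct_change BASELINE NEW
pct_change() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - base) * 100 / base }'
}
```

With made-up counts, pct_change 1000000 930000 prints -7.0, meaning 7% fewer context switches than the baseline run.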

Comment 27 Josh Poimboeuf 2015-08-25 17:22:24 UTC
Created attachment 1066952
cs.c

Comment 28 Andi Kleen 2015-08-25 17:41:45 UTC
FWIW we did some tests and couldn't measure a difference with CONFIG_SCHEDSTATS. Also, in theory the code shouldn't have much impact. Unless the impact can be measured in something a bit more macro, it would be good to consider re-enabling it, as it's very useful.

There's the concept of performance you're losing by not having the right tools to improve performance. And latencytop and other SCHEDSTATS based tools have a lot of potential here.

Comment 29 Jan Schreiber 2015-08-25 21:10:36 UTC
Thanks for these positive thoughts and feedback.

So how can we convince the decision makers at Red Hat to re-enable CONFIG_SCHEDSTATS?

As I am new to this whole process, I would like to mention that I am interested in RHEL 7.x onwards, not Fedora. Is this still the correct place to ask? Or would I need to file another ER?

Comment 30 Josh Boyer 2015-08-25 21:22:42 UTC
(In reply to Jan Schreiber from comment #29)
> Thanks for these positive thoughts and feedback.
> 
> So how can we convince the decision makers at Red Hat to re-enable
> CONFIG_SCHEDSTATS?

We can enable it in Fedora whenever.

> As I am new to this whole process, I would like to mention that I am looking
> for RHEL 7.x onwards. Not Fedora. Is this still the correct place to beg? Or
> would I need to file another ER?

You would need to file a bug against the RHEL7 kernel making that request.  I personally have no insight into how likely it will be granted.

Comment 31 Jan Schreiber 2015-08-25 21:47:42 UTC
Just filed https://bugzilla.redhat.com/show_bug.cgi?id=1256961.

Feel free to post your comments there, if your concern is the issue described above under RHEL 7.x.

Comment 32 snark2004-first 2015-08-25 23:25:30 UTC
Thanks for revitalising this Jan - I had given up in frustration. Couldn't believe it still wasn't enabled in F22.
For the devs, please consider this a vote for the change ASAP. Nice to see some potential for light at the end of the tunnel.

Comment 33 Josh Boyer 2015-08-26 12:46:45 UTC
I've enabled the options in f23 and rawhide.  The rc8-git1 kernel will have them set.

We'll look at the stable releases later.

Comment 34 snark2004-first 2015-08-27 03:13:38 UTC
Thank you Josh, that just dropped on the rawhide nodebug repo. _much_ better.

Comment 35 Fedora Update System 2015-09-01 15:00:00 UTC
kernel-4.2.0-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2015-14782

Comment 36 Fedora Update System 2015-09-01 20:22:07 UTC
kernel-4.2.0-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update kernel'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-14782

Comment 37 Fedora Update System 2015-09-04 03:23:25 UTC
kernel-4.2.0-1.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

