LTC Owner is: jstultz.com
LTC Originator is: ankigarg.com

This bug tracks the status of rtcheck functionality on RHEL5 RT. rt_check has been integrated into the Red Hat test environment, and some failures have been seen there.

The JVM currently verifies that it can run in real-time mode each time it starts up by running the rtcheck utility. rtcheck returns 0 on success and 1 on failure. A user can run 'rtcheck -v' to get a verbose report of the tests it runs.

Tim Burke has reported some failures when running rt_check, and we are working on resolving them. If this bug gets mirrored, we can coordinate our efforts more easily. The discussion is happening on the rhel-rt-external list.
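For illustration, a caller could consume the exit-status contract described above (0 on success, 1 on failure) roughly as follows. This is a minimal sketch, not the JVM's actual startup code: the helper names run_and_get_status and realtime_environment_ok are hypothetical, and the command string is passed in by the caller so that nothing is assumed about where rtcheck is installed.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run a command through the shell and return its exit status,
 * or -1 if it could not be started or did not exit normally. */
static int run_and_get_status(const char *command)
{
    int rc = system(command);

    if (rc == -1 || !WIFEXITED(rc))
        return -1;
    return WEXITSTATUS(rc);
}

/* Hypothetical caller: nonzero only when the checker exits with 0.
 * Any other status, including "could not run", counts as failure. */
static int realtime_environment_ok(const char *rtcheck_command)
{
    return run_and_get_status(rtcheck_command) == 0;
}
```

Treating "could not run the checker" the same as a failed check is a design choice made for this sketch; a real caller might want to report the two cases separately.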
Created attachment 152932 [details] rtcheck.c testcase
*** Bug 236425 has been marked as a duplicate of this bug. ***
----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-04-20 16:33 EDT -------
Darren requested a summary of what rtcheck.c actually checks, so here it is:

1) User ability (via pam config) to run sched_setscheduler w/ SCHED_RR (implicitly also SCHED_FIFO)
2) User ability to mlock ~32k of memory.
3) RLIMIT_MEMLOCK == RLIM_INFINITY
4) The pthread_mutex_init, pthread_mutexattr_getprotocol, and pthread_mutexattr_setprotocol symbols exist, implying robust mutex support.
5) clock_getres returns finer than 200us resolution.

Currently rtcheck also looks for a specific kernel version, but provided the above, the kernel version check is not necessary.
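Items 1, 2, 3 and 5 of the summary above can be sketched with standard Linux/POSIX calls. This is an illustrative sketch under stated assumptions, not the code in the attached rtcheck.c: all function and macro names are hypothetical, CLOCK_MONOTONIC is an assumed choice of clock (the summary does not say which one is queried), and the mlock test uses an anonymous mapping so the region is page-aligned.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <time.h>

#define MLOCK_TEST_BYTES (32 * 1024)  /* the "~32k" from item 2 */
#define MAX_CLOCK_RES_NS 200000L      /* 200us, from item 5 */

/* Item 1: try to switch to SCHED_RR, then restore the previous policy. */
static int can_use_sched_rr(void)
{
    struct sched_param old_param, rr_param;
    int old_policy = sched_getscheduler(0);

    if (old_policy == -1 || sched_getparam(0, &old_param) == -1)
        return 0;
    rr_param = old_param;
    rr_param.sched_priority = sched_get_priority_min(SCHED_RR);
    if (sched_setscheduler(0, SCHED_RR, &rr_param) == -1)
        return 0;                     /* typically EPERM for this user */
    sched_setscheduler(0, old_policy, &old_param);
    return 1;
}

/* Item 2: try to lock a page-aligned region of the given size. */
static int can_mlock(size_t bytes)
{
    void *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int ok;

    if (buf == MAP_FAILED)
        return 0;
    ok = (mlock(buf, bytes) == 0);
    munmap(buf, bytes);               /* unmapping also drops the lock */
    return ok;
}

/* Item 3: is the soft RLIMIT_MEMLOCK unlimited? */
static int memlock_is_unlimited(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) == -1)
        return 0;
    return lim.rlim_cur == RLIM_INFINITY;
}

/* Item 5: clock resolution in nanoseconds, or -1 on error. */
static long clock_resolution_ns(clockid_t clk)
{
    struct timespec res;

    if (clock_getres(clk, &res) == -1)
        return -1;
    return (long)res.tv_sec * 1000000000L + res.tv_nsec;
}

static int resolution_is_acceptable(long res_ns, long max_ns)
{
    return res_ns > 0 && res_ns <= max_ns;
}
```

A caller would combine these as, for example, resolution_is_acceptable(clock_resolution_ns(CLOCK_MONOTONIC), MAX_CLOCK_RES_NS). Items 1 to 3 depend on the calling user's limits, so their results differ between users on the same machine.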
We need more help here on this rtcheck thing. We can see what it's doing, but in its current state we aren't convinced it's very useful. We need to identify, for each of the failures, whether it's a valid test to include in the generic RHEL-RT product. For example, by default RHEL has a resource limit of less than 32K for mlock, and as it stands we aren't changing the default for such configs.

I don't think rtcheck as it stands now does much that is useful, and almost everything fails. For each failure we should identify whether it really indicates a *product* bug. If so, we should discuss the individual issues and create separate bugzillas to track what needs followup. If any of these are not really appropriate for the generic product set, then there's no need to include the test case.
Added the new version of rtcheck.c to RHTS. This version still fails. Here are the test results:

Test Start Time: Fri May 11 19:01:59 EDT 2007
RTCheck v0.5 - Linux Real-time Environment Checker
--------------------------------------------------
Trying to lock memory: failed
RLIMIT_MLOCK is 32768
Trying to request real-time scheduling: ok
Checking for out-of-tree RT extensions: ok
Checking for robust (PI) mutex kernel support: ok
Checking for robust (PI) mutex glibc support: ok
Checking for high resolution timers: ok
Testing for acceptable clock resolution (<=200us): ok
Some tests failed, exiting with status 1
rt_check Failed:
Test End Time: Fri May 11 19:01:59 EDT 2007

My question is this: unless I enable the rmem command line option, this will always fail. Currently we test with defaults in RHTS. Since the default is the opposite of the behavior the test is looking for, it will always fail.

So my only options are:
A.) Add the command line option and test non-default options.
B.) Stop running a test that will always fail.
------- Additional Comments From jekacur.com 2007-05-14 09:46 EDT -------
(In reply to comment #16)
> ----- Additional Comments From jburke 2007-05-14 08:41 EST -------
> Added the new version of rtcheck.c to RHTS. This version still fails.
> Here are the test results:
>
> Test Start Time: Fri May 11 19:01:59 EDT 2007
> RTCheck v0.5 - Linux Real-time Environment Checker
> --------------------------------------------------
> Trying to lock memory: failed
> RLIMIT_MLOCK is 32768
> Trying to request real-time scheduling: ok
> Checking for out-of-tree RT extensions: ok
> Checking for robust (PI) mutex kernel support: ok
> Checking for robust (PI) mutex glibc support: ok
> Checking for high resolution timers: ok
> Testing for acceptable clock resolution (<=200us): ok
> Some tests failed, exiting with status 1
> rt_check Failed:
> Test End Time: Fri May 11 19:01:59 EDT 2007
>
> My question is this: unless I enable the rmem command line option, this will
> always fail. Currently we test with defaults in RHTS. Since the default is the
> opposite of the behavior the test is looking for, it will always fail.
>
> So my only options are:
> A.) Add the command line option and test non-default options.
> B.) Stop running a test that will always fail.
>
>
> --

Are you sure you're a member of the realtime group? In /etc/security/limits.conf we add the following (note the memlock line):

@realtime soft cpu unlimited
@realtime - rt_priority 100
@realtime - nice 40
@realtime - memlock unlimited

Root will actually fail the memlock test unless root is a member of the realtime group. I like this: it means that in some respects (but not all) you have more power as a regular user belonging to group realtime than as root.

In any case, my experience is that you get the memlock failure if you're not a member of the realtime group.
Created attachment 154753 [details] rtcheck.c
----- Additional Comments From chavezt.com (prefers email at tinytim.com) 2007-05-15 13:27 EDT -------
An updated rtcheck.c that relies on the /proc interface

This is not the final form of rtcheck.c. There is some discussion now about turning it into an init script and removing the dependency on /proc by performing other tests. More on this soon.
----- Additional Comments From chavezt.com (prefers email at tinytim.com) 2007-05-22 17:47 EDT ------- Oops, got trigger happy... anyway... If the /proc interface is enabled then the boot_id and result of the tests are written to the /var/cache/rtcheck file, for example. The "rtcheck" program will then rely on this data to return "True" or "False". If the /proc interface is not enabled, rather than running each test over again per-instantiation of the "rtcheck" program, another method can be devised to determine that the test results in the /var/cache/rtcheck file are valid (e.g. file creation time vs. uptime). One other thing that I forgot to mention in the last comment: the init script will be run from the "rc.local" file. However, I need to make sure that this is a distro-neutral approach.
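The boot_id-based cache validation described in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the rtcheck implementation: the thread does not specify the layout of /var/cache/rtcheck, so a single line of the form "<boot_id> <status>" is assumed here, and all function names are hypothetical. A return value of -1 means "no valid cache, rerun the tests".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BOOT_ID_PATH "/proc/sys/kernel/random/boot_id"
#define BOOT_ID_MAX  64

/* Read the first line of a file into buf, without the trailing newline.
 * Intended for BOOT_ID_PATH. Returns 0 on success, -1 on error. */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *fp;

    assert(path != NULL && buf != NULL && len > 0);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)len, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Store "<boot_id> <status>" (assumed layout). Returns 0 on success. */
static int write_cache(const char *cache_path, const char *boot_id, int status)
{
    FILE *fp = fopen(cache_path, "w");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fprintf(fp, "%s %d\n", boot_id, status) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Return the cached status if the cache was written during the boot
 * identified by current_boot_id; otherwise return -1. */
static int cached_status(const char *cache_path, const char *current_boot_id)
{
    char cached_id[BOOT_ID_MAX];
    int status, fields;
    FILE *fp = fopen(cache_path, "r");

    if (fp == NULL)
        return -1;                    /* cache file does not exist */
    fields = fscanf(fp, "%63s %d", cached_id, &status);
    fclose(fp);
    if (fields != 2 || strcmp(cached_id, current_boot_id) != 0)
        return -1;                    /* malformed, or from an earlier boot */
    return status;
}
```

In use, the current boot_id would be obtained with read_first_line(BOOT_ID_PATH, ...) and passed to cached_status(); when that returns -1, the tests are rerun and the result is stored with write_cache().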
----- Additional Comments From chavezt.com (prefers email at tinytim.com) 2007-05-22 17:41 EDT -------
The new "rtcheck" design will be in the form of an init script. However, I'm not sure I want to place this in /etc/init.d, as this will not technically be a service but something that runs once on boot. The "rtcheck" program will write its return code and a boot_id to a file, /var/cache/rtcheck for instance. If the boot_id in this file is not the same as the one in /proc/sys/kernel/random/boot_id, the "rtcheck" program will be rerun. If the /proc interface is not available, the "rtcheck" program will rerun every test upon each instantiation.

To test for the environment, the following tests will be done:

Memory Lock: Verify user ability to mlock ~32K of memory. Is this actually a real-time requirement or a JVM requirement? If it's a JVM requirement, this should probably be removed from this program.

Scheduler: Exercise the scheduler API to determine whether it supports real-time, namely by setting the scheduler to RR (which implies SCHED_FIFO).

CONFIG_PREEMPT_RT: Check to see if IRQ handlers are threads; if so, we're running CONFIG_PREEMPT_RT.

Robust Mutexes: Rather than looking up symbols, actually make the calls. This test should tell us if they exist both in glibc and in the kernel. This avoids having to use /proc/kallsyms or another system map to determine if the interface exists for the running kernel.

High Res Timers: Anyone have ideas on this one? John? Presumably we'll want to sample some bogus work and take the average? What does the threshold need to be? Should this be tunable? One real-time app may require a higher resolution than another.

Clock Resolution: Make sure the clock resolution is under ~200us. Should this also be tunable? Or is this an acceptable limit for any real-time application? Currently we do this test simply by calling "clock_getres". This seems to be a suitable way to test this.
Created attachment 155906 [details] rtcheck-0.6-4pre1.tar.bz2
----- Additional Comments From chavezt.com (prefers email at tinytim.com) 2007-06-01 12:40 EDT ------- Latest version of rtcheck as of June 1st, 2007 This attachment contained the latest version of the 'rtcheck' program. See the CHANGELOG for details on what's new.
Created attachment 156473 [details] rtcheck-0.6-4.tar.bz2
----- Additional Comments From chavezt.com (prefers email at tinytim.com) 2007-06-07 11:35 EDT ------- Latest version of rtcheck as of June 7th, 2007 (now being packaged by Red Hat)
Attaching the first attempt at packaging rtcheck. Please consider renaming the tarball to rtcheck-0.6.4.tar.bz2 and the directory to rtcheck-0.6.4, i.e. without the second dash, to ease packaging: rpm expects the dash to introduce the %{release}, not to be part of the %{version}.
Created attachment 156511 [details] rtcheck specfile
Created attachment 156512 [details] rtcheck initscript It uses -v, redirecting it to /var/log/rtcheck so that we can check the results when trying to diagnose problems.
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|RH236962 - [FOCUS][Java     |RH236962 - [Java Blocker]
                   |Blocker] rtcheck on RHEL5 RT|rtcheck on RHEL5 RT

------- Additional Comments From dvhltc.com 2007-06-07 19:49 EDT -------
Removing [FOCUS] as we are just waiting on packaging now.
Using the latest version, here is the result. I am not sure it is worth running this in the kernel testing. It looks as if it will fail every time.

RTCheck v0.6 - Linux Real-time Environment Checker
--------------------------------------------------
Global tests:
Retrieving cache validator: ok
Validating cache /var/cache/rtcheck: failed
cache file does not exist
Using cached result status: failed
result status is: -1
Re-running tests...
Checking for out-of-tree RT extensions: ok
Checking for robust (PI) mutex support: ok
Testing for acceptable hrtimer resolution (<=20us): ok
Testing for acceptable clock resolution (<=200us): ok
Caching results in /var/cache/rtcheck: ok
User-specific tests:
Trying to lock memory: failed
RLIMIT_MLOCK is 32768
Trying to request real-time scheduling: ok
Some tests failed, exiting with status 1
rt_check Failed:
Jeff, see comment #6. If you put root in the "realtime" group and change /etc/security/limits.conf as described in that comment, it should pass every time. If it then fails, that will even be helpful to us, because it tells us something is wrong with the kernel or some other RHEL5-RT component that makes the whole system not realtime capable.

Perhaps the test should be run, at initscript time, as a different user, one that is in the "realtime" group and is representative of typical realtime tasks running on the system?
----- Additional Comments From mauery.com (prefers email at vernux.com) 2007-06-18 10:23 EDT -------
The whole reason we separated out the user-specific stuff is that it doesn't get cached. Only the system characteristics (such as -RT kernel, high res timers, etc.) get cached. mlock and scheduler are user-permission dependent and are therefore not cached. This means that you can run it as root at boot time and cache the results, and then run it later as a user in the realtime group and reap the benefits of the cached results without skewing the test results. You should only need to add root to the realtime group if that is the user you are running your tests as.
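The split between cached system-wide results and uncached user-specific tests can be sketched as follows. This is a simplified model of the behaviour described in this comment, not rtcheck's code: the struct, the function names and the stand-in check functions are all hypothetical, and a cached status of -1 is assumed to mean "no valid cache".

```c
#include <assert.h>
#include <stddef.h>

typedef int (*check_fn)(void);      /* returns 0 on pass, 1 on failure */

struct rtcheck_run {
    int cached_system_status;       /* -1: no valid cache; otherwise 0 or 1 */
    check_fn run_system_checks;     /* -RT kernel, timers, ...: cacheable */
    check_fn run_user_checks;       /* mlock, scheduler: never cached */
};

/* System checks run only when there is no valid cached result;
 * user checks run on every invocation, for the calling user. */
static int rtcheck_status(struct rtcheck_run *run)
{
    int user_status;

    assert(run != NULL);
    assert(run->run_system_checks != NULL && run->run_user_checks != NULL);

    if (run->cached_system_status < 0)
        run->cached_system_status = run->run_system_checks();
    user_status = run->run_user_checks();

    return (run->cached_system_status != 0 || user_status != 0) ? 1 : 0;
}

/* Stand-in checks for demonstration only. */
static int system_check_calls;
static int system_checks_pass(void) { system_check_calls++; return 0; }
static int user_checks_pass(void)   { return 0; }
static int user_checks_fail(void)   { return 1; }
```

With these stand-ins, a boot-time run as root (user checks failing) still leaves a usable cached system result, and a later run as a realtime-group user reuses it without rerunning the system checks.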
Understood. The rpm was packaged to take advantage of that: it runs rtcheck at boot time, caching the results. It also runs in verbose mode so that the admin can see in the /var/log/rtcheck file whether some test failed during the last boot.

It's just that it is now run as root, and root is not in the "realtime" group, so some of the user-specific tests fail. This is not a problem, but it may confuse some admins. So the suggestion is that the initscript run the tests as a non-root user that is in the "realtime" group by default. This way the admin will know that all tests are passing for a user with the typical "realtime" profile.

Another way is to not run the user-specific tests in the initscript at all. But I think that testing with a default user in the "realtime" group would verify both the cacheable and the noncacheable parts for at least one user at system boot time, which makes for a more comprehensive test.
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|FIXEDAWAITINGTEST           |TESTED

------- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-07-16 16:19 EDT -------
The rtcheck package is now in the partner repo. Looks like this can be closed. One iffy bit is that rtcheck is in /sbin/, so it's not in the normal path. We'll have to see whether the JVM needs to include that directory or whether we need a softlink. I'll open a new bug if necessary.
Closing this now that rtcheck is in the yum repo. There's still the matter of /sbin vs /usr/sbin, but that's being tracked by a separate bug, #248557.