Bug 236962

Summary: [JAVA_BLOCKER] rtcheck on RHEL5 RT
Product: Red Hat Enterprise MRG
Reporter: IBM Bug Proxy <bugproxy>
Component: realtime-kernel
Assignee: Arnaldo Carvalho de Melo <acme>
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: medium
Version: 1.0
CC: jburke, williams
Hardware: i386
OS: Linux
Fixed In Version: early july
Doc Type: Bug Fix
Last Closed: 2007-07-19 19:26:06 UTC
Attachments:
Description                  | Flags
rtcheck.c testcase           | none
rtcheck.c                    | none
rtcheck-0.6-4pre1.tar.bz2    | none
rtcheck-0.6-4.tar.bz2        | none
rtcheck specfile             | none
rtcheck initscript           | none

Description IBM Bug Proxy 2007-04-18 17:39:43 UTC
LTC Owner is: jstultz.com
LTC Originator is: ankigarg.com


This bug is to track the progress/status of rtcheck functionality on RHEL5 RT.

rt_check has been integrated into the Red Hat test environment, and some
failures have been seen.

The JVM currently checks that it can run in real-time mode each time it starts
up by running the rtcheck utility.  rtcheck returns 0 on success and 1 on
failure.  A user can run 'rtcheck -v' to get a verbose report of the tests it
runs.  There are some failures and we are working on resolving them.

Tim Burke has reported some failures when running rt_check, so if this bug
gets mirrored, we can easily coordinate our efforts.

The discussion on this is happening on the rhel-rt-external list.

Comment 1 IBM Bug Proxy 2007-04-18 17:39:43 UTC
Created attachment 152932 [details]
rtcheck.c testcase

Comment 2 Tim Burke 2007-04-19 01:58:00 UTC
*** Bug 236425 has been marked as a duplicate of this bug. ***

Comment 3 IBM Bug Proxy 2007-04-20 20:36:25 UTC
----- Additional Comments From jstultz.com (prefers email at johnstul.com)  2007-04-20 16:33 EDT -------
Darren requested a summary of what rtcheck.c actually checks, so here it is:
1) User ability (via pam config) to run sched_setscheduler w/ SCHED_RR
(implicitly also SCHED_FIFO)
2) User ability to mlock ~32k of memory.
3) RLIMIT_MEMLOCK == RLIM_INFINITY
4) The pthread_mutex_init, pthread_mutexattr_getprotocol, and
pthread_mutexattr_setprotocol symbols exist, implying robust mutex support.
5) clock_getres returns finer than 200us resolution.

Currently rtcheck also looks for a specific kernel version, but provided the
above, the kernel version check is not necessary. 
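
For reference, checks 1 through 4 above can be approximated in a few lines. This is only a rough Python sketch, not the code from rtcheck.c: the function names, the priority value of 1, and the use of ctypes to reach mlock and to look up the pthread symbols are all assumptions made for illustration. It assumes a Linux system.

```python
import ctypes
import os
import resource

LOCK_SIZE = 32 * 1024  # ~32k, as in item 2


def can_set_sched_rr(priority=1):
    """Item 1: try to switch to SCHED_RR, then restore the previous policy."""
    old_policy = os.sched_getscheduler(0)
    old_param = os.sched_getparam(0)
    try:
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
    except OSError:
        return False  # typically EPERM when the user lacks the privilege
    os.sched_setscheduler(0, old_policy, old_param)
    return True


def can_mlock(size=LOCK_SIZE):
    """Item 2: try to mlock a buffer of the given size, then unlock it."""
    libc = ctypes.CDLL(None, use_errno=True)
    buf = ctypes.create_string_buffer(size)  # note: not page-aligned
    if libc.mlock(buf, ctypes.c_size_t(size)) != 0:
        return False
    libc.munlock(buf, ctypes.c_size_t(size))
    return True


def memlock_unlimited():
    """Item 3: is the soft RLIMIT_MEMLOCK equal to RLIM_INFINITY?"""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft == resource.RLIM_INFINITY


def has_mutex_protocol_symbols():
    """Item 4: are the pthread mutex protocol symbols visible to this process?"""
    libc = ctypes.CDLL(None)
    names = ("pthread_mutex_init",
             "pthread_mutexattr_getprotocol",
             "pthread_mutexattr_setprotocol")
    return all(hasattr(libc, name) for name in names)


if __name__ == "__main__":
    print("SCHED_RR allowed:      ", can_set_sched_rr())
    print("mlock of ~32k allowed: ", can_mlock())
    print("RLIMIT_MEMLOCK is inf: ", memlock_unlimited())
    print("mutex protocol symbols:", has_mutex_protocol_symbols())
```

Because mlock works on whole pages, a 32k buffer that is not page-aligned can touch slightly more than 32k of memory, so a limit of exactly 32768 may be too small for this sketch.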

Comment 4 Tim Burke 2007-04-26 22:33:54 UTC
We need more help here on this rtcheck thing.  We can see what it's doing, but
in its current state we aren't convinced it's very useful.  We need to identify,
for each of the failures, whether it's a valid test to include in the generic
RHEL-RT product.  For example, by default, RHEL has a resource limit of less
than 32K for mlock.  As it stands, we aren't changing the default for such configs.

I don't think rtcheck as it stands now does much that is useful, and almost
everything fails.  For the failures, we should identify whether they really
indicate *product* bugs.  If so, then we should discuss the individual issues
and create separate bugzillas to track what needs follow-up.

Whereas if any of these are not really appropriate for the generic product set,
then there's no need to include the test case.



Comment 5 Jeff Burke 2007-05-14 12:41:33 UTC
Added the new version of rtcheck.c to RHTS results.  This version still fails.
Here are the test results:

Test Start Time: Fri May 11 19:01:59 EDT 2007
RTCheck v0.5 - Linux Real-time Environment Checker
--------------------------------------------------
Trying to lock memory: failed
	RLIMIT_MLOCK is 32768
Trying to request real-time scheduling: ok
Checking for out-of-tree RT extensions: ok
Checking for robust (PI) mutex kernel support: ok
Checking for robust (PI) mutex glibc support: ok
Checking for high resolution timers: ok
Testing for acceptable clock resolution (<=200us): ok
Some tests failed, exiting with status 1
rt_check Failed: 
Test End Time: Fri May 11 19:01:59 EDT 2007

My question is this: unless I enable the rmem command line option, this will
always fail.  Currently we test with defaults in RHTS.  Since the default is the
opposite of the behavior the test is looking for, it will always fail.

So my only options are:
A.) Add the command line option and test non-default options.
B.) Stop running a test that will always fail.


Comment 6 IBM Bug Proxy 2007-05-14 13:50:42 UTC
------- Additional Comments From jekacur.com  2007-05-14 09:46 EDT -------
(In reply to comment #16)

Are you sure you're a member of the realtime group?
In /etc/security/limits.conf we add the following (note the memlock line):

@realtime       soft    cpu             unlimited
@realtime       -       rt_priority     100
@realtime       -       nice            40
@realtime       -       memlock         unlimited

Root will actually fail the memlock test unless root is a member of the realtime
group.  I like this; it means that in some respects you have more power as a
regular user belonging to group realtime than as root (but not in all respects).
In any case, my experience is that you get the memlock failure if you're not a
member of the group realtime.
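
To make that diagnosis concrete, here is a small Python sketch that checks the two things mentioned above: whether a limits.conf-style file grants memlock to @realtime, and whether the calling process is in that group. The helper names (parse_limits, group_memlock, in_group) are made up for this example, and the parser only handles the simple four-column lines shown above, not the full limits.conf syntax.

```python
import grp
import os

# The four lines quoted above, used here as sample input.
SAMPLE = """\
@realtime       soft    cpu             unlimited
@realtime       -       rt_priority     100
@realtime       -       nice            40
@realtime       -       memlock         unlimited
"""


def parse_limits(text):
    """Parse limits.conf-style text into (domain, type, item, value) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) == 4:
            entries.append(tuple(fields))
    return entries


def group_memlock(entries, group="realtime"):
    """Return the memlock value configured for @group, or None if absent."""
    for domain, _type, item, value in entries:
        if domain == "@" + group and item == "memlock":
            return value
    return None


def in_group(group="realtime"):
    """True if the calling process has the named group among its group IDs."""
    try:
        gid = grp.getgrnam(group).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    return gid == os.getegid() or gid in os.getgroups()


if __name__ == "__main__":
    print("memlock for @realtime:", group_memlock(parse_limits(SAMPLE)))
    print("caller in realtime group:", in_group())
```

Note that limits set through limits.conf only take effect for sessions started after the change, so a user newly added to the group has to log in again before the memlock test can pass.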

Comment 7 IBM Bug Proxy 2007-05-15 17:30:46 UTC
Created attachment 154753 [details]
rtcheck.c

Comment 8 IBM Bug Proxy 2007-05-15 17:30:51 UTC
----- Additional Comments From chavezt.com (prefers email at tinytim.com)  2007-05-15 13:27 EDT -------
 
An updated rtcheck.c that relies on the /proc interface.

This is not the final form of rtcheck.c.  There is some discussion now about
turning it into an init script and removing the dependency on /proc by
performing other tests.  More on this soon.

Comment 9 IBM Bug Proxy 2007-05-22 21:50:46 UTC
----- Additional Comments From chavezt.com (prefers email at tinytim.com)  2007-05-22 17:47 EDT -------
Oops, got trigger happy... anyway...

If the /proc interface is enabled then the boot_id and result of the tests are
written to the /var/cache/rtcheck file, for example.  The "rtcheck" program will
then rely on this data to return "True" or "False".  If the /proc interface is
not enabled, rather than running each test over again per-instantiation of the
"rtcheck" program, another method can be devised to determine that the test
results in the /var/cache/rtcheck file are valid (e.g. file creation time vs.
uptime).  

One other thing that I forgot to mention in the last comment: the init script
will be run from the "rc.local" file.  However, I need to make sure that this is
a distro-neutral approach. 
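
The boot_id based validation described here could look roughly like the following Python sketch. The one-line "boot_id status" cache format and the function names are assumptions for illustration; the actual layout of /var/cache/rtcheck is not specified in this bug.

```python
BOOT_ID_PATH = "/proc/sys/kernel/random/boot_id"


def read_boot_id(path=BOOT_ID_PATH):
    """Return the current boot_id, or None if the /proc interface is unavailable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def write_cache(cache_path, boot_id, status):
    """Store the boot_id and the test status on a single line (assumed format)."""
    with open(cache_path, "w") as f:
        f.write(f"{boot_id} {status}\n")


def cached_status(cache_path, boot_id):
    """Return the cached status if it was written during this boot, else None."""
    if boot_id is None:
        return None  # no validator available, so the cache cannot be trusted
    try:
        with open(cache_path) as f:
            cached_id, status = f.read().split()
        status = int(status)
    except (OSError, ValueError):
        return None  # missing or malformed cache file
    return status if cached_id == boot_id else None
```

A caller would rerun the tests and call write_cache whenever cached_status returns None, which covers a missing cache file, a reboot, and the case where /proc is not available.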

Comment 10 IBM Bug Proxy 2007-05-23 17:37:54 UTC
----- Additional Comments From chavezt.com (prefers email at tinytim.com)  2007-05-22 17:41 EDT -------
The new "rtcheck" design will be in the form of an init script.  However, I'm
not sure I want to place this in /etc/init.d, as this will not technically be a
service, but something that will run once on boot.  The "rtcheck" program will
write to a file, /var/cache/rtcheck for instance, with the return code of the
"rtcheck" program and a boot_id.  If the "boot_id" in this file is not the same
as the one in /proc/sys/kernel/random/boot_id, the "rtcheck" program will be
rerun.  If the /proc interface is not available, the "rtcheck" program will
rerun every test upon each instantiation.  To test the environment, the
following tests will be done:

Memory Lock:

Verify user ability to mlock ~32K of memory.  Is this actually a real-time
requirement or a JVM requirement?  If it's a JVM requirement, this should
probably be removed from this program.

Scheduler:

Exercise the scheduler API to determine if it supports real-time, namely by
setting the scheduler to SCHED_RR (which implies SCHED_FIFO).

CONFIG_PREEMPT_RT:

Check to see if IRQ handlers are threads; if so, we're running CONFIG_PREEMPT_RT.

Robust Mutexes:

Rather than looking up symbols, actually make the calls.  This test should
tell us if they exist both in glibc and in the kernel.  This avoids having
to use /proc/kallsyms or another system map to determine if the interface
exists for the running kernel.

High Res Timers:

Anyone have ideas on this one?  John?  Presumably we'll want to sample some
bogus work and take the average?  What does the threshold need to be?  Should
this be tunable?  One real-time app may require a higher resolution than 
another.

Clock Resolution:

Make sure the clock resolution is under ~200us.  Should this also be tunable?
Or, is this an acceptable limit for any real-time application?  Currently we
do this test simply by calling "clock_getres".  This seems to be a suitable
way to test this. 
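
As one possible answer to the tunability questions above, both timer tests could take their thresholds as parameters. The Python sketch below is an illustration only: the choice of CLOCK_MONOTONIC, the default values, and the idea of using the average overshoot of short sleeps as a stand-in for the high-res timer test are assumptions, not what rtcheck does.

```python
import time


def clock_resolution_ok(limit_s=200e-6, clock=time.CLOCK_MONOTONIC):
    """True if clock_getres reports a resolution at or below limit_s seconds."""
    return time.clock_getres(clock) <= limit_s


def mean_sleep_overshoot(request_s=100e-6, samples=50):
    """Average amount by which short sleeps run longer than requested, in seconds."""
    total = 0.0
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(request_s)
        total += (time.monotonic() - start) - request_s
    return total / samples


if __name__ == "__main__":
    print("clock resolution (s):", time.clock_getres(time.CLOCK_MONOTONIC))
    print("resolution <= 200us: ", clock_resolution_ok())
    print("mean overshoot (s):  ", mean_sleep_overshoot())
```

With the limits passed in as arguments, an application that needs a tighter bound than 200us can request it without changing the test itself.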

Comment 11 IBM Bug Proxy 2007-06-01 16:45:48 UTC
Created attachment 155906 [details]
rtcheck-0.6-4pre1.tar.bz2

Comment 12 IBM Bug Proxy 2007-06-01 16:45:55 UTC
----- Additional Comments From chavezt.com (prefers email at tinytim.com)  2007-06-01 12:40 EDT -------
 
Latest version of rtcheck as of June 1st, 2007

This attachment contains the latest version of the 'rtcheck' program.  See the
CHANGELOG for details on what's new.

Comment 13 IBM Bug Proxy 2007-06-07 15:41:17 UTC
Created attachment 156473 [details]
rtcheck-0.6-4.tar.bz2

Comment 14 IBM Bug Proxy 2007-06-07 15:41:34 UTC
----- Additional Comments From chavezt.com (prefers email at tinytim.com)  2007-06-07 11:35 EDT -------
 
Latest version of rtcheck as of June 7th, 2007 (now being packaged by Red Hat) 

Comment 15 Arnaldo Carvalho de Melo 2007-06-07 21:13:23 UTC
Attaching the first attempt at packaging rtcheck. Please consider renaming it to
rtcheck-0.6.4.tar.bz2 and making the directory rtcheck-0.6.4, i.e. without the
dash, to ease packaging, as rpm expects what follows the dash to be the
%{release}, not part of %{version}.

Comment 16 Arnaldo Carvalho de Melo 2007-06-07 21:14:34 UTC
Created attachment 156511 [details]
rtcheck specfile

Comment 17 Arnaldo Carvalho de Melo 2007-06-07 21:15:53 UTC
Created attachment 156512 [details]
rtcheck initscript

It uses -v, redirecting it to /var/log/rtcheck so that we can check the results
when trying to diagnose problems.

Comment 18 IBM Bug Proxy 2007-06-07 23:55:37 UTC
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|RH236962 - [FOCUS][Java     |RH236962 - [Java Blocker]
                   |Blocker] rtcheck on RHEL5 RT|rtcheck on RHEL5 RT




------- Additional Comments From dvhltc.com  2007-06-07 19:49 EDT -------
Removing [FOCUS] as we are just waiting on packaging now. 

Comment 19 Jeff Burke 2007-06-18 12:20:12 UTC
Using the latest version, here is the result. I am not sure it is worth running
this in the kernel testing. It looks as if it will fail every time.


RTCheck v0.6 - Linux Real-time Environment Checker
--------------------------------------------------
Global tests:
  Retrieving cache validator: ok
  Validating cache /var/cache/rtcheck: failed
	cache file does not exist
  Using cached result status: failed
	result status is: -1
  Re-running tests...
  Checking for out-of-tree RT extensions: ok
  Checking for robust (PI) mutex support: ok
  Testing for acceptable hrtimer resolution (<=20us): ok
  Testing for acceptable clock resolution (<=200us): ok
  Caching results in /var/cache/rtcheck: ok
User-specific tests:
  Trying to lock memory: failed
	RLIMIT_MLOCK is 32768
  Trying to request real-time scheduling: ok
Some tests failed, exiting with status 1
rt_check Failed: 



Comment 20 Arnaldo Carvalho de Melo 2007-06-18 13:32:05 UTC
Jeff, see comment #6. If you put root in the "realtime" group and change
/etc/security/limits.conf as described in that comment, it should work every
time.  If it still fails, that will even be helpful to us, telling us that
something is wrong with the kernel or some other RHEL5-RT component that makes
the whole system not realtime capable.

Perhaps the test should be run, at initscript time, as a different user, that is
in the "realtime" group and is representative of typical realtime tasks running
on the system?

Comment 21 IBM Bug Proxy 2007-06-18 14:26:16 UTC
----- Additional Comments From mauery.com (prefers email at vernux.com)  2007-06-18 10:23 EDT -------
The whole reason we separated out the user-specific stuff is that it doesn't get
cached.  Only the system characteristics (such as -RT kernel, high res timers,
etc.) get cached.  mlock and scheduler are user-permission dependent and are
therefore not cached.

This means that you can run it as root at boot time and cache the results and
then run it later as a user in the realtime group and reap the benefits of the
cached results without skewing the test results.   You should only need to add
root to the realtime group if that is the user you are running your tests as. 
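
The split can be illustrated with a toy Python model: global results are computed once and reused, while user-specific tests run on every invocation. The rtcheck_status function, the dict standing in for the cache, and the lambda test stubs are hypothetical and only mirror the behaviour described in this comment.

```python
def rtcheck_status(global_cache, run_global_tests, run_user_tests):
    """Return 0 if everything passes and 1 otherwise, like rtcheck's exit status."""
    if "global_ok" not in global_cache:        # e.g. first run after boot
        global_cache["global_ok"] = run_global_tests()
    user_ok = run_user_tests()                 # never cached: depends on the caller
    return 0 if (global_cache["global_ok"] and user_ok) else 1


cache = {}
calls = []


def global_tests():
    """Stand-in for the system-wide tests (RT kernel, high res timers, ...)."""
    calls.append("global")
    return True


boot = rtcheck_status(cache, global_tests, lambda: False)  # root, not in realtime
later = rtcheck_status(cache, global_tests, lambda: True)  # user in realtime group
print(boot, later, calls)  # prints: 1 0 ['global']
```

The failing user tests of the boot-time run leave the cached global result untouched, so the later run as a realtime user passes without repeating the global tests.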

Comment 22 Arnaldo Carvalho de Melo 2007-06-18 15:18:21 UTC
Understood. The rpm was packaged to take advantage of that: it runs rtcheck at
boot time, caching the results. It is also run in verbose mode so that the admin
can see in the /var/log/rtcheck file whether some test failed in the last boot.

It's just that it is now run as root, and root is not in the "realtime" group, so
some of the user-specific tests fail. This is not a problem, but may confuse
some admins.

So it is suggested that the initscript run the tests as a non-root user in the
"realtime" group, one that is in this group by default.

This way the admin will know that all tests are passing for a user with the
typical "realtime" profile.

Another way is to not run the user specific tests in the initscript, but I think
that at least testing it with a default "realtime" user in the "realtime" group
would verify both the cacheable and noncacheable part for at least one user at
system boot time, being a more comprehensive test.

Comment 24 IBM Bug Proxy 2007-07-16 20:25:38 UTC
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|FIXEDAWAITINGTEST           |TESTED




------- Additional Comments From jstultz.com (prefers email at johnstul.com)  2007-07-16 16:19 EDT -------
rtcheck package is now in the partner repo. Looks like it can be closed. 

One iffy bit is that rtcheck is in /sbin/, so it's not in the normal path. We
have to see whether the JVM needs to include that or we need to get a softlink.
I'll open a new bug if necessary.

Comment 25 Tim Burke 2007-07-19 19:26:06 UTC
Closing this now that rtcheck is in the yum repo.  There's still the matter of
/sbin vs /usr/sbin, but that's being tracked by a separate bug, #248557.