I see those tests failing on jdk6 and 7 on OS/X:
Failed tests:
  testProcessInfoAccurateAfterProcessRestart(org.rhq.core.pc.inventory.getnativeprocess.NativeProcessRetrievalTest): The process info should have refreshed, before= 30790, after= 30790
  testProcessInfoAccurateAfterProcessStarted(org.rhq.core.pc.inventory.getnativeprocess.NativeProcessRetrievalTest): The process info should have been nulled out expected:<0> but was:<30790>
  testProcessInfoAccurateWhenProcessStopped(org.rhq.core.pc.inventory.getnativeprocess.NativeProcessRetrievalTest): The process info should have refreshed expected:<0> but was:<30790>
It looks to me like inside ProcessInfo when we detect that a process is dead, we still return a pid for it, but throw lots of warnings to the log. Fixing that yields:
Failed tests:
  testProcessInfoAccurateAfterProcessRestart(org.rhq.core.pc.inventory.getnativeprocess.NativeProcessRetrievalTest): Only a single discovery call should have been made to refresh the process info expected:<3> but was:<2>
  testProcessInfoAccurateAfterProcessStarted(org.rhq.core.pc.inventory.getnativeprocess.NativeProcessRetrievalTest): Exactly 1 discovery call should have been made to refresh the process info after the process started again. expected:<4> but was:<3>
From stepping through it, I only see the discovery component being called at the start of the method, but not after the stopProcess() startProcess() combo.
These tests check the behavior of ResourceContext.getNativeProcess(). When that method finds that the original process is no longer running, it should do a fresh process scan and run the discovery with that scan. If the discovery finds a resource with the same resource key as the current resource, the corresponding process info is cached in the resource context and returned to the caller. The tests take advantage of that behavior to track whether the method actually found the original process dead and refreshed. So if you're not seeing the correct number of discoveries, I'd assume that either Sigar is not reporting the process as expected, or your fix (what is it, actually?) changed the behavior of ProcessInfo.isRunning() and ProcessInfo.refresh() (see the ResourceContext.isRediscoveryRequired() method for how we determine that we need a rediscovery).
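To make the counting argument concrete, here is a minimal sketch of the caching behavior described above. It is not the RHQ code: FakeProcess, the Supplier-based "discovery", and the discoveryCalls counter are hypothetical stand-ins, and the assumption that an empty cache also triggers a rediscovery is mine.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class NativeProcessCacheSketch {

    /** Hypothetical stand-in for ProcessInfo: a pid plus a liveness flag. */
    static final class FakeProcess {
        final long pid;
        boolean running = true;
        FakeProcess(long pid) { this.pid = pid; }
    }

    private final String resourceKey;
    // Stand-in for "process scan + discovery": maps resource keys to processes.
    private final Supplier<Map<String, FakeProcess>> discovery;
    private FakeProcess cached;
    int discoveryCalls; // the quantity the tests assert on

    NativeProcessCacheSketch(String resourceKey,
                             Supplier<Map<String, FakeProcess>> discovery) {
        this.resourceKey = resourceKey;
        this.discovery = discovery;
    }

    /** Assumed rule: rediscover when nothing is cached or the cached process is dead. */
    boolean isRediscoveryRequired() {
        return cached == null || !cached.running;
    }

    FakeProcess getNativeProcess() {
        if (isRediscoveryRequired()) {
            discoveryCalls++;
            // Keep the process whose resource key matches ours (null if none).
            cached = discovery.get().get(resourceKey);
        }
        return cached;
    }

    public static void main(String[] args) {
        Map<String, FakeProcess> scan = new HashMap<>();
        scan.put("test-key", new FakeProcess(100));
        NativeProcessCacheSketch ctx =
            new NativeProcessCacheSketch("test-key", () -> scan);

        ctx.getNativeProcess();                     // discovery #1
        ctx.getNativeProcess();                     // alive: served from cache
        scan.remove("test-key").running = false;    // simulate stopProcess()
        ctx.getNativeProcess();                     // dead: discovery #2, no match
        scan.put("test-key", new FakeProcess(200)); // simulate startProcess()
        FakeProcess p = ctx.getNativeProcess();     // nothing cached: discovery #3
        // prints: 200 after 3 discoveries
        System.out.println(p.pid + " after " + ctx.discoveryCalls + " discoveries");
    }
}
```

If the liveness check never reports the process as dead, the counter stays one short at each step, which is the shape of the "expected:<3> but was:<2>" failures.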
The tests seem to work with both java6 and 7 on Linux. I assume this to be a problem with Sigar and OS X. Note that NativeProcessRetrievalTest#stopTestProcess() has a hardcoded wait of 2s to give Sigar enough time to detect the process info changes. Maybe this value needs to be greater on OS X? But again, this is a platform-specific issue that I can't help you with, having no access to that platform.
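If the fixed 2s wait turns out to be the problem, one option would be to poll for the expected state with an upper bound instead of sleeping for a fixed time. This is only a sketch of that idea: waitFor and its parameters are hypothetical, not something that exists in the test today.

```java
import java.util.function.BooleanSupplier;

class WaitForSketch {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was observed, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: condition never became true in time
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

In the test this could be called with a condition like "the process info no longer reports the old pid", with a generous timeout on slower platforms.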
Btw, what did you mean by "Fixing that yields"?
The timeout change did not change the behavior. If this is an OS/X-specific issue (which is possible), then we need to make sure this test only runs on non-OS/X platforms. Have a look at the logfile when the test runs: some of the discovery code just does not retry running a scan because the pid is > 0, so it is assumed to be valid. E.g. ProcessInfo.update() does try { procExe = sigar.getProcExe(pid); } catch (Exception e) { handleSigarCallException(e, "getProcExe"); } and then just continues. If we end up in the catch block, the underlying process is gone. Still, we continue as if nothing had happened, which is wrong, and the log afterwards is full of such error messages. What I meant by "yields" is that once I fixed the above, so that the native code recognizes when a process no longer exists and sets the pid to 0 (which triggers a scan on the next call), I get the issues with the discovery count, as I do not see the discovery being called between the stop() and start() calls.
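Roughly, the change described in the comment above looks like the following sketch. It is not the actual patch: ExeLookup is a hypothetical stand-in for the sigar.getProcExe(pid) call, and treating every exception as "process gone" is an assumption (a permission error, for instance, could also end up there).

```java
class ProcessSnapshotSketch {

    /** Hypothetical stand-in for the native lookup (sigar.getProcExe(pid) in the thread). */
    interface ExeLookup {
        String lookup(long pid) throws Exception;
    }

    private long pid;
    private String exeName;
    private final ExeLookup nativeApi;

    ProcessSnapshotSketch(long pid, ExeLookup nativeApi) {
        this.pid = pid;
        this.nativeApi = nativeApi;
    }

    /** Refreshes the snapshot; a failed native call marks the process as gone. */
    void update() {
        try {
            exeName = nativeApi.lookup(pid);
        } catch (Exception e) {
            // Assumption: the lookup failed because the process no longer exists.
            // Reset the pid so callers see 0 and trigger a new scan on the next call,
            // instead of continuing with stale data.
            pid = 0;
            exeName = null;
            return;
        }
        // Further per-process refreshes would follow here.
    }

    long getPid() { return pid; }

    String getExeName() { return exeName; }

    boolean isRunning() { return pid > 0; }
}
```

With this, a dead process reports pid 0 rather than the old 30790, which matches the "expected:<0> but was:<30790>" assertions in the first set of failures.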
Hmm... the code in ProcessInfo is non-trivial, and without access to your changes or a way to reproduce this, I can't say exactly what's going on. As I said already, you need to look at ResourceContext.isRediscoveryRequired(): this is where the code decides whether or not to reinvoke the discovery to align with the latest process scan.