Bug 1393413

Summary: rlWatchdog does not work
Product: [Fedora] Fedora Reporter: Tereza Cerna <tcerna>
Component: beakerlibAssignee: Dalibor Pospíšil <dapospis>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: azelinka, dapospis, jprokes, mkyral, muller
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-12 14:23:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Tereza Cerna 2016-11-09 13:34:46 UTC
I need to check if command ends normally or if it stay as zombie and ends with timeout after 10 minutes. So I decided to use function rlWatchdog.

There is written in beakerlib manual:
"Returns 0 if the command ends normally, without need to be killed."

But it does not work. It ends by signal KILL and there is no result of my command, no exit code 1.


my source code:
===============
rlPhaseStart
    rlRun "tuned-adm list | grep nonexistentprofile" 1
    rlWatchdog "tuned-adm profile nonexistentprofile" 60
rlPhaseEnd


result of the test phase:
=========================
:: [  BEGIN   ] :: Running 'tuned-adm list | grep nonexistent'
:: [   PASS   ] :: Command 'tuned-adm list | grep nonexistent' (Expected 1, got 1)
:: [ 08:31:21 ] :: Runnning tuned-adm profile nonexistent, with 60 seconds timeout
Requested profile 'nonexistent' doesn't exist.
:: [ 08:32:21 ] :: Command is still running, I am killing it with KILL
/usr/share/beakerlib/testing.sh: line 851: 15865 Killed                  eval "$command; touch __INTERNAL_FINISHED"


# rpm -q tuned
tuned-2.7.1-3.el7.noarch

Comment 1 Dalibor Pospíšil 2016-11-10 17:34:17 UTC
This one is a bit tricky. rlWatchdog returns 0 if the command finished within the specified time and 1 if the 'kill' had to be used. There's no way to distinguish return code for the command and for the killing. If you would like to get the command and the watchdog asserted you should use following construction:

rlRun "rlWatchdog \"rlRun \\\"command\\\"\""

The inner rlRun asserts the command, and the outer rlRun asserts rlWatchdog function. Of course there are other possibilities how to write this construction on more lines.

The only issue I can see here is the documentation is a bit confusing. The meaning of 'ends normally' here is 'is not killed'.

Feel free to reopen this bug if you consider it still buggy.

Comment 2 Tereza Cerna 2016-11-11 14:29:06 UTC
I should reopen it. :-)

Result of command does not interest me. I'm interested in result of rlWatchdog.

------------

I have two cases:

1] When I run command 'tuned-adm profile nonexistentprofile' on this version of tuned, it don't stop by itself, but wait for timeout. I need to check if it ends by itself or not.

:: [ 08:31:21 ] :: Runnning tuned-adm profile nonexistent, with 60 seconds timeout
Requested profile 'nonexistent' doesn't exist.
:: [ 08:32:21 ] :: Command is still running, I am killing it with KILL
/usr/share/beakerlib/testing.sh: line 851: 15865 Killed                  eval "$command; touch __INTERNAL_FINISHED"

# rpm -q tuned
tuned-2.7.1-3.el7.noarch

------------

2] There is case with new tuned version and mentioned command ends by itself and does not need timeout.

:: [ 09:23:34 ] :: Runnning tuned-adm profile nonexistentprofile, with 60 seconds timeout
Requested profile 'nonexistentprofile' doesn't exist.
:: [ 09:23:35 ] :: Command ended itself, I am not killing it.
/usr/share/beakerlib/testing.sh: line 851: 17714 Terminated              eval "sleep $timeout; touch __INTERNAL_TIMEOUT"

# rpm -q tuned
tuned-2.7.1-3.el7_3.1.noarch

------------

Source code:

rlPhaseStart
    rlWatchdog "tuned-adm profile nonexistentprofile" 60
rlPhaseEnd

------------

You can see that there is no result PASS/FAIL of rlWatchdog.

Comment 3 Tereza Cerna 2016-11-11 14:30:26 UTC
I forgot to mention.

In case 1], I except result of rlWatchdog FAIL, because it should kill command inside.

In case 2], I except result of rlWatchdog PASS, because it ends correctly.

Comment 4 Dalibor Pospíšil 2016-11-12 14:23:33 UTC
Still, I do not see any bug here. According to log messages and even according to return code it works correctly:

[0 root@sopos-rhel6-brq ~]# rlWatchdog 'sleep 20' 10; echo $?
:: [ 17:06:10 ] :: Runnning sleep 20, with 10 seconds timeout
[1] 11051
[2] 11052
[2]+  Done                    eval "sleep $timeout; touch __INTERNAL_TIMEOUT"
:: [ 17:06:20 ] :: Command is still running, I am killing it with KILL
[1]+  Killed                  eval "$command; touch __INTERNAL_FINISHED"
1
[0 root@sopos-rhel6-brq ~]# rlWatchdog 'sleep 5' 10; echo $?
:: [ 17:06:27 ] :: Runnning sleep 5, with 10 seconds timeout
[1] 11084
[2] 11085
[1]-  Done                    eval "$command; touch __INTERNAL_FINISHED"
:: [ 17:06:32 ] :: Command ended itself, I am not killing it.
[2]+  Terminated              eval "sleep $timeout; touch __INTERNAL_TIMEOUT"
0

As I've written in comment 1, if you want to assert it, use rlRun around the rlWatchdog or simply use 'rlAssert0 "comment" $?' right after the rlWatchdog.