Bug 743597

Summary: rlWatchdog fails to handle commands waiting for an input from stdin
Product: [Fedora] Fedora
Reporter: David Kutálek <dkutalek>
Component: beakerlib
Assignee: Petr Muller <pmuller>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 19
CC: mfranc, ohudlick, pmuller, psplicha, rrakus
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-12-10 14:55:42 UTC
Attachments:
Patch to monitor state of command eval and put it in foreground if it is stopped. (flags: none)

Description David Kutálek 2011-10-05 13:37:16 UTC
Created attachment 526493
Patch to monitor state of command eval and put it in foreground if it is stopped.

Description of problem:

Consider e.g. simple usage like:
rlWatchdog "php -f ./$REPRODUCER" 5
When php crashes (which it does in my case), rlWatchdog does not detect it:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: Test
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [15:06:12] ::  Runnning php -f ./reproducer.php, with 5 seconds timeout
:: [15:06:17] ::  Command is still running, I am killing it with KILL
/usr/share/beakerlib//testing.sh: line 785:  9829 Killed                  eval "$command; touch __INTERNAL_FINISHED"

The problem is that when the first part of the mentioned eval segfaults, bash stops the whole eval and therefore never gets to touching the file.

With set +m, bash does not stop the eval, but that probably brings other problems.

Interestingly, after a bg command the process is stopped again.
After an fg command, the eval runs to its end.

One more interesting behaviour: the reproducer output is not printed to stdout until the fg command, although the reproducer printed something before the crash. So bash is somehow holding the output buffer. Urgh, isn't this some bash problem after all?

So, let's investigate whether the bash behaviour is correct.
For a workaround which works for me, see the attached patch.

How reproducible:

Always

Comment 2 Miroslav Franc 2011-10-11 09:11:23 UTC
I would say this is a php bug (libedit?). It tries to do some terminal initialisation while in the background. `php -f reproducer.php </dev/null' fixes the problem for me. The question is whether it makes sense to do something about where the descriptors point inside rlWatchdog.

Comment 3 David Kutálek 2011-10-11 10:02:13 UTC
(In reply to comment #2)
> I would say this is a php bug (libedit?). It tries to do some terminal
> initialisation while in the background. `php -f reproducer.php </dev/null'
> fixes the problem for me. The question is whether it makes sense to do
> something about where the descriptors point inside rlWatchdog.

Ah, you are right: this behaviour also happens for a simple, non-segfaulting php script containing just one print statement.

Comment 4 Miroslav Franc 2011-10-11 11:42:46 UTC
(In reply to comment #3)

Not only that: you can have an empty script and it still happens. Actually, it happens before any script is even opened and read.

Comment 5 David Kutálek 2011-10-11 12:16:43 UTC
(In reply to comment #2)
> I would say this is a php bug (libedit?). It tries to do some terminal
> initialisation while in the background. `php -f reproducer.php </dev/null'
> fixes the problem for me. The question is whether it makes sense to do
> something about where the descriptors point inside rlWatchdog.

Let's abstract from php. It seems any read from stdin causes it to stop, and somehow that makes sense, doesn't it?

# rlWatchdog "echo tak to napis ; read napis ; echo diky diky" 60
:: [13:45:30] ::  Runnning echo tak to napis ; read napis ; echo diky diky, with 60 seconds timeout
Argh Blargh!: 2
[1] 27261
[2] 27262
tak to napis

[1]+  Stopped                 eval "$command; touch __INTERNAL_FINISHED"

 OR JUST:

[root@timothy 11948]# eval "echo tak to napis ; read napis ; echo diky diky" &
[1] 27454
tak to napis
[root@timothy 11948]# 

[1]+  Stopped                 eval "echo tak to napis ; read napis ; echo diky diky"

---

So I would say that in its current state rlWatchdog does not work for any command requesting input from stdin. Not so good and not so bad: a beakerlib user can (if they know about it!) add the /dev/null redirection themselves.

Assuming we are doing automated testing, interactive input should never be needed anyway. So what about changing the eval in rlWatchdog from:

eval "$command; touch __INTERNAL_FINISHED" &

to 

eval "$command; touch __INTERNAL_FINISHED" < /dev/null &

We cannot add the redirection inside the eval, since $command can be structured in any way. Does this make sense?
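
The proposed change can be tried out in isolation with a minimal sketch like the one below. It is not the beakerlib code: the function name watchdog_sketch, the temporary marker file, and the one-second polling loop are all assumptions made for illustration; only the eval line with the redirection placed outside the quoted string follows the proposal above.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the real rlWatchdog from beakerlib.
# Usage: watchdog_sketch COMMAND TIMEOUT_SECONDS
watchdog_sketch() {
    local command=$1 timeout=$2
    local workdir marker pid waited=0
    workdir=$(mktemp -d) || return 2
    marker="$workdir/finished"

    # The redirection sits outside the eval string, so it covers the whole
    # command no matter how $command is structured.
    eval "$command; touch \"\$marker\"" </dev/null &
    pid=$!

    # Poll once per second until the marker appears or the timeout expires.
    while [ "$waited" -lt "$timeout" ] && [ ! -e "$marker" ]; do
        sleep 1
        waited=$((waited + 1))
    done

    if [ -e "$marker" ]; then
        wait "$pid" || true
        rm -rf "$workdir"
        return 0
    fi

    # Kills only the background subshell; its children may survive
    # in this simplified sketch.
    kill -KILL "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null || true
    rm -rf "$workdir"
    return 1
}

watchdog_sketch "echo tak to napis ; read napis ; echo diky diky" 5 \
    && echo "finished within the timeout"
```

With stdin coming from /dev/null, the read sees end-of-file immediately instead of waiting for a terminal, so the rest of the command and the final touch can run.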

Comment 6 Roman Rakus 2011-10-26 12:12:13 UTC
(In reply to comment #5)

> Let's abstract from php. It seems any read from stdin causes it to stop, and
> somehow that makes sense, doesn't it?
An option telling php not to do anything with stdin would be better.

> 
> So I would say that in its current state rlWatchdog does not work for any
> command requesting input from stdin. Not so good and not so bad: a beakerlib
> user can (if they know about it!) add the /dev/null redirection themselves.
The user, or "some" magic in beakerlib.

> 
> Assuming we are doing automated testing, interactive input should never be
> needed anyway. So what about changing the eval in rlWatchdog from:
> 
> eval "$command; touch __INTERNAL_FINISHED" &
> 
> to 
> 
> eval "$command; touch __INTERNAL_FINISHED" < /dev/null &
> 
> We cannot add the redirection inside the eval, since $command can be
> structured in any way. Does this make sense?
I would not do the redirection this way. I'm not sure about this, but what about closing stdin? Redirecting stdin? Or...

If we close stdin, I would test with processes which do some magic with stdin, shells for example.
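
The difference between the two options discussed here, redirecting stdin from /dev/null and closing it, can be observed with a small experiment. This is only an illustrative sketch; cat is used as a stand-in for "any command that reads stdin", and the variable names are made up for the example.

```shell
#!/bin/bash
# Option 1: stdin redirected from /dev/null. The reader gets a valid
# descriptor and sees end-of-file on the first read.
null_status=0
cat </dev/null || null_status=$?

# Option 2: stdin closed. Descriptor 0 does not exist, so the reader
# gets an error (EBADF) instead of end-of-file.
closed_status=0
cat <&- 2>/dev/null || closed_status=$?

echo "stdin from /dev/null: cat exited with $null_status"
echo "stdin closed:         cat exited with $closed_status"
```

A command that treats end-of-file as normal may still treat a missing descriptor as a failure, which is one reason to test closing stdin carefully before relying on it.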

Comment 7 Petr Muller 2013-01-08 12:04:57 UTC
Some time ago, coreutils gained a 'timeout' command. I'll research whether it is present in the supported older RHELs, port the watchdog to it where possible, and deprecate the watchdog (because it really does not add any value now).
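
For reference, a minimal sketch of what a test could look like with coreutils timeout instead of the watchdog. The sleep command and the one-second limit are placeholders chosen for the example; the explicit /dev/null redirection is an assumption carried over from the discussion above, not something timeout requires.

```shell
#!/bin/bash
# Run a command that takes too long, with a 1 second limit.
# timeout sends TERM by default and reports a timeout as exit status 124.
status=0
timeout 1 sleep 5 </dev/null || status=$?

if [ "$status" -eq 124 ]; then
    echo "command timed out"
else
    echo "command ended on its own with status $status"
fi
```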

Comment 8 Fedora End Of Life 2013-04-03 19:39:42 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 9 Petr Muller 2013-12-10 14:55:42 UTC
After re-reading, I would say this is not really a bug - the command under rlWatchdog simply did not end, and rlWatchdog noticed it and killed it. From my POV, everything works as expected...