Description of problem:
SIGXFSZ signal behaviour is different when running under the harness, and Python seems to be to blame for this.
For the full description, please refer to the link below:
http://post-office.corp.redhat.com/archives/beaker-dev-list/2011-November/msg00027.html

Version-Release number of selected component (if applicable):
# rpm -qa|grep beah
beah-0.6.34-2.el6_0.noarch

How reproducible:
always

Steps to Reproduce:
clone the job https://beaker.engineering.redhat.com/jobs/160167

Actual results:
Unexpected behaviour in Beaker, but the test runs OK when run manually.

Expected results:
Snippet of the log:
> :: [ LOG ] :: mount
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> ...
> :: [ INFO ] :: /etc/mtab md5 is now 0de9b48c805bfd985a49e1e512b622f2
> :: [ PASS ] :: Comparing the old and new mds5sum of /etc/mtab
> :: [ PASS ] :: Should not left stale file /etc/mtab~
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> |-> I need to get FAIL here, by finding the stale file /etc/mtab~
>
>
> another one:
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> :: [ LOG ] :: corrupt
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>
> :: [ PASS ] :: Remove stale /etc/mtab~ in order to umount successfully
> :: [ PASS ] :: Umount done to prepare next test
> :: [ PASS ] :: No localhost:/tmp entry in /etc/mtab now
> :: [ PASS ] :: Adding the testing user
> :: [ PASS ] :: Backing up the mtab
> :: [ FAIL ] :: Trying to corrupt mtab with mount (Expected 153,16, got 0)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> |-> I need to get PASS here, from a non-zero return code of the command

Additional info:
(In reply to comment #0)
Any update?
I googled a little and found other people hitting this too:

1. http://bugs.python.org/issue1652#msg100047
If you look at the giant patch in the next comment, it does this before execvp:
---- snip ----
if restore_signals:
    signals = ('SIGPIPE', 'SIGXFZ', 'SIGXFSZ')
    for sig in signals:
        if hasattr(signal, sig):
            signal.signal(getattr(signal, sig), signal.SIG_DFL)
---- /snip ----

2. http://twistedmatrix.com/trac/ticket/4199
http://twistedmatrix.com/trac/attachment/ticket/4199/4199-3.diff
---- snip ----
for signalnum in range(1, signal.NSIG):
    if signalnum in (signal.SIGKILL, signal.SIGSTOP):
        # These two signals (commonly 9 & 19) can't be caught or ignored
        continue

    if signal.getsignal(signalnum) == signal.SIG_IGN:
        # Reset signal handling to the default
        signal.signal(signalnum, signal.SIG_DFL)
---- /snip ----

My initial approach was very similar:
---- snip ----
for i in range(1, signal.NSIG):
    try:
        signal.signal(i, signal.SIG_DFL)
    except:
        pass
---- /snip ----

I looked at alternatives: newgrp, sg, but these modify signal handlers/masks even more.

Perl seems decent enough to restore signals before exec, but lacks any interface to setgroups().
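For anyone who wants to see the reset-before-exec idea end to end, here is a minimal sketch for Python 3 on Linux. It is not the harness code: the helper names (reset_ignored_signals, child_sigign_mask, ignores) are made up for illustration. It passes restore_signals=False only to reproduce the inheritance, because subprocess in Python 3.2+ otherwise restores SIGPIPE, SIGXFZ and SIGXFSZ by default. The child is `cat` rather than Python, since a Python child would ignore SIGXFSZ again at startup.

```python
import signal
import subprocess


def reset_ignored_signals():
    """Restore the default action for every signal this process ignores."""
    for signum in range(1, signal.NSIG):
        if signum in (signal.SIGKILL, signal.SIGSTOP):
            # These two can't be caught or ignored, so skip them.
            continue
        try:
            if signal.getsignal(signum) == signal.SIG_IGN:
                signal.signal(signum, signal.SIG_DFL)
        except (OSError, ValueError, RuntimeError):
            # Some signal numbers are reserved by libc and cannot be changed.
            pass


def child_sigign_mask(**popen_kwargs):
    """Run `cat /proc/self/status` as a child and return its SigIgn bitmask."""
    out = subprocess.check_output(["cat", "/proc/self/status"], **popen_kwargs)
    for line in out.decode().splitlines():
        if line.startswith("SigIgn:"):
            return int(line.split()[1], 16)
    raise RuntimeError("SigIgn line not found (is this Linux?)")


def ignores(mask, signum):
    """True if bit (signum - 1) is set in a /proc status signal mask."""
    return bool(mask & (1 << (signum - 1)))


if __name__ == "__main__":
    inherited = child_sigign_mask(restore_signals=False)
    reset = child_sigign_mask(restore_signals=False,
                              preexec_fn=reset_ignored_signals)
    print("child ignores SIGXFSZ, no reset:",
          ignores(inherited, signal.SIGXFSZ))
    print("child ignores SIGXFSZ, with reset:",
          ignores(reset, signal.SIGXFSZ))
```

Using preexec_fn keeps the reset in the child only, so the parent's own signal handling is left untouched.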
(In reply to comment #2)
Sorry, I'm not a Python/Perl expert, so what is the solution to let my script get the expected behaviour?
(In reply to comment #3)
> Sorry, I'm not python/perl expert, so what's the solution to let my script get
> expected behaviour?

1. Wait until the harness gets fixed.
2. Reset the signal before running your test, for example:

--- sigxfsz_reset.py ---
#!/usr/bin/python
# Restore the default SIGXFSZ action, then exec the command given as arguments.

import os
import sys
import signal

try:
    signal.signal(signal.SIGXFSZ, signal.SIG_DFL)
except Exception:
    # sys.exc_info() works on both Python 2 and Python 3
    print(sys.exc_info()[1])
sys.stdout.flush()

if len(sys.argv) > 1:
    # Replace this process with the wrapped command; the reset action survives exec
    os.execvp(sys.argv[1], sys.argv[1:])
else:
    print('%s unexpectedly at the end of chain' % __file__)
--- /snip ---

chmod a+x sigxfsz_reset.py
./sigxfsz_reset.py ./runtest.sh

Note: I didn't try this with your test; I only checked the output of:
cat /proc/self/status | grep Sig[BI]
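As background on why the test expects 153: a shell reports 128 plus the signal number when a command is killed by a signal, and SIGXFSZ is 25 on x86 Linux. The sketch below is only an illustration, assuming Python 3, Linux and a coreutils-style `dd`; the names write_past_limit and shell_status are hypothetical. It runs `dd` with RLIMIT_FSIZE set to 0, once with the default SIGXFSZ action and once with the signal ignored. In the ignored case `dd` sees a failed write and exits with an ordinary error status; the exact status depends on the program (the test log above shows mount returning 0).

```python
import os
import resource
import signal
import subprocess
import tempfile


def write_past_limit(ignore_sigxfsz):
    """Run dd with a zero file-size limit and return its return code."""

    def setup_child():
        # Runs in the child between fork and exec.
        action = signal.SIG_IGN if ignore_sigxfsz else signal.SIG_DFL
        signal.signal(signal.SIGXFSZ, action)
        resource.setrlimit(resource.RLIMIT_FSIZE, (0, 0))

    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "out")
        proc = subprocess.run(
            ["dd", "if=/dev/zero", "of=" + target, "bs=1", "count=1"],
            preexec_fn=setup_child,
            restore_signals=False,  # keep exactly what setup_child installed
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return proc.returncode


def shell_status(returncode):
    """Map a subprocess return code to the status a shell would report."""
    if returncode < 0:
        # Negative means "killed by signal -returncode".
        return 128 - returncode
    return returncode


if __name__ == "__main__":
    print("default action:", shell_status(write_past_limit(False)))
    print("ignored:", shell_status(write_past_limit(True)))
```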
(In reply to comment #4)
Got it, thank you for your time and effort in tracking down the issue.
Bill,

I think this can be closed now. With the introduction of tortilla, initgroups has been changed to reset all signal handlers before exec.

The blocked signal mask looks good now:
# ps afx | grep make
18400 pts/0 S+ 0:00 \_ grep make
 2262 ? S 0:00 \_ make run
# cat /proc/2262/status | grep Sig[BI]
SigBlk: 0000000000000000
SigIgn: 0000000000000000
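In case it helps with reading those masks: each SigBlk/SigIgn value is a hexadecimal bitmask in which bit (N - 1) stands for signal N, so all zeros means nothing is blocked or ignored. A small sketch for decoding them, assuming Python 3.5+ on Linux; the function names decode_sigmask and read_sigmasks are hypothetical:

```python
import signal


def decode_sigmask(hex_mask):
    """Turn a /proc/<pid>/status signal mask into a list of signal names."""
    mask = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if mask & (1 << (signum - 1)):
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # No symbolic name known to Python: fall back to the number.
                names.append(str(signum))
    return names


def read_sigmasks(pid="self"):
    """Return decoded SigBlk and SigIgn entries for the given pid."""
    masks = {}
    with open("/proc/%s/status" % pid) as status:
        for line in status:
            key, _, value = line.partition(":")
            if key in ("SigBlk", "SigIgn"):
                masks[key] = decode_sigmask(value.strip())
    return masks


if __name__ == "__main__":
    print(decode_sigmask("0000000000000000"))
    print(read_sigmasks())
```

Run inside a Python process, read_sigmasks() would normally list SIGXFSZ under SigIgn, because the interpreter ignores that signal at startup.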