Description of problem:
I am setting the date a few days into the future, and it looks like such a test sometimes freezes and the following tests do not continue.

Version-Release number of selected component (if applicable):
Version - 0.5.58

How reproducible:
always

Steps to Reproduce:
1. make a test with:

    date -s tomorrow
    date -s tomorrow
    date -s tomorrow
    date -s tomorrow
    ## do some testing here
    date -s yesterday
    date -s yesterday
    date -s yesterday
    date -s yesterday

Actual results:
The test freezes, nothing is logged, and the other tests do not continue.

Expected results:
The test is logged and the other tests continue.

Additional info:
Marian, not sure how you are using date/time in beah. If you look at rhts-test-runner.sh you will notice we use /proc/uptime to deal with this exact possibility.
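For illustration, here is a minimal Python sketch of the /proc/uptime idea. It is not the code from rhts-test-runner.sh, and the function names are made up for this example. It assumes a Linux system where the first field of /proc/uptime is the number of seconds since boot, a value that "date -s" does not change.

```python
def parse_uptime(text):
    """Return seconds since boot from the contents of /proc/uptime.

    The file holds two numbers; the first is the uptime in seconds.
    """
    return float(text.split()[0])


def read_uptime(path="/proc/uptime"):
    """Read the current uptime in seconds (Linux only)."""
    with open(path) as f:
        return parse_uptime(f.read())


def elapsed(start_uptime, now_uptime):
    """Elapsed seconds between two uptime readings.

    Unlike a difference of wall-clock timestamps, this is not affected
    by someone setting the system date in between.
    """
    return now_uptime - start_uptime
```

A watchdog could record read_uptime() when a task starts and compare later readings against it instead of comparing wall-clock timestamps.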
I wrote and ran a test which shifted the time there and back several times without trouble. It's not obvious what happened, but it's not just the time-shift. I will look into it.

Bill, twisted does it for me. And perhaps it's wrong (or shall I say correct?) somewhere...
Ping - is this still an issue?
Petre, do you remember which task it was? The original jobs were removed and I would like to check if it is still an issue.
Hello,

I found the revision of runtest.sh which was causing this issue:
http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/RHN-Satellite/rhn-satellite-exporter/Regression/bz589524-startdate-enddate/runtest.sh?rev=1.12;content-type=text%2Fplain

The issue was the part with the date settings:

    date -s "1 day"
    # do the testing here, report result into beaker
    date -s "1 day ago"

The test stops and does not continue, and after many hours it is killed by the watchdog. In the meantime I could log into the machine and restart the /etc/init.d/beaker* services; the test then continues.
Hi,

From my point of view, the watchdog should not be driven by local time but by some external time server (one that is not synced with the internal clock). It should ask for the time every x processor ticks, and fall back to local time only when the time server isn't available (for example, in a test with some network magic).

The problem seems similar to the ticket I filed:
[engineering.redhat.com #112459] AutoReply: beaker localwatchdog problem kills whole recipe

copied here:
-------------------------------------------------------------------------
Hi,
I have several jobs where the localwatchdog caused an external watchdog kill. It seems that something went wrong in beaker. Here they are:
https://beaker.engineering.redhat.com/jobs/93161
https://beaker.engineering.redhat.com/jobs/93234
https://beaker.engineering.redhat.com/jobs/93235

Thanks
Honza
(In reply to comment #8)
> Hi,
> From my point of view, watchdog should not be leaded by local time, but by some
> external time server (should not be synced with internal clock) and every x
> ticks of processor should ask for time, and only in case timeserver isn't
> aviable (for examle test with some network magic, should use local time.
> It seems that problem can be similar to my filled ticket:
> [engineering.redhat.com #112459] AutoReply: beaker localwatchdog problem kills
> whole recipe
>
> copied here:
> -------------------------------------------------------------------------
> Hi,
> I have several jobs where localwatchdog caused external watchodog kill.
> It seems that something wrong happen in beaker, here are:
> https://beaker.engineering.redhat.com/jobs/93161

Marian, I looked at the first job and it correctly executed the localwatchdog code but then after it rebooted it keeps going into the task that aborted. It should be going to the next task. Each time it does this it reboots and then finally the external watchdog aborts the recipe.

> https://beaker.engineering.redhat.com/jobs/93234
> https://beaker.engineering.redhat.com/jobs/93235
>
> Thanks
> Honza
*** Bug 849587 has been marked as a duplicate of this bug. ***
Twisted handles delayedCall using system time (time.time) - see _timeFunction in twisted.python.runtime.

Not sure changing the _timeFunction for unix systems to use uptime would be safe, but it looks like the easiest way...

A function getting uptime in seconds has to be (somehow?) passed to:
- all DelayedCall (easy: we could pass it as seconds argument?)[1] - but it does not deal with any internal occurrences
- all reactor.callLater calls :-/ - by setting reactor.seconds to the function?

Also time.time is used directly elsewhere to get time-delta, instead of seconds.

[1] http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.base.DelayedCall.html#__init__
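To make the effect of the time source concrete, here is a toy scheduler with an injectable clock. It is not Twisted code and does not use Twisted's API; TinyScheduler and its methods are hypothetical names for this sketch only. The "seconds" argument plays the role that reactor.seconds plays in the comment above.

```python
import heapq
import itertools


class TinyScheduler(object):
    """Toy delayed-call queue with a pluggable clock (illustration only)."""

    def __init__(self, seconds):
        self.seconds = seconds              # callable returning "now"
        self._queue = []
        self._counter = itertools.count()   # tie-breaker for equal due times

    def call_later(self, delay, func, *args):
        """Queue func(*args) to run `delay` seconds from now."""
        due = self.seconds() + delay
        heapq.heappush(self._queue, (due, next(self._counter), func, args))

    def timeout(self):
        """Seconds until the next call is due, or None if the queue is empty."""
        if not self._queue:
            return None
        return max(0.0, self._queue[0][0] - self.seconds())

    def run_due(self):
        """Run every call whose due time has passed; return how many ran."""
        now = self.seconds()
        ran = 0
        while self._queue and self._queue[0][0] <= now:
            _, _, func, args = heapq.heappop(self._queue)
            func(*args)
            ran += 1
        return ran
```

If the clock is the wall clock and a call is queued after "date -s '1 day'", then setting the date back leaves that call due roughly a day in the future, which looks like a frozen harness. With an uptime-based clock the same call stays 10 seconds away.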
Bulk reassignment of issues as Bill has moved to another team.
*** Bug 1069210 has been marked as a duplicate of this bug. ***
libfaketime sounds like a good solution for testing time changes without affecting the actual system clock. If a task wants to use libfaketime it can install it, I don't think there is anything Beaker itself needs to do in that case. As for the actual bug in beah about system time changing, there is nothing new to report. Comment 11 indicates it's an issue in Twisted itself.
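As a rough sketch of how a task might use libfaketime from Python: the library is loaded with LD_PRELOAD and reads the offset from the FAKETIME environment variable. The helper name is hypothetical, and the library path in the usage note is an assumption that varies by distribution and install method.

```python
import os


def faketime_env(offset, lib_path, base_env=None):
    """Return an environment dict that runs a child process under libfaketime.

    offset   -- relative offset such as "+1d" or "-15d"
    lib_path -- path to the libfaketime shared object (installation-specific)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = lib_path
    env["FAKETIME"] = offset
    return env
```

Usage could look like subprocess.call(["date"], env=faketime_env("+1d", "/usr/lib64/faketime/libfaketime.so.1")), where the path is only an example. Only the child process sees the shifted time; the system clock, and therefore the harness, is left alone.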
> As for the actual bug in beah about system time changing, there is nothing
> new to report. Comment 11 indicates it's an issue in Twisted itself.

Here is a potentially relevant Twisted ticket:
http://twistedmatrix.com/trac/ticket/2424
Greetings,

We are seeing that beaker jobs that have tests where system time is changed are failing.

Example: https://beaker.engineering.redhat.com/jobs/641101

We are seeing the messages below through journalctl.

Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Unhandled Error
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Traceback (most recent call last):
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/bin/beah-beaker-backend", line 9, in <module>
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: load_entry_point('beah==0.7.3.dev201402241149', 'console_scripts', 'beah-beaker-backend')()
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 2072, in main
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: debug.runcall(reactor.run)
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib/python2.7/site-packages/beah/core/debug.py", line 11, in runcall
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: a_callable(*args, **kwargs)
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1169, in run
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: self.mainLoop()
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: --- <exception caught here> ---
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1181, in mainLoop
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: self.doIteration(t)
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib64/python2.7/site-packages/twisted/internet/epollreactor.py", line 362, in doPoll
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: l = self._poller.poll(timeout, len(self._selectables))
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: exceptions.OverflowError: timeout is too large
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Unhandled Error
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Traceback (most recent call last):

Versions:
beaker-distribution-install-1.12-1.noarch
beakerlib-redhat-1-6.fc16.noarch
beakerlib-1.8-4.fc20.noarch
Fedora release 20 (Heisenbug)
kernel-3.13.10-200.fc20.x86_64
chrony-1.29.1-1.fc20.x86_64

I found some bugs related to the above errors:
https://bugzilla.redhat.com/show_bug.cgi?id=1063647
https://bugzilla.redhat.com/show_bug.cgi?id=950646

Below is the example test that we have where we modify the dates (in scenarios where we need to work with expired ssl certs).

rlPhaseStartTest "pki_cert_revoke_0032: Revoke an expired cert"
    local endDate="1 month"
    rlRun "modify_cert $TEMP_NSS_DB $cert_info $exp \"$endDate\"" 0 "Generate Modified Cert"
    local cert_end_date=$(cat $cert_info| grep cert_end_date | cut -d- -f2)
    local cur_date=$(date)
    local cur_num_days=$(echo $(date -d "$cur_date" +%j))
    rlRun "chronyc -a offline 1> $TmpDir/chrony.out" 0 "Set time servers to offline"
    rlAssertGrep "200 OK" "$TmpDir/chrony.out"
    rlRun "date -s '$cert_end_date + 1 day'"
    local end_num_days=$(echo $(date -d "$cert_end_date + 1 day" +%j))
    local to_be_back=$(echo $(expr $end_num_days - $cur_num_days))
    local cert_serialNumber=$(cat $cert_info| grep cert_serialNumber | cut -d- -f2)
    rlRun "pki -d $CERTDB_DIR \
        -c $CERTDB_DIR_PASSWORD \
        -n \"$CA_agentV_user\" \
        cert-revoke $cert_serialNumber --force --reason Key_Compromise 1> $expout" 0
    rlAssertGrep "Revoked certificate \"$cert_serialNumber\"" "$expout"
    rlAssertGrep "Serial Number: $cert_serialNumber" "$expout"
    rlAssertGrep "Issuer: CN=CA Signing Certificate,O=$CA_DOMAIN Security Domain" "$expout"
    rlAssertGrep "Issuer: CN=CA Signing Certificate,O=$CA_DOMAIN Security Domain" "$expout"
    rlAssertGrep "Status: REVOKED" "$expout"
    rlLog "Set the date back to it's original date & time"
    rlRun "date -s '$to_be_back days ago'"
    rlRun "chronyc -a online 1> $TmpDir/chrony.out" 0 "Make Time servers online"
    rlAssertGrep "200 OK" "$TmpDir/chrony.out"
    rlRun "chronyc -a makestep 1> $TmpDir/chrony.out" 0 "Step-Up system clock"
    rlAssertGrep "200 OK" "$TmpDir/chrony.out"
    rlRun "chronyc -a makestep 1> $TmpDir/chrony.out" 0 "Step-Up system clock"
    rlAssertGrep "200 OK" "$TmpDir/chrony.out"
rlPhaseEnd
As noted in comment:18 above, please consider using https://github.com/wolfcw/libfaketime for any date/time changes. This is an issue with Twisted, which the test harness relies on.
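One plausible reading of the "OverflowError: timeout is too large" in the journal output reported earlier, offered as a sketch rather than a confirmed diagnosis: assume the poll timeout is converted to milliseconds and must fit in a signed 32-bit C int. That limit is an assumption about the epoll wrapper, not something stated in this bug. Under that assumption the arithmetic is:

```python
INT_MAX = 2 ** 31 - 1  # assumed upper bound for the timeout in milliseconds


def max_timeout_days(int_max_ms=INT_MAX):
    """Largest timeout, in days, that still fits the assumed limit."""
    return int_max_ms / 1000.0 / 86400.0


def timeout_overflows(delay_seconds, int_max_ms=INT_MAX):
    """True if a delay in seconds exceeds the assumed millisecond limit."""
    return delay_seconds * 1000.0 > int_max_ms
```

The limit works out to a little under 25 days. The reported test moves the clock about a month forward and then back, so a call scheduled while the clock was in the future would end up due about a month away, which is past that limit.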
Action points:
* WONTFIX for beah
* Test if restraint can handle this scenario. If it can't, this should be fixed in restraint.
Beah is no longer supported by the Beaker development team. Instead, we are working on the Restraint test harness. You can find all the features of Restraint here:
https://restraint.readthedocs.io/en/latest/

If you think your RFE should still be implemented as part of Restraint, feel free to create a new BZ ticket:
https://bugzilla.redhat.com/enter_bug.cgi?product=Restraint

In case you have any questions, feel free to reach out to me.

Thank you,
Martin Styk <martin.styk>