Bug 639255 - Setting the date a few days into the future makes the test freeze

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Beaker
Classification:	Retired
Component:	beah
Sub Component:
Version:	0.5
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	---
Assignee:	beaker-dev-list
QA Contact:	tools-bugs
Docs Contact:
URL:
Whiteboard:	SimpleHarness
Duplicates (2):	849587 1069210
Depends On:
Blocks:	545868

Reported:	2010-10-01 09:22 UTC by Petr Sklenar
Modified:	2020-02-11 12:18 UTC
CC List:	16 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2020-02-11 12:15:23 UTC
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1063647	0	unspecified	CLOSED	Twisted's callLater() method is called with too large a value, resulting in OverflowError: timeout is too large	2021-02-22 00:41:40 UTC

Internal Links: 1063647

Description Petr Sklenar 2010-10-01 09:22:39 UTC

Description of problem:
I am setting the date a few days into the future, and it looks like such a test sometimes freezes and the tests after it do not continue.

Version-Release number of selected component (if applicable):
Version - 0.5.58 

How reproducible:
always

Steps to Reproduce:
1. make a test with:
date -s tomorrow
date -s tomorrow
date -s tomorrow
date -s tomorrow
## do some testing here 
date -s yesterday
date -s yesterday
date -s yesterday
date -s yesterday

Actual results:
The test freezes and there is no log.
The other tests do not continue.

Expected results:
The test is logged and the other tests continue.

Additional info:

Comment 3 Bill Peck 2010-10-01 12:55:25 UTC

Marian, not sure how you are using date/time in beah.  If you look at rhts-test-runner.sh you will notice we use /proc/uptime to deal with this exact possibility.
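
The /proc/uptime approach mentioned above can be sketched in Python roughly as follows. This is only an illustration, not code from beah or rhts-test-runner.sh; the names parse_uptime, read_uptime and UptimeTimer are made up for the example. The first field of /proc/uptime is the number of seconds since boot, which "date -s" does not change.

```python
def parse_uptime(text):
    """Return seconds since boot from the contents of /proc/uptime."""
    # The file holds two floats: uptime and aggregate idle time.
    return float(text.split()[0])


def read_uptime(path="/proc/uptime"):
    """Read the current uptime in seconds (Linux only)."""
    with open(path) as f:
        return parse_uptime(f.read())


class UptimeTimer(object):
    """Measure elapsed time without looking at the wall clock.

    Hypothetical helper: `seconds` is any zero-argument callable that
    returns a steadily increasing number of seconds.
    """

    def __init__(self, seconds=read_uptime):
        self._seconds = seconds
        self._start = seconds()

    def elapsed(self):
        # Difference of two uptime readings; unaffected by "date -s".
        return self._seconds() - self._start
```

A watchdog built on such a timer would compare elapsed() against its limit instead of comparing time.time() against a stored wall-clock deadline.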

Comment 4 Marian Csontos 2010-10-01 13:07:59 UTC

I wrote and ran a test which shifted the time there and back several times without trouble. It's not obvious what happened, but it's not just the time-shift. I will look into it.

Bill, twisted does it for me. And perhaps it's wrong (or shall I say correct?) somewhere...

Comment 5 Bill Peck 2011-03-23 20:47:29 UTC

Ping - is this still an issue?

Comment 6 Marian Csontos 2011-06-02 10:28:09 UTC

Petre, do you remember which task it was? The original jobs were removed and I would like to check if it is still an issue.

Comment 7 Petr Sklenar 2011-06-07 09:07:14 UTC

Hello,
I found the revision of runtest.sh which was causing this issue:

http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/RHN-Satellite/rhn-satellite-exporter/Regression/bz589524-startdate-enddate/runtest.sh?rev=1.12;content-type=text%2Fplain

The problematic part was the date setting:

date -s "1 day"
# do the testing here, report result into beaker
date -s "1 day ago"

--

The test stopped making progress, and after many hours it was killed by the watchdog.
In the meantime I could log into the machine and restart the /etc/init.d/beaker* services; the test then continued.

Comment 8 Jan Ščotka 2011-06-15 08:19:03 UTC

Hi,
From my point of view, the watchdog should not be driven by local time but by some external time server (which should not be synced with the internal clock). Every x processor ticks it should ask for the time, and only if the time server isn't available (for example, in a test doing some network magic) should it use local time.
It seems that the problem may be similar to the ticket I filed:

[engineering.redhat.com #112459] AutoReply: beaker localwatchdog problem kills whole recipe

copied here:
-------------------------------------------------------------------------
Hi,
I have several jobs where the localwatchdog caused an external watchdog kill.
It seems that something went wrong in Beaker; here are the jobs:
https://beaker.engineering.redhat.com/jobs/93161
https://beaker.engineering.redhat.com/jobs/93234
https://beaker.engineering.redhat.com/jobs/93235

     Thanks
     Honza

Comment 9 Bill Peck 2011-06-15 14:30:20 UTC

(In reply to comment #8)
> Hi,
> From my point of view, the watchdog should not be driven by local time but by
> some external time server (which should not be synced with the internal clock).
> Every x processor ticks it should ask for the time, and only if the time server
> isn't available (for example, in a test doing some network magic) should it use
> local time.
> It seems that the problem may be similar to the ticket I filed:

> [engineering.redhat.com #112459] AutoReply: beaker localwatchdog problem kills
> whole recipe
> 
> copied here:
> -------------------------------------------------------------------------
> Hi,
> I have several jobs where the localwatchdog caused an external watchdog kill.
> It seems that something went wrong in Beaker; here are the jobs:
> https://beaker.engineering.redhat.com/jobs/93161

Marian,

I looked at the first job: it correctly executed the localwatchdog code, but after rebooting it keeps going back into the task that aborted. It should go on to the next task. Each time it does this it reboots, until finally the external watchdog aborts the recipe.

> https://beaker.engineering.redhat.com/jobs/93234
> https://beaker.engineering.redhat.com/jobs/93235
> 
>      Thanks
>      Honza

Comment 10 Marian Csontos 2012-09-14 12:55:14 UTC

*** Bug 849587 has been marked as a duplicate of this bug. ***

Comment 11 Marian Csontos 2012-09-14 13:01:13 UTC

Twisted handles delayedCall using system time (time.time) - see _timeFunction in twisted.python.runtime.

Not sure changing the _timeFunction for unix systems to use uptime would be safe, but it looks like the easiest way...

Function getting uptime in seconds has to be (somehow?) passed to:

- all DelayedCall (easy: we could pass it as the seconds argument?)[1]
  - but that does not deal with any internal occurrences

- all reactor.callLater calls :-/
  - by setting reactor.seconds to the function?

Also time.time is used directly elsewhere to get time-delta, instead of seconds.

[1] http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.base.DelayedCall.html#__init__
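
To illustrate why the source of "seconds" matters, here is a small stand-alone sketch. It is not Twisted code: TinyScheduler and FakeClock are hypothetical names, and the scheduler only imitates the general idea of storing an absolute deadline computed from an injectable clock. The demo replays the sequence from comment 7 (date -s "1 day", schedule a call, date -s "1 day ago") against a wall clock and against an uptime-like clock.

```python
import heapq
import itertools
import time


class TinyScheduler(object):
    """Toy stand-in for a delayed-call queue (not Twisted's API)."""

    def __init__(self, seconds=time.time):
        self.seconds = seconds             # injectable clock
        self._queue = []
        self._counter = itertools.count()  # tie-breaker for equal deadlines

    def call_later(self, delay, func):
        # The deadline is absolute, in the units of the chosen clock.
        deadline = self.seconds() + delay
        heapq.heappush(self._queue, (deadline, next(self._counter), func))

    def timeout(self):
        """Seconds until the next call is due, or None if nothing is queued."""
        if not self._queue:
            return None
        return max(0.0, self._queue[0][0] - self.seconds())

    def run_due(self):
        """Run every call whose deadline has passed; return how many ran."""
        ran = 0
        while self._queue and self._queue[0][0] <= self.seconds():
            func = heapq.heappop(self._queue)[2]
            func()
            ran += 1
        return ran


class FakeClock(object):
    """Manually advanced clock used only for the demonstration."""

    def __init__(self, now=0.0):
        self.now = now

    def __call__(self):
        return self.now


def demo():
    day = 86400.0
    wall = FakeClock(1000.0)   # behaves like time.time()
    boot = FakeClock(50.0)     # behaves like uptime
    by_wall = TinyScheduler(seconds=wall)
    by_boot = TinyScheduler(seconds=boot)

    wall.now += day                       # date -s "1 day"
    by_wall.call_later(10, lambda: None)  # harness schedules a 10 s call
    by_boot.call_later(10, lambda: None)
    wall.now -= day                       # date -s "1 day ago"

    wall.now += 30                        # 30 real seconds pass
    boot.now += 30
    return by_wall.timeout(), by_wall.run_due(), by_boot.run_due()
```

In this sketch demo() returns (86380.0, 0, 1): the wall-clock scheduler still thinks its 10-second call is almost a day away and runs nothing, while the uptime-based one fires as expected. That matches the symptom of a test that stops progressing until the services are restarted, although whether beah fails in exactly this way is an assumption.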

Comment 12 Nick Coghlan 2012-10-17 04:36:01 UTC

Bulk reassignment of issues as Bill has moved to another team.

Comment 16 Dan Callaghan 2014-02-24 22:22:50 UTC

*** Bug 1069210 has been marked as a duplicate of this bug. ***

Comment 18 Dan Callaghan 2014-04-06 23:00:25 UTC

libfaketime sounds like a good solution for testing time changes without affecting the actual system clock. If a task wants to use libfaketime it can install it, I don't think there is anything Beaker itself needs to do in that case.

As for the actual bug in beah about system time changing, there is nothing new to report. Comment 11 indicates it's an issue in Twisted itself.

Comment 19 Amit Saha 2014-04-07 05:58:36 UTC

> As for the actual bug in beah about system time changing, there is nothing
> new to report. Comment 11 indicates it's an issue in Twisted itself.

Here is a potentially relevant Twisted ticket: http://twistedmatrix.com/trac/ticket/2424

Comment 20 Niranjan Mallapadi Raghavender 2014-04-25 13:58:59 UTC

Greetings, 

We are seeing that Beaker jobs containing tests which change the system time are failing.

Example:
https://beaker.engineering.redhat.com/jobs/641101


We are seeing the messages below in journalctl.

Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Unhandled Error
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Traceback (most recent call last):
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/bin/beah-beaker-backend", line 9, in <module>
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: load_entry_point('beah==0.7.3.dev201402241149', 'console_scripts', 'beah-beaker-backend')()
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 2072, in main
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: debug.runcall(reactor.run)
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib/python2.7/site-packages/beah/core/debug.py", line 11, in runcall
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: a_callable(*args, **kwargs)
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1169, in run
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: self.mainLoop()
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: --- <exception caught here> ---
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1181, in mainLoop
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: self.doIteration(t)
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: File "/usr/lib64/python2.7/site-packages/twisted/internet/epollreactor.py", line 362, in doPoll
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: l = self._poller.poll(timeout, len(self._selectables))
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: exceptions.OverflowError: timeout is too large
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Unhandled Error
Apr 25 13:39:24 cloud-qe-20.idm.lab.bos.redhat.com beah-beaker-backend[668]: Traceback (most recent call last):
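
One possible reading of the OverflowError, sketched as arithmetic. It rests on two assumptions that this report does not confirm: that the poll timeout is converted to milliseconds and must fit in a signed 32-bit C int (epoll_wait takes its timeout as an int in milliseconds), and that the timeout is the distance to a deadline computed before the clock was moved back. The helper names are made up for the example.

```python
INT_MAX = 2 ** 31 - 1  # largest value of a signed 32-bit C int


def max_poll_timeout_seconds():
    """Largest timeout, in seconds, that still fits in INT_MAX milliseconds."""
    return INT_MAX / 1000.0


def would_overflow(timeout_seconds):
    """True if the timeout, expressed in milliseconds, exceeds INT_MAX."""
    return timeout_seconds * 1000.0 > INT_MAX


DAY = 86400.0
limit_in_days = max_poll_timeout_seconds() / DAY  # a little under 25 days
one_day_shift = would_overflow(1 * DAY)           # fits
one_month_shift = would_overflow(31 * DAY)        # does not fit
```

Under these assumptions a clock shift of one day would only leave the harness waiting, while a shift of about a month, as in a test that jumps past a certificate end date and back, would push the computed timeout over the limit.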


Versions:
beaker-distribution-install-1.12-1.noarch
beakerlib-redhat-1-6.fc16.noarch
beakerlib-1.8-4.fc20.noarch
Fedora release 20 (Heisenbug)
kernel-3.13.10-200.fc20.x86_64
chrony-1.29.1-1.fc20.x86_64

I found some bugs related to the above errors:

https://bugzilla.redhat.com/show_bug.cgi?id=1063647
https://bugzilla.redhat.com/show_bug.cgi?id=950646

Below is an example test of ours in which we modify the date (for scenarios where we need to work with expired SSL certificates).

        rlPhaseStartTest "pki_cert_revoke_0032: Revoke an expired cert"
        local endDate="1 month"
        rlRun "modify_cert $TEMP_NSS_DB $cert_info $exp \"$endDate\"" 0 "Generate Modified Cert"
        local cert_end_date=$(cat $cert_info| grep cert_end_date | cut -d- -f2)
        local cur_date=$(date)
        local cur_num_days=$(echo $(date -d "$cur_date" +%j))
        rlRun "chronyc -a offline 1> $TmpDir/chrony.out" 0 "Set time servers to offline"
        rlAssertGrep "200 OK" "$TmpDir/chrony.out"
        rlRun "date -s '$cert_end_date + 1 day'"
        local end_num_days=$(echo $(date -d "$cert_end_date + 1 day" +%j))
        local to_be_back=$(echo $(expr $end_num_days - $cur_num_days))
        local cert_serialNumber=$(cat $cert_info| grep cert_serialNumber | cut -d- -f2)
        rlRun "pki -d  $CERTDB_DIR \
                -c $CERTDB_DIR_PASSWORD \
                -n \"$CA_agentV_user\" \
                cert-revoke $cert_serialNumber --force --reason Key_Compromise 1> $expout" 0
        rlAssertGrep "Revoked certificate \"$cert_serialNumber\"" "$expout"
        rlAssertGrep "Serial Number: $cert_serialNumber" "$expout"
        rlAssertGrep "Issuer: CN=CA Signing Certificate,O=$CA_DOMAIN Security Domain" "$expout"
        rlAssertGrep "Status: REVOKED" "$expout"
        rlLog "Set the date back to its original date & time"
        rlRun "date -s '$to_be_back days ago'"
        rlRun "chronyc -a online 1> $TmpDir/chrony.out" 0 "Make Time servers online"
        rlAssertGrep "200 OK" "$TmpDir/chrony.out"
        rlRun "chronyc -a makestep 1> $TmpDir/chrony.out" 0 "Step-Up system clock"
        rlAssertGrep "200 OK" "$TmpDir/chrony.out"
        rlRun "chronyc -a makestep 1> $TmpDir/chrony.out" 0 "Step-Up system clock"
        rlAssertGrep "200 OK" "$TmpDir/chrony.out"
        rlPhaseEnd

Comment 21 Amit Saha 2014-04-27 09:31:07 UTC

As noted in comment 18 above, please consider using https://github.com/wolfcw/libfaketime for any date/time changes. This is an issue in Twisted, on which the test harness relies.

Comment 22 Roman Joost 2016-09-12 23:39:57 UTC

Action points:

* WONTFIX for beah
* Test if restraint can handle this scenario. If it can't, this should be fixed in restraint.

Comment 23 Martin Styk 2020-02-11 12:15:23 UTC

Beah is no longer supported by the Beaker development team.
Instead, we are working on the Restraint test harness. You can find all the features of Restraint here.

https://restraint.readthedocs.io/en/latest/

If you think your RFE should still be implemented as part of Restraint, feel free to create a new BZ ticket.

https://bugzilla.redhat.com/enter_bug.cgi?product=Restraint

In case you have any questions, feel free to reach out to me.
Thank you,
Martin Styk <martin.styk>
