Bug 1965718 - anaconda sets system time an hour fast during daylight savings for timezones that observe it (affects FreeIPA, rsyslog...)
Summary: anaconda sets system time an hour fast during daylight savings for timezones ...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: rawhide
Hardware: All
OS: Linux
unspecified
urgent
Target Milestone: ---
Assignee: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks: F35BetaBlocker
TreeView+ depends on / blocked
 
Reported: 2021-05-28 22:46 UTC by Adam Williamson
Modified: 2021-07-02 21:55 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-02 21:55:12 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
client logs from a failed test (919.06 KB, application/octet-stream)
2021-05-28 22:49 UTC, Adam Williamson
no flags Details

Description Adam Williamson 2021-05-28 22:46:49 UTC
Since Fedora-Rawhide-20210520.n.1 , the openQA test that enrols a system to a FreeIPA domain during initial install and then verifies it is failing. The install completes successful, and the initial checks pass, but when the test tries to log in as a domain user, the login fails.

The similar test which enrols after install (rather than during install) is not failing, so there's something specific to the install scenario here.

In that compose, both anaconda and sssd changed. anaconda changed from 35.14-1.fc35 to 35.15-1.fc35. sssd changed from 2.5.0-1.fc35 to 2.5.0-2.fc35. I'm filing against anaconda as it seems to be an install-related bug, and also because some of the changes in anaconda were timezone related, and there is this error in the client side sssd logs:

(2021-05-28  9:52:49): [krb5_child[958]] [get_and_save_tgt] (0x0020): 1721: [-1765328347][Clock skew too great]

The test was passing consistently until Fedora-Rawhide-20210520.n.1, it has failed in every compose since. I'm attaching the client side logs. We don't have the server logs, it is tricky to get them due to race issues but I can try if necessary.

Proposing as a Beta blocker per Basic criterion "It must be possible to join the system to a FreeIPA or Active Directory domain at install time and post-install, and the system must respect the identity, authentication and access control configuration provided by the domain."

Comment 1 Adam Williamson 2021-05-28 22:48:54 UTC
For clarity, the test runs the install (during which the domain enrolment happens), then boots the installed system and logs in as root (using a local root account), runs some general checks that the system does indeed appear to be properly enrolled in the domain (which pass), then it logs out and tries to log in using a domain user account "test1.fedoraproject.org". That login fails.

Comment 2 Adam Williamson 2021-05-28 22:49:28 UTC
Created attachment 1787965 [details]
client logs from a failed test

Comment 3 Adam Williamson 2021-06-23 21:00:30 UTC
OK, so the underlying problem here is: if installing for a timezone that does daylight savings time, during daylight savings time, anaconda sets the system clock an hour fast. This I believe is causing the FreeIPA enrolment issue. It also causes the openQA remote logging test to fail if run within an hour of the initial install test, I think because the log message that we send to test remote logging appears to be dated "before" the most recent message in the journal, causing rsyslog's imjournal plugin not to pick it up.

Release criterion stays the same, with the note it only affects installs during DST in timezones that do it (but could cause various other things to break too).

I believe the bug was introduced by https://github.com/rhinstaller/anaconda/pull/3167 . I'm trying to see if I can figure out a fix now.

Comment 4 Adam Williamson 2021-06-23 22:34:23 UTC
OK, so, whew, there is a *heck* of a history here, it turns out:

https://bugzilla.redhat.com/show_bug.cgi?id=1251044 led to https://github.com/rhinstaller/anaconda/commit/b3096e797a3b76862f94662a865c6a83854f45c4
https://bugzilla.redhat.com/show_bug.cgi?id=1367647 led to https://src.fedoraproject.org/rpms/libtimezonemap/c/044b30108b0297d8655111a1663a348a2e6e75c0?branch=rawhide
https://bugzilla.redhat.com/show_bug.cgi?id=1433560 led to https://github.com/rhinstaller/anaconda/commit/e2beb949c41fee41c05a0f537d0f38efb321ab37

so the code started simple but we made it progressively more complicated to deal with various bugs and behaviour changes in Python, starting with "however, any conversions to and from UTC or calculations between different timezones currently return incorrect values due to a known issue in the Python datetime library." The recent commit actually gets it back closer to where it started, but not quite all the way.

https://bugzilla.redhat.com/show_bug.cgi?id=1293314 - which was a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1251044 back in the day - has some interesting references on the problems that made us start complicating this in the first place, under "k)" in the original comment. The test code there is still valid (with an extra pair of brackets for `print`), and interestingly, with pytz the "bugs" still exist today:

>>> print(datetime.datetime(2016,1,1,0,0,0,tzinfo=pytz.timezone('Asia/Aden')).utcoffset())
3:07:00
>>> print(datetime.datetime(2016,1,1,0,0,0,tzinfo=pytz.timezone('Asia/Kolkata')).utcoffset())
5:53:00

but with zoneinfo, they don't:

>>> print(datetime.datetime(2016,1,1,0,0,0,tzinfo=zoneinfo.ZoneInfo('Asia/Aden')).utcoffset())
3:00:00
>>> print(datetime.datetime(2016,1,1,0,0,0,tzinfo=zoneinfo.ZoneInfo('Asia/Kolkata')).utcoffset())
5:30:00

so ultimately what I'm gonna suggest is simply that we go back to basics and trust python to handle the timezone math:

-    epoch = datetime.datetime.fromtimestamp(0, tz=utc).astimezone(tz)
+    epoch = datetime.datetime.fromtimestamp(0, tz=utc)

which is almost exactly where we started. That change 'fixes' my test script. Interestingly if you use the same epoch definition in `set_system_date_time_old` it still works, even for Kolkata, so somehow the tz issue from #1293314 doesn't cause problems. I also tested applying an explicit time (rather than 'now') and couldn't find any problems that way.

Comment 5 Adam Williamson 2021-06-24 00:04:10 UTC
https://github.com/rhinstaller/anaconda/pull/3469 confirmed to fix this for the case openQA hits.

Comment 6 Adam Williamson 2021-07-02 21:55:12 UTC
This got merged and pushed stable, the tests that failed are now passing.


Note You need to log in before you can comment on or make changes to this bug.