Bug 2184223

Summary: Joining AD realm with kickstart no longer works in F38
Product: [Fedora] Fedora Reporter: Tim Flink <junk.rhbz.oldemail.DO.NOT.USE>
Component: anacondaAssignee: Anaconda Maintenance Team <anaconda-maint-list>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 38CC: anaconda-maint-list, awilliam, gmarr, robatino, vponcova, vslavik, w
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: RejectedBlocker RejectedFreezeException
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-04-11 17:49:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
failed-ks-anaconda.log
none
failed-ks-program.log
none
failed-ks-journal.log
none
failed-ks-syslog
none
failed-ks-packaging.log
none
realmd-ad-join.ks none

Description Tim Flink 2023-04-04 00:34:45 UTC
Created attachment 1955641 [details]
failed-ks-anaconda.log

Description of problem:
Using 'realm join' in a kickstart against a windows AD server does not result in a machine which is joined to the domain once the machine is installed.


Version-Release number of selected component (if applicable):
Fedora-Server-dvd-x86_64-38-20230331.n.0.iso

How reproducible:
I'm able to reproduce the issue about 9/10 times

Steps to Reproduce:
1. Follow https://fedoraproject.org/wiki/QA:Testcase_realmd_join_kickstart
2. Upon completed install, run 'realm list'

Actual results:
empty result from 'realm list' command

Expected results:
machine joined to AD domain after installation

Additional info:
F38 install is done in a KVM vm, Domain controller is Windows Server 2019 in a KVM vm on the same virthost.

Comment 1 Tim Flink 2023-04-04 00:37:22 UTC
Created attachment 1955642 [details]
failed-ks-program.log

Comment 2 Tim Flink 2023-04-04 00:39:57 UTC
Created attachment 1955643 [details]
failed-ks-journal.log

Comment 3 Tim Flink 2023-04-04 00:42:00 UTC
Created attachment 1955644 [details]
failed-ks-syslog

Comment 4 Tim Flink 2023-04-04 00:43:09 UTC
Created attachment 1955645 [details]
failed-ks-packaging.log

Comment 5 Tim Flink 2023-04-04 15:08:20 UTC
Created attachment 1955704 [details]
realmd-ad-join.ks

Comment 6 Tim Flink 2023-04-04 16:06:04 UTC
I forgot to put this in the initial description - I attempted to reproduce the issue using F37 and was not successful. I have only been able to see this behavior using F38.

Comment 7 Adam Williamson 2023-04-04 16:21:12 UTC
Here's the relevant bit from the journal with AVCs cut out for clarity:

Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:anaconda.threading:Running Thread: AnaTaskThread-RealmJoinTask-1 (140000000861888)
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:anaconda.modules.common.task.task:Join a realm
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:Running... realm join --install /mnt/sysroot --verbose --one-time-password=Password123 tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Loaded settings from: /usr/lib/realmd/realmd-defaults.conf /usr/lib/realmd/realmd-distro.conf
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: holding daemon: persist-daemon
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: holding daemon: startup
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: starting service
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: connected to peer
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: GLib-GIO: _g_io_module_get_default: Found default implementation local (GLocalVfs) for ‘gio-vfs’
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: released daemon: startup
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: holding daemon: :peer
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Using 'r256.26491' operation for method 'Discover' invocation on 'org.freedesktop.realmd.Provider' interface
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Registered cancellable for operation 'r256.26491'
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Resolving: _ldap._tcp.tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: GLib-GIO: GTask [gio] D-Bus interface method dispatch (source object: 0x55856bd6e0d0, source tag: 0x7fabdd7cfb80) finalized without ever returning (using g_task_return_*()). This potentially indicates a bug in the program.
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Performing LDAP DSE lookup on: 192.168.122.10
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Searching  for (objectClass=*)
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Got defaultNamingContext: DC=tirfa,DC=net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Sending TCP Netlogon request
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Received TCP Netlogon response
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Successfully discovered: tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Using 'r256.26491' operation for method 'Join' invocation on 'org.freedesktop.realmd.KerberosMembership' interface
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: Registered cancellable for operation 'r256.26491'
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: holding daemon: current-invocation
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Assuming packages are installed
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: GLib-GIO: GTask [gio] D-Bus interface method dispatch (source object: 0x55856bd8fcf0, source tag: 0x7fabdd7cfb80) finalized without ever returning (using g_task_return_*()). This potentially indicates a bug in the program.
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * LANG=C /usr/sbin/adcli join --verbose --domain tirfa.net --domain-realm TIRFA.NET --domain-controller 192.168.122.10 --login-type computer --stdin-password
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: process started: 26502
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Using domain name: tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Calculated computer account name from fqdn: ANACONDA1
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Using domain realm: tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Sending NetLogon ping to domain controller: 192.168.122.10
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Received NetLogon info from: WIN-UQK0LV9GP4U.tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  * Wrote out krb5.conf snippet to /var/cache/realmd/adcli-krb5-YX5myI/krb5.d/adcli-krb5-conf-CkwYuF
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  ! Couldn't authenticate as machine account: ANACONDA1: Preauthentication failed
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: adcli: couldn't connect to tirfa.net domain: Couldn't authenticate as machine account: ANACONDA1: Preauthentication failed
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: process exited: 26502
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]:  ! Failed to join the domain
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: released daemon: current-invocation
Apr 03 23:32:58 anaconda1.tirfa.net realmd[26492]: stopping service

Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Resolving: _ldap._tcp.tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Performing LDAP DSE lookup on: 192.168.122.10
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Successfully discovered: tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Assuming packages are installed
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* LANG=C /usr/sbin/adcli join --verbose --domain tirfa.net --domain-realm TIRFA.NET --domain-controller 192.168.122.10 --login-type computer --stdin-password
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Using domain name: tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Calculated computer account name from fqdn: ANACONDA1
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Using domain realm: tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Sending NetLogon ping to domain controller: 192.168.122.10
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Received NetLogon info from: WIN-UQK0LV9GP4U.tirfa.net
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:* Wrote out krb5.conf snippet to /var/cache/realmd/adcli-krb5-YX5myI/krb5.d/adcli-krb5-conf-CkwYuF
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:! Couldn't authenticate as machine account: ANACONDA1: Preauthentication failed
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:adcli: couldn't connect to tirfa.net domain: Couldn't authenticate as machine account: ANACONDA1: Preauthentication failed
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:! Failed to join the domain
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:program:realm: Couldn't join realm: Failed to join the domain
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: DEBUG:program:Return code: 1
Apr 03 23:32:58 anaconda1.tirfa.net org.fedoraproject.Anaconda.Modules.Security[2055]: INFO:anaconda.modules.security.installation:Joining realm tirfa.net failed

(I think the anaconda messages just recap the realmd messages). So, the key thing is "adcli: couldn't connect to tirfa.net domain: Couldn't authenticate as machine account: ANACONDA1: Preauthentication failed".

Comment 8 Fedora Blocker Bugs Application 2023-04-04 17:22:10 UTC
Proposed as a Blocker and Freeze Exception for 38-final by Fedora user tflink using the blocker tracking app because:

 This violates the basic release criterion: "It must be possible to join the system to a FreeIPA or Active Directory domain at install time and post-install, and the system must respect the identity, authentication and access control configuration provided by the domain."

Comment 9 Tim Flink 2023-04-04 18:44:15 UTC
After more testing, neither I nor the other person who generally runs the AD test cases can reproduce this issue.

I have no explanation for why I'm no longer able to reproduce the issue - I'm using the same commands, the same kickstart, the same windows server and VMs on the same networks on the same virtnosts.

I dislike "this magically started working" but the evidence points to either tester error or some transient background cause that isn't reliable enough to troubleshoot. I'm putting a call out for more testing but if we can't regularly reproduce this by the end of the week, I propose that we close this bug as invalid.

Comment 10 Adam Williamson 2023-04-06 19:02:08 UTC
-4 in https://pagure.io/fedora-qa/blocker-review/issue/1139 , rejecting as a blocker. We can re-propose this if we figure out how to reproduce it and it still looks bad.

Comment 11 Geoffrey Marr 2023-04-10 16:41:35 UTC
Discussed during the 2023-04-10 blocker review meeting: [0]

The decision to classify this bug as a "RejectedFreezeException (Final)" was made because we don't have a clear reproducer.

[0] https://meetbot.fedoraproject.org/fedora-blocker-review/2023-04-10/f38-blocker-review.2023-04-10-16.00.txt

Comment 12 Tim Flink 2023-04-11 17:49:18 UTC
I still can't reproduce this after the first 10 tries or so. I don't know what changed or why I'm no longer able to reproduce the issue. Seeing as I no longer have a reproducer for this, I'm closing the bug.