Bug 1013624 - realmd join through openlmi ends with error, but join is finally successful
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openlmi-providers
Version: 7.0
Hardware: All
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Tomas Smetana
QA Contact: David Spurek
Depends On:
Blocks: 922080
Reported: 2013-09-30 09:15 EDT by David Spurek
Modified: 2014-06-17 23:04 EDT

Fixed In Version: openlmi-providers-0.4.1-13
Doc Type: Bug Fix
Last Closed: 2014-06-13 09:06:45 EDT
Type: Bug

Description David Spurek 2013-09-30 09:15:21 EDT
Description of problem:
A realmd join through OpenLMI ends with an error, and an immediate 'realm list' shows nothing. After 30 seconds, 'realm list' shows the joined domain.

join failed: Join failed ((1, u'CIM_ERR_FAILED: LMI_Realmd: dbus_join_call() failed: (RDCP_ERROR_DBUS(4)) dbus error (org.freedesktop.DBus.Error.NoReply): Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.'))
:: [   FAIL   ] :: Running 'realmd-cim -u pegasus -p uM4twGrL join Amy-admin Pass2012! ad.baseos.qe' (Expected 0, got 1)


Version-Release number of selected component (if applicable):
openlmi-realmd-0.2.0-0.el7.ppc64
openlmi-providers-0.2.0-0.el7.ppc64

How reproducible:
always on ppc64

Steps to Reproduce:
1. Join a domain through OpenLMI.

Actual results:


Expected results:


Additional info:
Comment 2 Patrik Kis 2013-10-01 05:04:01 EDT
This problem also appears on s390x:

# realmd-cim -u pegasus -p uM4twGrL join admin Pass2012! ipa.baseos.qe
join failed: Join failed ((1, u'CIM_ERR_FAILED: LMI_Realmd: dbus_join_call() failed: (RDCP_ERROR_DBUS(4)) dbus error (org.freedesktop.DBus.Error.NoReply): Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.'))

but after waiting about 20 seconds:

# realm list
ipa.baseos.qe
  type: kerberos
  realm-name: IPA.BASEOS.QE
  domain-name: ipa.baseos.qe
  configured: kerberos-member
  server-software: ipa
  client-software: sssd
  required-package: ipa-client
  required-package: oddjob
  required-package: oddjob-mkhomedir
  required-package: sssd
  login-formats: %U
  login-policy: allow-realm-logins
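
To check the join state from a script rather than by eye, the 'realm list' output above can be parsed. The following is only a sketch in Python 3: the function names are hypothetical, and treating "configured: no" as "not joined" is an assumption about realmd's output, not something confirmed in this bug.

```python
def parse_realm_list(text):
    """Parse `realm list` style output into {realm: {key: [values]}}."""
    realms = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new realm entry.
            current = realms.setdefault(line.strip(), {})
        elif current is not None and ":" in line:
            # Indented "key: value" lines belong to the current realm.
            key, _, value = line.strip().partition(":")
            current.setdefault(key.strip(), []).append(value.strip())
    return realms


def is_joined(text, domain):
    """Return True if `domain` is listed and its 'configured' value is not 'no'."""
    entry = parse_realm_list(text).get(domain, {})
    configured = entry.get("configured", [])
    # Assumption: a realm that is not joined reports "configured: no".
    return bool(configured) and configured[0] != "no"
```

Repeated keys such as required-package are collected into lists, so none of the entries are lost.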
Comment 3 David Spurek 2013-10-17 08:33:03 EDT
Immediately after the failed join command I can see the following in /var/log/messages:

Oct 17 08:25:26 ibm-p740-01-lp1 dbus-daemon: dbus[718]: [system] Activating service name='org.freedesktop.realmd' (using servicehelper)
Oct 17 08:25:26 ibm-p740-01-lp1 dbus[718]: [system] Activating service name='org.freedesktop.realmd' (using servicehelper)
Oct 17 08:25:26 ibm-p740-01-lp1 dbus-daemon: dbus[718]: [system] Successfully activated service 'org.freedesktop.realmd'
Oct 17 08:25:26 ibm-p740-01-lp1 dbus[718]: [system] Successfully activated service 'org.freedesktop.realmd'
Oct 17 08:25:28 ibm-p740-01-lp1 realmd: * Resolving: _ldap._tcp.ad.baseos.qe
Oct 17 08:25:29 ibm-p740-01-lp1 realmd: * Performing LDAP DSE lookup on: 10.34.37.22
Oct 17 08:25:30 ibm-p740-01-lp1 realmd: * Performing LDAP DSE lookup on: 2620:52:0:2223:1dfe:a8ea:f0d8:380c
Oct 17 08:25:30 ibm-p740-01-lp1 realmd: * Performing LDAP DSE lookup on: 2620:52:0:2223::1:1
Oct 17 08:25:30 ibm-p740-01-lp1 realmd: * Successfully discovered: ad.baseos.qe
Oct 17 08:25:30 ibm-p740-01-lp1 realmd: * Required files: /usr/sbin/oddjobd, /usr/libexec/oddjob/mkhomedir, /usr/sbin/sssd, /usr/bin/net
Oct 17 08:25:30 ibm-p740-01-lp1 realmd: * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.HUZR4W -U Amy-admin ads join ad.baseos.qe
Oct 17 08:25:34 ibm-p740-01-lp1 realmd: Enter Amy-admin's password:print_kdc_line: can't resolve name for kdc with non-default port [2620:52:0:2223::1:1]. Error Name or service not known


After approximately a minute I can see:
Oct 17 08:26:19 ibm-p740-01-lp1 realmd: DNS update failed: NT_STATUS_UNSUCCESSFUL
Oct 17 08:26:19 ibm-p740-01-lp1 realmd: 
Oct 17 08:26:19 ibm-p740-01-lp1 realmd: Using short domain name -- AD
Oct 17 08:26:19 ibm-p740-01-lp1 realmd: Joined 'IBM-P740-01-LP1' to dns domain 'ad.baseos.qe'
Oct 17 08:26:19 ibm-p740-01-lp1 realmd: DNS Update for ibm-p740-01-lp1.ad.baseos.qe failed: ERROR_DNS_UPDATE_FAILED
Oct 17 08:26:19 ibm-p740-01-lp1 realmd: * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.HUZR4W -U Amy-admin ads keytab create
Oct 17 08:26:43 ibm-p740-01-lp1 realmd: Enter Amy-admin's password:
Oct 17 08:26:43 ibm-p740-01-lp1 realmd: * /usr/bin/systemctl enable sssd.service
Oct 17 08:26:43 ibm-p740-01-lp1 realmd: ln -s '/usr/lib/systemd/system/sssd.service' '/etc/systemd/system/multi-user.target.wants/sssd.service'
Oct 17 08:26:43 ibm-p740-01-lp1 systemd: Reloading.
Oct 17 08:26:43 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-beaker-backend, ignoring: Invalid argument
Oct 17 08:26:43 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-fwd-backend, ignoring: Invalid argument
Oct 17 08:26:43 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:9] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:43 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/ntpd.service:10] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:43 ibm-p740-01-lp1 realmd: * /usr/bin/systemctl restart sssd.service
Oct 17 08:26:43 ibm-p740-01-lp1 systemd: Starting System Security Services Daemon...
Oct 17 08:26:43 ibm-p740-01-lp1 sssd: Starting up
Oct 17 08:26:44 ibm-p740-01-lp1 sssd[be[ad.baseos.qe]]: Starting up
Oct 17 08:26:44 ibm-p740-01-lp1 sssd[nss]: Starting up
Oct 17 08:26:44 ibm-p740-01-lp1 sssd[pam]: Starting up
Oct 17 08:26:44 ibm-p740-01-lp1 systemd: Started System Security Services Daemon.
Oct 17 08:26:44 ibm-p740-01-lp1 realmd: * /usr/bin/sh -c /usr/sbin/authconfig --update --enablesssd --enablesssdauth --enablemkhomedir --nostart && /usr/bin/systemctl enable oddjobd.service
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: Reloading.
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-beaker-backend, ignoring: Invalid argument
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-fwd-backend, ignoring: Invalid argument
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:9] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/ntpd.service:10] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: Reloading.
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-beaker-backend, ignoring: Invalid argument
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-fwd-backend, ignoring: Invalid argument
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:9] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/ntpd.service:10] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: Reloading.
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-beaker-backend, ignoring: Invalid argument
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:4] Failed to add dependency on beah-fwd-backend, ignoring: Invalid argument
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/beah-srv.service:9] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:45 ibm-p740-01-lp1 systemd: [/usr/lib/systemd/system/ntpd.service:10] Unknown lvalue 'ControlGroup' in section 'Service'
Oct 17 08:26:45 ibm-p740-01-lp1 realmd: * Successfully enrolled machine in realm



Maybe this output will be helpful.
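
As a rough check of the timing in the log above, the gap between the D-Bus activation of realmd and the final "Successfully enrolled" message can be computed from the syslog timestamps. This is a small Python 3 sketch. The 25-second figure is an assumption here (it is the usual libdbus default reply timeout and is not stated in this bug), and the year is supplied by hand because syslog stamps omit it.

```python
from datetime import datetime


def syslog_time(line, year=2013):
    """Parse the leading 'Oct 17 08:25:26' stamp of a syslog line.

    Month names are parsed with %b, so this assumes an English/C locale.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


first = ("Oct 17 08:25:26 ibm-p740-01-lp1 dbus[718]: [system] Activating "
         "service name='org.freedesktop.realmd' (using servicehelper)")
last = ("Oct 17 08:26:45 ibm-p740-01-lp1 realmd: "
        " * Successfully enrolled machine in realm")

elapsed = (syslog_time(last) - syslog_time(first)).total_seconds()

# Assumption: the caller used the libdbus default reply timeout of 25 s.
DEFAULT_REPLY_TIMEOUT = 25.0

print("join took %.0f s, exceeds default timeout: %s"
      % (elapsed, elapsed > DEFAULT_REPLY_TIMEOUT))
```

For these two lines the join took 79 seconds, so a caller that gives up after a much shorter reply timeout would see NoReply while realmd is still working.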
Comment 4 Patrik Kis 2013-10-23 11:51:26 EDT
After the test was rewritten to use lmishell, this problem also appeared on x86_64.
The test works as follows:

    conn = lmishell.connect(HOST, USER, PASSWD)
    self.assertTrue(isinstance(conn, lmishell.LMIConnection), "Couldn't connect to remote provider")

    realmsrv = conn.root.cimv2.LMI_RealmdService.first_instance()
    self.assertTrue(realmsrv, "Unable to get first instance of LMI_RealmdService class")

    print "Joining to the domain %s with user %s" % (REALMD_DOMAIN,REALMD_USER)

    realmsrv.JoinDomain(Password=REALMD_USER_PASS, User=REALMD_USER, Domain=REALMD_DOMAIN)
    dom = realmsrv.Domain
    if (dom):
          print "Successfully joined to the domain: " + REALMD_DOMAIN
    self.assertTrue(dom, "Failed to join to the domain: " + REALMD_DOMAIN)


and it fails like this:

Joining to the domain ad.baseos.qe with user Amy-admin
F
======================================================================
FAIL: test_join (realmd_lmi.TestRealmdFunctions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "realmd_lmi.py", line 84, in test_join
    self.assertTrue(dom, "Failed to join to the domain: " + REALMD_DOMAIN)
AssertionError: Failed to join to the domain: ad.baseos.qe

----------------------------------------------------------------------
Ran 1 test in 26.667s

FAILED (failures=1)


But if the logs are checked at this point, they show that realmd is still joining and has not finished the process:

Oct 23 11:45:52 intel-piketon-02 dbus-daemon: dbus[652]: [system] Activating service name='org.freedesktop.realmd' (using servicehelper)
Oct 23 11:45:52 intel-piketon-02 dbus[652]: [system] Activating service name='org.freedesktop.realmd' (using servicehelper)
Oct 23 11:45:52 intel-piketon-02 dbus-daemon: dbus[652]: [system] Successfully activated service 'org.freedesktop.realmd'
Oct 23 11:45:52 intel-piketon-02 dbus[652]: [system] Successfully activated service 'org.freedesktop.realmd'
Oct 23 11:45:53 intel-piketon-02 realmd: * Resolving: _ldap._tcp.ad.baseos.qe
Oct 23 11:45:54 intel-piketon-02 realmd: * Performing LDAP DSE lookup on: 10.34.37.22
Oct 23 11:45:54 intel-piketon-02 realmd: * Performing LDAP DSE lookup on: 2620:52:0:2223::1:1
Oct 23 11:45:54 intel-piketon-02 realmd: * Performing LDAP DSE lookup on: 2620:52:0:2223:1dfe:a8ea:f0d8:380c
Oct 23 11:45:54 intel-piketon-02 realmd: * Successfully discovered: ad.baseos.qe
Oct 23 11:45:54 intel-piketon-02 realmd: * Required files: /usr/sbin/oddjobd, /usr/libexec/oddjob/mkhomedir, /usr/sbin/sssd, /usr/bin/net
Oct 23 11:45:54 intel-piketon-02 realmd: * LANG=C LOGNAME=root /usr/bin/net -s /var/cache/realmd/realmd-smb-conf.X5VN5W -U Amy-admin ads join ad.baseos.qe
HOSTNAME_SHORT variable contains: INTEL-PIKETON-0
user_principal=INTEL-PIKETON-0
REALM_DOMAIN_UPPER variable contains: AD.BASEOS.QE
REALM_DOMAIN_SHORT variable contains: AD


And if we wait for a while (sometimes it takes more than a minute), the system is joined to the domain.
So the problem clearly is that
realmsrv.JoinDomain(Password=REALMD_USER_PASS, User=REALMD_USER, Domain=REALMD_DOMAIN)
returns before the joining process is finished.
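
Until the provider waits for realmd itself, a test could poll for the joined state instead of checking it once. The helper below is a minimal Python 3 sketch of that idea: `wait_until` and its parameters are hypothetical names, and it is a test-side workaround, not the fix for the provider.

```python
import time


def wait_until(predicate, timeout=120.0, interval=2.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call predicate() until it returns a truthy value or `timeout` seconds pass.

    Returns True as soon as the predicate succeeds, False on timeout.
    The predicate is always evaluated at least once.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Hypothetical usage in the test above (assumes the instance has to be
# re-read so that realmsrv.Domain reflects the current state):
#
#     joined = wait_until(lambda: bool(current_domain()), timeout=180)
#     self.assertTrue(joined, "Failed to join to the domain: " + REALMD_DOMAIN)
#
# where current_domain() is a placeholder for however the test re-fetches
# the Domain property.
```

A monotonic clock is used so that wall-clock adjustments during the join do not shorten or extend the wait.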
Comment 5 Tomas Smetana 2013-10-24 10:31:17 EDT
I'm having trouble starting the realmd service itself on PPC:

Oct 24 10:22:45 ibm-p740-01-lp4 systemd: Starting Realm and Domain Configuration...
Oct 24 10:22:45 ibm-p740-01-lp4 realmd: couldn't claim service name on DBus bus: org.freedesktop.realmd
Oct 24 10:22:45 ibm-p740-01-lp4 realmd: ** Message: couldn't claim service name on DBus bus: org.freedesktop.realmd
Oct 24 10:24:15 ibm-p740-01-lp4 systemd: realmd.service operation timed out. Terminating.

It's quite possible that the provider shouldn't time out so soon; however, if the realmd service doesn't start at all, a longer timeout will not help.
Comment 6 Patrik Kis 2013-10-24 10:39:18 EDT
Is it the same with realm command line tool?
Comment 7 Tomas Smetana 2013-10-25 04:19:01 EDT
(In reply to Patrik Kis from comment #6)
> Is it the same with realm command line tool?

I still don't completely understand what is happening here. Here's the theory:

* There are some AVC messages in the log: SELinux is apparently blocking something -- we need to report the AVCs to the SELinux guys.
* Even in permissive mode I'm unable to start realmd after an unsuccessful join attempt, and I'm inclined to blame the systemd socket activation mechanism for the issues. First, the D-Bus message times out because systemd takes too long to start the daemon. Second, it looks like systemd claims the D-Bus service name, so realmd fails to start manually afterwards.

I can add a longer timeout to the D-Bus message send routine (it currently uses the default) to work around the long socket activation time. I'm not sure how big the problem with systemd occupying the D-Bus name is, though, and I can't do anything about it.
Comment 8 David Spurek 2013-10-25 04:37:44 EDT
There were AVC messages, but they should be fixed in the newest selinux-policy; see https://bugzilla.redhat.com/show_bug.cgi?id=1020301
Comment 11 Tomas Smetana 2013-12-12 04:09:21 EST
I have increased the timeout to DBUS_TIMEOUT_INFINITE, which should effectively mean we wait for the realmd daemon to report success/failure.
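
The effect of that change can be illustrated with a small simulation. This is not the provider's code (the provider is not written in Python, and no D-Bus call is made here); it is a Python 3 sketch in which a worker thread stands in for realmd, a queue stands in for the reply, and `timeout=None` plays the role of DBUS_TIMEOUT_INFINITE. All names are hypothetical.

```python
import queue
import threading
import time


class NoReply(Exception):
    """Stands in for org.freedesktop.DBus.Error.NoReply in this simulation."""


def call_with_reply_timeout(work, timeout):
    """Run work() in a thread and wait up to `timeout` seconds for its result.

    timeout=None waits indefinitely, analogous to an infinite reply timeout.
    """
    replies = queue.Queue(maxsize=1)
    worker = threading.Thread(target=lambda: replies.put(work()), daemon=True)
    worker.start()
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        # The worker keeps running; only the caller stopped waiting.
        raise NoReply("Did not receive a reply") from None


def slow_join():
    time.sleep(0.2)  # stands in for a join that takes about a minute
    return "joined"


try:
    call_with_reply_timeout(slow_join, timeout=0.05)
except NoReply as exc:
    print("finite timeout:", exc)

print("infinite timeout:", call_with_reply_timeout(slow_join, timeout=None))
```

With the short timeout the caller reports an error even though the work completes later, which matches the symptom in this bug; with no timeout the caller sees the real outcome.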
Comment 19 Ludek Smid 2014-06-13 09:06:45 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
