Bug 985940 - authentication does not failover when ldap server is unreachable
Status: CLOSED DUPLICATE of bug 973566
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.2.2
Assigned To: Yair Zaslavsky
Whiteboard: infra
Keywords: Triaged
Depends On:
Blocks:
Reported: 2013-07-18 10:38 EDT by Matthew Davis
Modified: 2016-02-10 14:34 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-21 08:20:45 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 16859 None None None Never

Description Matthew Davis 2013-07-18 10:38:18 EDT
Description of problem:

We have a number of machines in our IdM environment. Unfortunately, the first (highest-ranked) server listed in our IdM environment is unreachable from one of our RHEV-M hosts.

In previous versions of RHEV-M, the first login after an ovirt-engine restart would take a while to complete because it timed out trying to reach the first server. Subsequent logins would then go directly to the second LDAP server listed.

The current behavior does not fail over. It just denies logins.


Version-Release number of selected component (if applicable):
[root@rhevm.rdu ovirt-engine]$ rpm -q rhevm
rhevm-3.2.1-0.39.el6ev.noarch


How reproducible:
Every time

Steps to Reproduce:
1. Add an IdM environment that has many masters
2. Make the highest-ranked IdM server in the domain unreachable from RHEV-M
3. Attempt a login

Actual results:
The login never succeeds

Expected results:
Possibly the first login takes longer, but subsequent logins work quickly.

Additional info:
Comment 1 Matthew Davis 2013-07-18 10:40:04 EDT
In case it matters, this is our DNS config. idm1.phx is unreachable from my RHEV-M host.

;; ANSWER SECTION:
_kerberos._tcp.salab.redhat.com. 300 IN SRV     4 100 88 idm2.rdu.salab.redhat.com.
_kerberos._tcp.salab.redhat.com. 300 IN SRV     0 100 88 idm1.phx.salab.redhat.com.
_kerberos._tcp.salab.redhat.com. 300 IN SRV     1 100 88 idm1.rdu.salab.redhat.com.
_kerberos._tcp.salab.redhat.com. 300 IN SRV     3 100 88 idm2.phx.salab.redhat.com.
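
For reference, a minimal Python sketch of how the records above rank, assuming servers are tried in ascending SRV priority as RFC 2782 describes. Whether the engine orders its LDAP servers exactly this way is an assumption; the names SrvRecord, parse_srv_line and order_by_priority are hypothetical and not taken from ovirt-engine.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_line(line: str) -> SrvRecord:
    """Parse one dig-style SRV answer line (hypothetical helper)."""
    # Fields: name, TTL, class, type, priority, weight, port, target
    fields = line.split()
    priority, weight, port, target = fields[4:8]
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def order_by_priority(records: List[SrvRecord]) -> List[SrvRecord]:
    # RFC 2782: the lowest priority value is contacted first.
    # Weighted random selection among equal priorities is omitted here.
    return sorted(records, key=lambda r: r.priority)


ANSWER_SECTION = """
_kerberos._tcp.salab.redhat.com. 300 IN SRV     4 100 88 idm2.rdu.salab.redhat.com.
_kerberos._tcp.salab.redhat.com. 300 IN SRV     0 100 88 idm1.phx.salab.redhat.com.
_kerberos._tcp.salab.redhat.com. 300 IN SRV     1 100 88 idm1.rdu.salab.redhat.com.
_kerberos._tcp.salab.redhat.com. 300 IN SRV     3 100 88 idm2.phx.salab.redhat.com.
"""

records = [parse_srv_line(line) for line in ANSWER_SECTION.strip().splitlines()]
ordered_targets = [r.target for r in order_by_priority(records)]
for target in ordered_targets:
    print(target)
```

Under that assumption, idm1.phx (priority 0) is contacted first and idm1.rdu (priority 1) is the next candidate.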


I get the following errors in the RHEV-M logs.

2013-07-18 10:22:32,413 ERROR [org.ovirt.engine.core.bll.adbroker.GetRootDSE] (ajp-/127.0.0.1:8702-38) Failed to query rootDSE for LDAP server LDAP://idm1.phx.salab.redhat.com:389 due to connection timeout
2013-07-18 10:22:32,415 ERROR [org.ovirt.engine.core.bll.adbroker.DirectorySearcher] (ajp-/127.0.0.1:8702-38) Failed ldap search server LDAP://idm1.phx.salab.redhat.com:389 using user mdavis@SALAB.REDHAT.COM due to connection timeout. We should try the next server
2013-07-18 10:22:32,415 ERROR [org.ovirt.engine.core.bll.adbroker.LdapBrokerCommandBase] (ajp-/127.0.0.1:8702-38) Failed to run command LdapAuthenticateUserCommand. Domain is salab.redhat.com. User is mdavis.
2013-07-18 10:22:32,416 ERROR [org.ovirt.engine.core.bll.LoginAdminUserCommand] (ajp-/127.0.0.1:8702-38) USER_FAILED_TO_AUTHENTICATE : mdavis
2013-07-18 10:22:32,416 WARN  [org.ovirt.engine.core.bll.LoginAdminUserCommand] (ajp-/127.0.0.1:8702-38) CanDoAction of action LoginAdminUser failed. Reasons:USER_FAILED_TO_AUTHENTICATE



It even says it should try the next server, but never does.
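
A rough Python sketch of the failover behavior I would expect, for illustration only. It is not the engine's code: FailoverAuthenticator, AllServersFailed and fake_try_server are hypothetical names, and the real LDAP bind is replaced by a stand-in function that times out for idm1.phx.

```python
from typing import Callable, List, Optional, Tuple


class AllServersFailed(Exception):
    """Raised when no server in the list could be reached."""


class FailoverAuthenticator:
    def __init__(self, servers: List[str],
                 try_server: Callable[[str, str], None]) -> None:
        self._servers = list(servers)
        self._try_server = try_server  # raises on timeout/connection error
        self._last_good: Optional[str] = None

    def authenticate(self, user: str) -> str:
        """Return the server that answered, trying the last good one first."""
        candidates = list(self._servers)
        if self._last_good in candidates:
            candidates.remove(self._last_good)
            candidates.insert(0, self._last_good)
        errors: List[Tuple[str, Exception]] = []
        for server in candidates:
            try:
                self._try_server(server, user)
            except (TimeoutError, ConnectionError) as exc:
                # Unreachable server: record the error and move on to the
                # next one instead of failing the whole login.
                errors.append((server, exc))
                continue
            self._last_good = server
            return server
        raise AllServersFailed(errors)


attempts: List[str] = []


def fake_try_server(server: str, user: str) -> None:
    # Stand-in for a real LDAP bind; only idm1.phx is unreachable.
    attempts.append(server)
    if server == "idm1.phx.salab.redhat.com":
        raise TimeoutError("connection timeout")


auth = FailoverAuthenticator(
    ["idm1.phx.salab.redhat.com", "idm1.rdu.salab.redhat.com"],
    fake_try_server,
)
first = auth.authenticate("mdavis")   # times out on idm1.phx, then idm1.rdu
second = auth.authenticate("mdavis")  # goes straight to idm1.rdu
```

Only connection-level errors trigger the move to the next server in this sketch; a rejected password would propagate from try_server and should not cause a retry elsewhere.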
Comment 2 Matthew Davis 2013-07-18 15:08:34 EDT
A workaround is to use the -ldapServers parameter in rhevm-manage-domains.

# rhevm-manage-domains -action=add -domain=SALAB.REDHAT.COM -provider=IPA -user=admin -interactive -ldapServers=$SERVER1,$SERVER2

This is working as a suitable workaround.
Comment 3 Yair Zaslavsky 2013-07-21 08:15:17 EDT
May we get full logs?
I think I know what is causing this, but I would like to be sure.
Comment 4 Yair Zaslavsky 2013-07-21 08:16:03 EDT
To be clear, I mean engine.log (plus rotations such as engine.log.1, etc., if they exist) and server.log.
Comment 5 Yair Zaslavsky 2013-07-21 08:17:51 EDT
Attached an external tracker to oVirt gerrit with a patch that might solve the issue.
Comment 6 Yair Zaslavsky 2013-07-21 08:20:45 EDT
Actually, looking again at the bug description, the patch DOES solve this. We saw a similar issue in these bugs:

BZ973566
BZ974148

Moving to closed-duplicate.

*** This bug has been marked as a duplicate of bug 973566 ***
