Bug 973566 - [rhevm-manage-domains] RHEVM doesn't try the next LDAP server when child domain controller not active.
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64 Linux
Priority: unspecified, Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Martin Perina
QA Contact: Ondra Machacek
Whiteboard: infra
Keywords: Reopened
Duplicates: 985940 1032143
Depends On:
Blocks:
Reported: 2013-06-12 04:43 EDT by vvyazmin@redhat.com
Modified: 2016-02-10 14:18 EST

See Also:
Fixed In Version: is7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-11 09:27:42 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
## Logs rhevm (410.64 KB, application/x-xz)
2013-06-12 04:43 EDT, vvyazmin@redhat.com
oVirt 3.3 test engine.log (7.08 KB, application/x-compressed-tar)
2013-07-11 09:17 EDT, Martin Perina


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 803513 None None None Never
oVirt gerrit 16859 None None None Never

Description vvyazmin@redhat.com 2013-06-12 04:43:01 EDT
Created attachment 760024 [details]
## Logs rhevm

Description of problem: RHEVM doesn't try the next LDAP server when the child domain controller is not active.

Version-Release number of selected component (if applicable):
RHEVM 3.2 - SF17.1 environment: 

RHEVM: rhevm-3.2.0-11.28.el6ev.noarch 
VDSM: vdsm-4.10.2-21.0.el6ev.x86_64 
LIBVIRT: libvirt-0.10.2-18.el6_4.5.x86_64 
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.3.x86_64 
SANLOCK: sanlock-2.6-2.el6.x86_64

How reproducible:
100%

Related to BZ675749

Steps to Reproduce:
1. Create Windows Domain Controller (Master Domain Controller) - qa1-tlv.qa.lab.tlv.redhat.com
2. Add additional (Child) Domain Controller (Slave Domain Controller) - qa2-tlv.qa.lab.tlv.redhat.com
3. Register RHEVM with the domain using the “rhevm-manage-domains” tool:
   rhevm-manage-domains -action=add -domain=qa.lab.tlv.redhat.com -user=kokomen -interactive -addPermissions -provider=ActiveDirectory
4. Verify that everything works and that you can log in with an LDAP user.
5. Power off the Child Domain Controller.
  
Actual results:
Login with an LDAP user fails because RHEVM keeps sending requests to the Child Domain Controller and doesn't try the next LDAP server.

Expected results:
Login with an LDAP user succeeds.

Impact on user:
LDAP users cannot log in.

Workaround:
In the /etc/hosts file, map the Child Domain Controller hostname to the IP address of the Master Domain Controller.

Additional info:

/var/log/ovirt-engine/engine.log

2013-06-05 15:10:20,332 ERROR [org.ovirt.engine.core.bll.adbroker.GetRootDSE] (QuartzScheduler_Worker-9) [1046cc64] Failed to query rootDSE for LDAP server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 due to connection timeout
2013-06-05 15:10:20,332 ERROR [org.ovirt.engine.core.bll.adbroker.DirectorySearcher] (QuartzScheduler_Worker-9) [1046cc64] Failed ldap search server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 using user vdcadmin@QA.LAB.TLV.REDHAT.COM due to connection timeout. We should try the next server
2013-06-05 15:10:20,332 ERROR [org.ovirt.engine.core.bll.adbroker.LdapBrokerCommandBase] (QuartzScheduler_Worker-9) [1046cc64] Failed to run command LdapSearchUserByQueryCommand. Domain is qa.lab.tlv.redhat.com. User is vdcadmin@QA.LAB.TLV.REDHAT.COM.}

/var/log/vdsm/vdsm.log
Comment 1 Yair Zaslavsky 2013-06-25 11:41:08 EDT
Worth mentioning that the initial order of ldap servers depends on the priorities of the SRV records (dig SRV _ldap._tcp.<DNS_DOMAIN> )
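[Editor's illustration] To make the point about SRV priorities concrete, here is a minimal sketch of how a client might turn SRV answers into an ordered server list. It is not the engine's code. The `SrvRecord` type, the `order_ldap_servers` helper, and the priority and weight values are all hypothetical. Lower priority values are tried first (per RFC 2782); the weight handling is simplified to a plain sort rather than the weighted random selection the RFC describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """One SRV answer, e.g. from: dig SRV _ldap._tcp.<DNS_DOMAIN>"""
    priority: int
    weight: int
    port: int
    target: str


def order_ldap_servers(records):
    """Return LDAP URLs in the order a client would try them.

    Lower priority comes first. Ties are broken by higher weight and then
    by name, which is a deterministic simplification of RFC 2782.
    """
    ordered = sorted(records, key=lambda r: (r.priority, -r.weight, r.target))
    return [f"LDAP://{r.target.rstrip('.')}:{r.port}" for r in ordered]


# Hypothetical SRV data: the priority values are invented for illustration.
records = [
    SrvRecord(priority=10, weight=100, port=389,
              target="qa2-tlv.qa.lab.tlv.redhat.com."),
    SrvRecord(priority=0, weight=100, port=389,
              target="qa1-tlv.qa.lab.tlv.redhat.com."),
]

for url in order_ldap_servers(records):
    print(url)
```

With these invented values the master (qa1-tlv) is listed first. If the records were published with the priorities swapped, the powered-off child would be contacted first, which is the situation in which a missing failover becomes visible.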
Comment 2 Martin Perina 2013-07-11 09:17:31 EDT
Created attachment 772217 [details]
oVirt 3.3 test engine.log
Comment 3 Martin Perina 2013-07-11 09:27:42 EDT
I've not been able to reproduce this bug on the current oVirt 3.3 codebase. Here are the test steps:
 
1) Add domain qa.lab.tlv.redhat.com

$ ./bin/engine-manage-domains -action=add -domain=qa.lab.tlv.redhat.com -user=vdcadmin -interactive -addPermissions -provider=ActiveDirectory
Enter password:

Successfully added domain qa.lab.tlv.redhat.com. oVirt Engine restart is required in order for the changes to take place (service ovirt-engine restart).
Manage Domains completed successfully

$ ./bin/engine-manage-domains -action=list
Domain: qa.lab.tlv.redhat.com
        User name: vdcadmin@QA.LAB.TLV.REDHAT.COM
Manage Domains completed successfully

$ ./bin/engine-manage-domains -action=validate
Cannot connect to LDAP server qa2-tlv.qa.lab.tlv.redhat.com:389. Trying next LDAP server in list (if exists)
Domain qa.lab.tlv.redhat.com is valid.
The configured user for domain qa.lab.tlv.redhat.com is vdcadmin@QA.LAB.TLV.REDHAT.COM
Manage Domains completed successfully


2) Start engine and log in as vdcadmin@qa.lab.tlv.redhat.com => user logged in successfully
   (I've added engine.log as attachment and I've also added logging to see what LDAP servers are configured for domain)


If I repeat those steps using RHEVM 3.2 SF18, there's an error in engine.log:

2013-07-11 15:02:25,193 ERROR [org.ovirt.engine.core.bll.adbroker.GetRootDSE] (QuartzScheduler_Worker-1) Failed to query rootDSE for LDAP server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 due to connection timeout
2013-07-11 15:02:25,200 ERROR [org.ovirt.engine.core.bll.adbroker.DirectorySearcher] (QuartzScheduler_Worker-1) Failed ldap search server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 using user vdcadmin@QA.LAB.TLV.REDHAT.COM due to connection timeout. We should try the next server
2013-07-11 15:02:25,200 ERROR [org.ovirt.engine.core.bll.adbroker.LdapBrokerCommandBase] (QuartzScheduler_Worker-1) Failed to run command LdapSearchUserByQueryCommand. Domain is qa.lab.tlv.redhat.com. User is vdcadmin@QA.LAB.TLV.REDHAT.COM.}
2013-07-11 15:02:25,201 WARN  [org.ovirt.engine.core.bll.DbUserCacheManager] (QuartzScheduler_Worker-1) User vdcadmin@QA.LAB.TLV.REDHAT.COM not found in directory sevrer, its status switched to InActive
2013-07-11 15:02:55,225 ERROR [org.ovirt.engine.core.bll.adbroker.GetRootDSE] (ajp-/127.0.0.1:8702-4) Failed to query rootDSE for LDAP server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 due to connection timeout
2013-07-11 15:02:55,226 ERROR [org.ovirt.engine.core.bll.adbroker.DirectorySearcher] (ajp-/127.0.0.1:8702-4) Failed ldap search server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 using user vdcadmin@QA.LAB.TLV.REDHAT.COM due to connection timeout. We should try the next server
2013-07-11 15:02:55,227 ERROR [org.ovirt.engine.core.bll.adbroker.LdapBrokerCommandBase] (ajp-/127.0.0.1:8702-4) Failed to run command LdapAuthenticateUserCommand. Domain is qa.lab.tlv.redhat.com. User is vdcadmin.}
2013-07-11 15:02:55,228 ERROR [org.ovirt.engine.core.bll.LoginAdminUserCommand] (ajp-/127.0.0.1:8702-4) USER_FAILED_TO_AUTHENTICATE : vdcadmin
2013-07-11 15:02:55,229 WARN  [org.ovirt.engine.core.bll.LoginAdminUserCommand] (ajp-/127.0.0.1:8702-4) CanDoAction of action LoginAdminUser failed. Reasons:USER_FAILED_TO_AUTHENTICATE


and user vdcadmin@qa.lab.tlv.redhat.com cannot log in.
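[Editor's illustration] When comparing engine.log files between versions, it can help to count which LDAP servers the rootDSE query timed out against. The sketch below assumes only the message format visible in the excerpts above. The `failed_root_dse_servers` helper is hypothetical and is not part of any oVirt tool.

```python
import re
from collections import Counter

# Matches the GetRootDSE error message as it appears in the engine.log excerpts.
ROOT_DSE_TIMEOUT = re.compile(
    r"Failed to query rootDSE for LDAP server (LDAP://\S+) due to connection timeout"
)


def failed_root_dse_servers(lines):
    """Count rootDSE connection timeouts per LDAP server URL."""
    counts = Counter()
    for line in lines:
        match = ROOT_DSE_TIMEOUT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two lines copied from the RHEVM 3.2 SF18 excerpt above.
sample = [
    "2013-07-11 15:02:25,193 ERROR [org.ovirt.engine.core.bll.adbroker.GetRootDSE] "
    "(QuartzScheduler_Worker-1) Failed to query rootDSE for LDAP server "
    "LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 due to connection timeout",
    "2013-07-11 15:02:55,225 ERROR [org.ovirt.engine.core.bll.adbroker.GetRootDSE] "
    "(ajp-/127.0.0.1:8702-4) Failed to query rootDSE for LDAP server "
    "LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 due to connection timeout",
]

for server, count in failed_root_dse_servers(sample).items():
    print(server, count)
```

If only the powered-off controller ever appears in the result and no other server is attempted afterwards, the log is consistent with the engine not moving on to the next server.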
Comment 4 Martin Perina 2013-07-15 03:51:07 EDT
I've found out that there's still an error even in oVirt 3.3. When I modify the list of the domain's LDAP servers (so that the powered-off servers are returned first), the user cannot log in and the following errors appear:

2013-07-15 09:45:51,180 INFO  [org.ovirt.engine.core.bll.adbroker.DirectorySearcher] (http--0.0.0.0-8080-1) Ldap server list: LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389, LDAP://qa1.qa.lab.tlv.redhat.com:389
2013-07-15 09:46:21,753 ERROR [org.ovirt.engine.core.bll.adbroker.LdapSearchExceptionHandler] (http--0.0.0.0-8080-1) Error in communicating with LDAP server qa2-tlv.qa.lab.tlv.redhat.com:389; nested exception is javax.naming.CommunicationException: qa2-tlv.qa.lab.tlv.redhat.com:389 [Root exception is java.net.SocketTimeoutException: connect timed out]
2013-07-15 09:46:21,756 ERROR [org.ovirt.engine.core.bll.adbroker.DirectorySearcher] (http--0.0.0.0-8080-1) Failed ldap search server LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389 using user vdcadmin@QA.LAB.TLV.REDHAT.COM due to connection timeout. We should try the next server
2013-07-15 09:46:21,757 ERROR [org.ovirt.engine.core.bll.adbroker.LdapBrokerCommandBase] (http--0.0.0.0-8080-1) Failed to run command LdapAuthenticateUserCommand. Domain is qa.lab.tlv.redhat.com. User is vdcadmin.
2013-07-15 09:46:21,758 ERROR [org.ovirt.engine.core.bll.LoginAdminUserCommand] (http--0.0.0.0-8080-1) USER_FAILED_TO_AUTHENTICATE : vdcadmin
2013-07-15 09:46:21,758 WARN  [org.ovirt.engine.core.bll.LoginAdminUserCommand] (http--0.0.0.0-8080-1) CanDoAction of action LoginAdminUser failed. Reasons:USER_FAILED_TO_AUTHENTICATE

I will continue to investigate this.
Comment 5 Martin Perina 2013-07-15 08:20:33 EDT
The bug has already been partially resolved upstream. The only remaining error was in the root DSE query code block: when the first LDAP server in the list was not available, an uncaught RuntimeException prevented the next LDAP server from being queried and made the login fail immediately.
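[Editor's illustration] The failure mode described here can be sketched in a few lines. This is not the oVirt code (which is Java) and not the actual fix. `query_first_available`, `fake_root_dse_query`, and `AllLdapServersFailed` are hypothetical names. The point is only that the per-server query has to be wrapped so that an unexpected exception moves the loop on to the next server instead of aborting it.

```python
class AllLdapServersFailed(Exception):
    """Raised only after every server in the list has been tried."""


def query_first_available(servers, query_root_dse):
    """Try each server in order and return (server, result) for the first success.

    Any exception from one server is recorded and the next server is tried.
    Letting it propagate here would reproduce the reported behaviour: the
    first unreachable server fails the whole login.
    """
    errors = []
    for server in servers:
        try:
            return server, query_root_dse(server)
        except Exception as exc:  # deliberately broad, covers unexpected runtime errors
            errors.append((server, exc))
    raise AllLdapServersFailed(errors)


def fake_root_dse_query(server):
    """Stand-in for a real rootDSE query: the child controller is powered off."""
    if "qa2-tlv" in server:
        raise RuntimeError("connect timed out")
    return {"defaultNamingContext": "DC=qa,DC=lab,DC=tlv,DC=redhat,DC=com"}


servers = [
    "LDAP://qa2-tlv.qa.lab.tlv.redhat.com:389",  # unreachable, listed first
    "LDAP://qa1-tlv.qa.lab.tlv.redhat.com:389",
]

used, root_dse = query_first_available(servers, fake_root_dse_query)
print(used)
```

Catching broadly inside the loop is a deliberate choice for this sketch. A real implementation would probably distinguish connectivity failures from authentication errors, since bad credentials should not be retried against every server.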
Comment 6 Yair Zaslavsky 2013-07-21 08:20:46 EDT
*** Bug 985940 has been marked as a duplicate of this bug. ***
Comment 10 Yair Zaslavsky 2013-12-07 07:30:07 EST
*** Bug 1032143 has been marked as a duplicate of this bug. ***
Comment 11 Itamar Heim 2014-01-21 17:23:08 EST
Closing - RHEV 3.3 Released
Comment 12 Itamar Heim 2014-01-21 17:24:14 EST
Closing - RHEV 3.3 Released
Comment 13 Itamar Heim 2014-01-21 17:27:54 EST
Closing - RHEV 3.3 Released
