Bug 840421

Summary: [RFE][AAA] rhevm-manage-domains should contain rollback procedure in case of unexpected error during add domain procedure
Product: Red Hat Enterprise Virtualization Manager
Reporter: Tomas Dosek <tdosek>
Component: ovirt-engine-config
Assignee: Yair Zaslavsky <yzaslavs>
Status: CLOSED WONTFIX
QA Contact: Pavel Stehlik <pstehlik>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.1.0
CC: acathrow, alonbl, bazulay, iheim, jkt, oourfali, Rhev-m-bugs, yzaslavs
Target Milestone: ---
Keywords: FutureFeature, Reopened
Target Release: ---
Flags: dyasny: Triaged+
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-27 13:13:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1063095
Attachments:
Description: Engine.log; Flags: none
Description: manage domains log; Flags: none
Description: Dump of vdc_options table; Flags: none

Description Tomas Dosek 2012-07-16 10:32:50 UTC
Created attachment 598413 [details]
Engine.log

Description of problem:
If a user adds an IPA domain to RHEV-M using rhevm-manage-domains and the IPA server fails to communicate during the Kerberos key exchange, all domains that were previously added to RHEV-M become unusable and searches through these domains return no results.

Note that precise timing is the key to reproducing this bug.

Version-Release number of selected component (if applicable):
si10

How reproducible:
100 %

Steps to Reproduce:
1. Have a si10 RHEV-M setup with internal and some other domain (in my case Active Directory domain)
2. Prepare a command for adding IPA domain + prepare iptables command to block all communication to and from IPA server (in my case it was random reboot of server)
3. Run rhevm-manage-domains -action=add -interactive -domain=<domain> -user=<user> -provider=IPA
4. Right after entering the password, block all communication to the IPA server
  
Actual results:
During the operation a NullPointerException is printed to the terminal (not to the log, but to the terminal of the rhevm-manage-domains command), yet the operation appears to complete successfully.

After restarting ovirt-engine, searches in RHEV-M return no results, even though all the domains in RHEV-M have users in the db.

This can be fixed only by a total cleanup and reinstallation of RHEV-M.

Expected results:
Either IPA is added to RHEV-M, or a rollback is performed and RHEV-M is left in a functional state.

No NPE is shown to the user (it is written to the log instead) when adding another domain fails. A clear error message is shown to the user, and definitely not a success message.

Additional info:
Attaching full rhev-m log

Comment 2 Roy Golan 2012-07-19 08:50:15 UTC
Please supply the engine-manage-domains.log and a dump of the vdc_options table.

Comment 3 Tomas Dosek 2012-07-19 09:52:53 UTC
This is the error I get when adding the IPA domain:

[root@change-fqdn log]# rhevm-manage-domains -action=add -provider=IPA -domain=brq-ipa.rhev.lab.eng.brq.redhat.com -user=vdcadmin -interactive
Enter password:

Error:  exception message: java.lang.NullPointerException
	at org.ovirt.engine.core.utils.kerberos.KerberosConfigCheck$KerberosUtilCallbackHandler.handle(KerberosConfigCheck.java:85)
	at javax.security.auth.login.LoginContext$SecureCallbackHandler$1.run(LoginContext.java:970)
	at javax.security.auth.login.LoginContext$SecureCallbackHandler$1.run(LoginContext.java:967)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext$SecureCallbackHandler.handle(LoginContext.java:966)
	at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:824)
	at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:671)
	at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:559)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:784)
	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:698)
	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:696)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:695)
	at javax.security.auth.login.LoginContext.login(LoginContext.java:594)
	at org.ovirt.engine.core.utils.kerberos.KerberosConfigCheck.checkAuthentication(KerberosConfigCheck.java:225)
	at org.ovirt.engine.core.utils.kerberos.KerberosConfigCheck.authenticate(KerberosConfigCheck.java:216)
	at org.ovirt.engine.core.utils.kerberos.KerberosConfigCheck.validateKerberosInstallation(KerberosConfigCheck.java:155)
	at org.ovirt.engine.core.utils.kerberos.KerberosConfigCheck.checkInstallation(KerberosConfigCheck.java:144)
	at org.ovirt.engine.core.utils.kerberos.ManageDomains.checkKerberosConfiguration(ManageDomains.java:634)
	at org.ovirt.engine.core.utils.kerberos.ManageDomains.testConfiguration(ManageDomains.java:784)
	at org.ovirt.engine.core.utils.kerberos.ManageDomains.addDomain(ManageDomains.java:452)
	at org.ovirt.engine.core.utils.kerberos.ManageDomains.runCommand(ManageDomains.java:250)
	at org.ovirt.engine.core.utils.kerberos.ManageDomains.main(ManageDomains.java:175)

WARNING, domain: rhev.lab.eng.brq.redhat.com may not be functional: Failure while testing domain rhev.lab.eng.brq.redhat.com. Details: Kerberos error. Please check log for further details.
WARNING: No permissions were added to the Engine. Login either with the internal admin user or with another configured user.
Successfully added domain brq-ipa.rhev.lab.eng.brq.redhat.com. oVirt Engine restart is required in order for the changes to take place (service ovirt-engine restart).
Manage Domains completed successfully

I'll also attach the requested files in the next comment.
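
The trace above ends in a JAAS callback handler. The report does not say which reference was null, so the following is only a minimal sketch of how such a handler could fail with a descriptive message instead of an NPE. The class name NullSafeHandlerSketch and the message texts are hypothetical; this is not the actual KerberosUtilCallbackHandler code.

```java
import java.io.IOException;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;

// Hypothetical handler: checks its inputs before answering JAAS callbacks.
final class NullSafeHandlerSketch implements CallbackHandler {
    private final String user;
    private final char[] password;

    NullSafeHandlerSketch(String user, char[] password) {
        this.user = user;
        this.password = password;
    }

    @Override
    public void handle(Callback[] callbacks)
            throws IOException, UnsupportedCallbackException {
        for (Callback callback : callbacks) {
            if (callback instanceof NameCallback) {
                if (user == null) {
                    // Descriptive failure instead of a later NullPointerException.
                    throw new IOException("No user name was supplied for the Kerberos login");
                }
                ((NameCallback) callback).setName(user);
            } else if (callback instanceof PasswordCallback) {
                if (password == null) {
                    throw new IOException("No password was supplied for the Kerberos login");
                }
                ((PasswordCallback) callback).setPassword(password);
            } else {
                throw new UnsupportedCallbackException(callback, "Unsupported callback type");
            }
        }
    }
}
```

The caller can then log the exception and print its message, rather than letting a raw stack trace reach the terminal.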

Comment 4 Tomas Dosek 2012-07-19 09:54:26 UTC
Created attachment 599103 [details]
manage domains log

Comment 5 Tomas Dosek 2012-07-19 10:06:26 UTC
Created attachment 599106 [details]
Dump of vdc_options table

Comment 6 Roy Golan 2012-07-19 11:10:44 UTC
Your AdUserPassword record is empty and that's why domains can't be searched.

A few bugs were introduced by a new flag added to manage-domains in si10, and they have fixes in si11:

https://bugzilla.redhat.com/839264

*** This bug has been marked as a duplicate of bug 839264 ***

Comment 7 Tomas Dosek 2012-07-19 11:19:38 UTC
OK Roy, but you might have missed this section of the bug's description:

"Expected results:
IPA is either added to RHEV-M or a rollback operation arises and RHEV-M is in functional state.

None NPE is shown to user (it's written into log) in case of adding another domain fails. Some nice error message is shown to user and definitely not success message."

And I believe this really deserves a fix. I mean, do we have a rollback procedure in case of an unexpected exception? (As we can see from the bug you mentioned, there is none.) I'm reopening this as an RFE.

Comment 9 Roy Golan 2012-07-19 11:41:56 UTC
(In reply to comment #7)
> Ok Roy, but you might miss this section of bug's description:
> 
> "Expected results:
> IPA is either added to RHEV-M or a rollback operation arises and RHEV-M is
> in functional state.
> 
> None NPE is shown to user (it's written into log) in case of adding another
> domain fails. Some nice error message is shown to user and definitely not
> success message."
> 
> And I believe this really deserves a fix. I mean do we have some rollback
> procedure in case of unexpected exception? (as we can see on a bug that you
> mentioned - there's none). I'm reopening this as an RFE

Generally speaking, rollback is good when the system has been left in a raw state; that's why the cloned bug will take care of that.

Just to make things clear, manage-domains runs in a few steps when adding a domain:

1. Kerberos auth
2. LDAP query to get the user id
3. Create the krb5.conf file
4. Test all domains by Kerberos auth

If the tool fails with an unexpected exception in step 4, after all the configuration has been gathered successfully, then I don't see any reason for a rollback. Your system is ready to go, but you may have some network issues with your servers.

If you fail in steps 1, 2, or 3, you probably won't get a success message (I need to verify this).
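
A minimal sketch of the step flow described above, assuming each step either completes or throws. The names AddDomainFlowSketch, Step, runSteps, and summarize are hypothetical, and the failure message text is invented for illustration; only the success message is taken from the tool output in comment 3. The point is that a failure in any step stops the flow and prevents the success message.

```java
import java.util.Map;

// Hypothetical model of the add-domain flow: steps run in order, and the
// first failure stops the run.
final class AddDomainFlowSketch {
    /** One step of the flow; it signals failure by throwing. */
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs the steps in iteration order (pass a LinkedHashMap to keep order).
     * Returns the name of the first failed step, or null if all steps passed.
     */
    static String runSteps(Map<String, Step> steps) {
        for (Map.Entry<String, Step> entry : steps.entrySet()) {
            try {
                entry.getValue().run();
            } catch (Exception e) {
                // A real tool would write the stack trace to its log here.
                return entry.getKey();
            }
        }
        return null;
    }

    /** Builds the final message; success is reported only if nothing failed. */
    static String summarize(String failedStep) {
        if (failedStep == null) {
            return "Manage Domains completed successfully";
        }
        return "Failed to add domain: step '" + failedStep + "' did not complete";
    }
}
```

Whether a failure in step 4 should also count as a failure of the whole operation is exactly the open question in this thread; the sketch treats every step the same way.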

Comment 12 Yair Zaslavsky 2012-07-23 08:43:00 UTC
Regardless of the NPE, what we can implement is a rollback mechanism based on transactions.
rhevm-manage-domains calls rhevm-config 7 times (once for each key stored in the db) using the rhevm-config shell script.

We can:
A. Call rhevm-config directly (i.e., using the EngineConfigLogic class or other classes from its jar)
B. Create a transaction wrapping the 7 insert/update calls.
C. If an error occurs during the flow that is wrapped in the transaction, a rollback will occur, and the db will be kept in a consistent state.


However, the scope of such a change is too big for 3.1.
Suggesting this for a future version.
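
The all-or-nothing behaviour proposed above can be sketched as follows. An in-memory map stands in for the vdc_options table so that the example is self-contained; with a real database the same shape would typically use JDBC's setAutoCommit(false), commit(), and rollback() on a Connection. The names ConfigBatchSketch, Validator, and applyAll are hypothetical, and nothing here reflects the actual EngineConfigLogic API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a transactional update of several config keys.
final class ConfigBatchSketch {
    /** Checks or stores one key; throwing aborts the whole batch. */
    interface Validator {
        void check(String key, String value) throws Exception;
    }

    // Last committed state; replaced only when a whole batch succeeds.
    private Map<String, String> committed = new HashMap<>();

    /** Applies every update or none of them; returns true when committed. */
    boolean applyAll(Map<String, String> updates, Validator validator) {
        // Work on a copy so that a failure leaves the committed state untouched.
        Map<String, String> working = new HashMap<>(committed);
        try {
            for (Map.Entry<String, String> update : updates.entrySet()) {
                validator.check(update.getKey(), update.getValue());
                working.put(update.getKey(), update.getValue());
            }
        } catch (Exception e) {
            return false; // "rollback": the working copy is simply discarded
        }
        committed = working; // "commit": swap in the fully updated copy
        return true;
    }

    String get(String key) {
        return committed.get(key);
    }
}
```

With this shape, a failure on the last of the 7 keys leaves the earlier keys at their previous values instead of half-written.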

Comment 13 Itamar Heim 2012-07-29 12:22:39 UTC
Rolling back the db without rolling back the krb5.conf file won't help for update/delete operations?
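
One way to cover the file side of a rollback is to back up the file before changing it and restore the backup on failure. The following is a minimal sketch under that assumption; FileRollbackSketch, Action, and withFileRollback are hypothetical names, and the approach is not something the tool is known to implement.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: restores a file to its prior content if the action fails.
final class FileRollbackSketch {
    /** The work that may modify the file; it signals failure by throwing. */
    interface Action {
        void run() throws Exception;
    }

    /**
     * Runs the action. Returns true if it succeeded; otherwise restores the
     * file from a backup taken beforehand and returns false.
     */
    static boolean withFileRollback(Path file, Action action) throws IOException {
        Path backup = Files.createTempFile("file-rollback", ".bak");
        Files.copy(file, backup, StandardCopyOption.REPLACE_EXISTING);
        try {
            action.run();
            return true;
        } catch (Exception e) {
            // Put the original content back; a real tool would also log e.
            Files.copy(backup, file, StandardCopyOption.REPLACE_EXISTING);
            return false;
        } finally {
            Files.deleteIfExists(backup);
        }
    }
}
```

Combining this with a db transaction would still need care about ordering, since the file restore and the db rollback are two separate operations and either one can fail on its own.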

Comment 16 Alon Bar-Lev 2014-06-11 13:24:32 UTC
The new LDAP provider does not have a utility; its configuration is based on files, so it is much easier to configure.

I am keeping this open for now, although engine-manage-domains is unlikely to be changed.

Comment 17 Alon Bar-Lev 2014-06-27 13:13:26 UTC
OK, closing as WONTFIX, as manage-domains is unlikely to be modified in the future.