Bug 1302374 - [Upgrade] Upgrade 3.5->3.6 failed when a system is using the legacy engine-manage-domains
Status: CLOSED NOTABUG
Product: ovirt-engine
Classification: oVirt
Component: AAA
Version: 3.6.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-3.6.3
Target Release: ---
Assigned To: Martin Perina
Ondra Machacek
:
Depends On:
Blocks: RHEV3.6Upgrade
Reported: 2016-01-27 11:18 EST by Gil Klein
Modified: 2016-02-04 03:13 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-04 03:13:07 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
gklein: ovirt‑3.6.z?
mgoldboi: blocker+
mgoldboi: planning_ack+
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments:
ovirt engine setup log (3.05 MB, text/plain)
2016-01-27 11:18 EST, Gil Klein
server.log (318.72 KB, text/plain)
2016-02-02 03:36 EST, Gil Klein

Description Gil Klein 2016-01-27 11:18:35 EST
Created attachment 1118830
ovirt engine setup log

Description of problem:
When upgrading a 3.5.7 engine to 3.6.3 while using the legacy engine-manage-domains (AD), the upgrade fails with an error:

[ INFO  ] Rolling back database schema
[ INFO  ] Clearing Engine database engine
[ INFO  ] Restoring Engine database engine
[ INFO  ] Restoring file '/var/lib/ovirt-engine/backups/engine-20160127154812.6H7WvC.dump' to database localhost:engine.
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20160127153009-dvdf1g.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20160127155435-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed


2016-01-27 15:49:09 DEBUG otopi.context context._executeMethod:156 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/config/aaajdbc.py", line 381, in _misc
    self._setupAdminUser()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/config/aaajdbc.py", line 281, in _setupAdminUser
    name=adminUser,
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/config/aaajdbc.py", line 61, in _userExists
    envAppend=toolEnv,
  File "/usr/lib/python2.6/site-packages/otopi/plugin.py", line 946, in execute
    command=args[0],
RuntimeError: Command '/usr/bin/ovirt-aaa-jdbc-tool' failed to execute
2016-01-27 15:49:09 ERROR otopi.context context._executeMethod:165 Failed to execute stage 'Misc configuration': Command '/usr/bin/ovirt-aaa-jdbc-tool' failed to execute


Version-Release number of selected component (if applicable):
From: rhevm-3.5.7-0.1.el6ev.noarch
To: rhevm-3.6.2.6-0.1.el6.noarch


How reproducible:
100% on this system


Steps to Reproduce:
1. Set up a 3.5.7 engine
2. Configure AD authentication using engine-manage-domains
3. Upgrade the system to 3.6.3

Actual results:
Upgrade is failing:
[ ERROR ] Execution of setup failed
 


Expected results:
Upgrade should succeed


Additional info:
Comment 1 Gil Klein 2016-01-27 11:26:30 EST
Found a time gap between the engine and the AD server. It might be related.

# engine-manage-domains validate
Failure while testing domain qa.lab.tlv.redhat.com. Details: Authentication Failed. The Engine clock is not synchronized with directory services (must be within 5 minutes difference). Please verify the clocks are synchronized
Comment 2 Gil Klein 2016-01-27 11:54:03 EST
Syncing AD and engine time did not help.

I also noticed that internal.properties is missing:

# ls -l /etc/ovirt-engine/aaa/internal.properties
ls: cannot access /etc/ovirt-engine/aaa/internal.properties: No such file or directory
Comment 3 Gil Klein 2016-02-02 03:36 EST
Created attachment 1120357
server.log
Comment 4 Martin Perina 2016-02-02 05:13:43 EST
Hi Gil,

I wasn't able to reproduce it on my machine using these steps:

1. Install rhevm-3.5.7-0.1, configure AD access using manage-domains
2. Add repos for rhevm 3.6.3-1
3. Execute:
     yum update -y 'rhevm-setup*'
     engine-setup

Everything went fine, I haven't found any error in the logs after upgrade and I was able to login successfully into webadmin using both admin@internal and user from AD domain.

I used the latest JBoss EAP 6.4.6 (jboss-as-server-7.5.6-1) for 3.5.7. Was JBoss also upgraded in your case or not?

Did I miss anything from your steps? Did this happen only on one machine?

I will look into your logs again, but at the moment I don't see a reason why it failed in your case.
Comment 5 Gil Klein 2016-02-02 06:09:28 EST
(In reply to Martin Perina from comment #4)
> Hi Gil,
> 
> I wasn't able to reproduce it on my machine using these steps:
> 
> 1. Install rhevm-3.5.7-0.1, configure AD access using manage-domains
> 2. Add repos for rhevm 3.6.3-1
> 3. Execute:
>      yum update -y 'rhevm-setup*'
>      engine-setup
> 
> Everything went fine, I haven't found any error in the logs after upgrade
> and I was able to login successfully into webadmin using both admin@internal
> and user from AD domain.
> 
So I guess the cause is something more specific to this system.

> I used the latest JBoss EAP 6.4.6 (jboss-as-server-7.5.6-1) for 3.5.7. Was
> JBoss also upgraded in your case or not?
No JBoss upgrade was done. I used 7.5.5-2:

# grep "jboss-as-server" /var/log/yum.log 
Jan 20 16:52:07 Installed: jboss-as-server-7.5.5-2.Final_redhat_3.1.ep6.el6.noarch

> 
> Did I miss anything from your steps? Did this happen only on one machine?
> 
> I will look into your logs again, but at the moment I don't see a
> reason why it failed in your case.
So I guess my assumption was wrong, and the cause is related to something else on this system.

I believe the problem has something to do with this failure [1].

What is the purpose of this call, can it be fixed somehow, and should we fail an upgrade if it fails?

[1]  
# /usr/bin/ovirt-aaa-jdbc-tool --db-config=/etc/ovirt-engine/aaa/internal.properties query --what=user --pattern=name=admin
# echo $?
1
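
The exit status alone does not say why the call failed. A minimal Python 3 sketch for capturing the tool's output along with its exit code; `run_and_report` and `AAA_QUERY` are hypothetical names introduced here, the command line is the one from [1], and the assumption is that the tool prints its diagnostics to stdout or stderr:

```python
import subprocess

# Command line taken from [1]; adjust the path if the tool lives elsewhere.
AAA_QUERY = [
    "/usr/bin/ovirt-aaa-jdbc-tool",
    "--db-config=/etc/ovirt-engine/aaa/internal.properties",
    "query",
    "--what=user",
    "--pattern=name=admin",
]


def run_and_report(argv):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text instead of bytes
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Calling `run_and_report(AAA_QUERY)` on the affected engine and printing the third element should show whatever the tool wrote to stderr before exiting with 1.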
Comment 6 Martin Perina 2016-02-02 10:17:47 EST
I investigated the logs again and I haven't found any reason why execution of ovirt-aaa-jdbc-tool during engine-setup should fail with the following exception:

Exception in thread "main" org.jboss.modules.ModuleLoadError: org.ovirt.engine.api.ovirt-engine-extensions-api:main
        at org.jboss.modules.ModuleLoadException.toError(ModuleLoadException.java:78)
        at org.jboss.modules.Module.getPathsUnchecked(Module.java:1392)
        at org.jboss.modules.Module.loadModuleClass(Module.java:563)
        at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:205)
        at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:459)
        at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:408)
        at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:389)
        at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:134)
        at org.ovirt.engine.extension.aaa.jdbc.binding.cli.Cli.<clinit>(Cli.java:86)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:278)
        at org.jboss.modules.Module.run(Module.java:302)
        at org.jboss.modules.Main.main(Main.java:473)

Every step in the engine-setup flow was successful up to this point.

The only difference between your setup and mine is that you have rhevm-reports configured, so I will try to reproduce again with reports configured.

In the meantime, is this setup still available? If so, could you please do the following:

1. Please verify whether the ovirt-engine service is running. According to the log it should have been stopped successfully, but please check the processes, and if it is stuck somewhere, please kill it.
2. Please execute engine-setup again so we know if this error is persistent or just some random JBoss bug.

Thanks
Comment 7 Gil Klein 2016-02-02 11:13:36 EST
(In reply to Martin Perina from comment #6)
> I investigated the logs again and I haven't found any reason why execution of
> ovirt-aaa-jdbc-tool during engine-setup should fail with the following exception:
> 
> Exception in thread "main" org.jboss.modules.ModuleLoadError: org.ovirt.engine.api.ovirt-engine-extensions-api:main
>         at org.jboss.modules.ModuleLoadException.toError(ModuleLoadException.java:78)
>         at org.jboss.modules.Module.getPathsUnchecked(Module.java:1392)
>         at org.jboss.modules.Module.loadModuleClass(Module.java:563)
>         at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:205)
>         at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:459)
>         at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:408)
>         at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:389)
>         at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:134)
>         at org.ovirt.engine.extension.aaa.jdbc.binding.cli.Cli.<clinit>(Cli.java:86)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:278)
>         at org.jboss.modules.Module.run(Module.java:302)
>         at org.jboss.modules.Main.main(Main.java:473)
> 
> Every step in the engine-setup flow was successful up to this point.
> 
> The only difference between your setup and mine is that you have
> rhevm-reports configured, so I will try to reproduce again with reports
> configured.
> 
> In the meantime, is this setup still available? If so, could you please do
> the following:
> 
> 1. Please verify whether the ovirt-engine service is running. According to
> the log it should have been stopped successfully, but please check the
> processes, and if it is stuck somewhere, please kill it.
It is stopped completely during the upgrade
> 2. Please execute engine-setup again so we know if this error is persistent
> or just some random JBoss bug
100% reproduced on this system on 2 additional attempts  
> 
> Thanks
Comment 8 Gil Klein 2016-02-04 03:13:07 EST
This turned out to be caused by a misconfigured file, added manually under /etc/ovirt-engine/engine.conf.d/ as "1-ovirt-engine.conf".

The added file was a copy of a file containing the defaults. Because of
the numeric prefix it sorts after the 10-setup-... file, and one of its
effects is that it resets the ENGINE_JAVA_MODULEPATH variable. In 3.6
the modules have been moved to subdirectories (common and tools), which
means that the tools won't find them, because 1-ovirt-engine.conf
instructs them to look only in /usr/share/ovirt-engine/modules, and not
in the subdirectories.
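
One way to spot such a stray override is to list which files under the conf.d directory assign a given variable. A minimal Python 3 sketch; `files_setting` is a hypothetical helper, not part of oVirt, and it orders file names by plain string comparison, which is only an assumption and may not match the order the engine actually applies:

```python
import glob
import os
import re

# Matches shell-style assignments such as NAME=value at the start of a line.
# Commented-out lines do not match because "#" is not a valid name character.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=")


def files_setting(variable, conf_dir):
    """Return the names of *.conf files in conf_dir that assign `variable`."""
    hits = []
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        with open(path) as handle:
            for line in handle:
                match = ASSIGNMENT.match(line)
                if match and match.group(1) == variable:
                    hits.append(os.path.basename(path))
                    break  # one hit per file is enough
    return hits
```

On the system described here, `files_setting("ENGINE_JAVA_MODULEPATH", "/etc/ovirt-engine/engine.conf.d")` would be expected to include 1-ovirt-engine.conf.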

To work around it, I:
1. Renamed the file to "99-increase-heap-size.conf"
2. Made sure the new file only includes the minimal settings that need to be overridden:
 ENGINE_HEAP_MIN=1g
 ENGINE_HEAP_MAX=2g 

engine-setup passed this phase as soon as I applied those changes.
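
A minimal Python 3 sketch for checking that an override file carries only the settings it is meant to change. `unexpected_keys` is a hypothetical helper introduced here, and it assumes the file uses simple NAME=value lines with "#" comments:

```python
def unexpected_keys(conf_text, allowed):
    """Return keys assigned in conf_text that are not in `allowed`, in file order."""
    extra = []
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key not in allowed and key not in extra:
            extra.append(key)
    return extra
```

Run against the contents of 99-increase-heap-size.conf with `{"ENGINE_HEAP_MIN", "ENGINE_HEAP_MAX"}` as the allowed set, an empty result means no other variable, such as ENGINE_JAVA_MODULEPATH, is being reset by accident.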
