Bug 1303163

Summary: Cannot login after upgrade from 3.5 to 3.6
Product: [oVirt] ovirt-engine Reporter: Marcelo Leandro <marceloltmm>
Component: Setup.Engine Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE QA Contact: Gonza <grafuls>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.6.1.3 CC: bugs, didi, gklein, lveyde, mgoldboi, oourfali, rmartins, sbonazzo, stirabos
Target Milestone: ovirt-3.6.3 Flags: rule-engine: ovirt-3.6.z+
rule-engine: exception+
mgoldboi: planning_ack+
rule-engine: devel_ack+
pstehlik: testing_ack+
Target Release: 3.6.3.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-18 11:14:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine.log, server.log and setup.log none

Description Marcelo Leandro 2016-01-29 18:15:18 UTC
Created attachment 1119492 [details]
engine.log, server.log and setup.log

After running engine-setup to upgrade from ovirt-engine-3.5.6.2-1.el7.centos to ovirt-engine-3.6.1.3-1.el7.centos, I am no longer able to log in. The engine.log says:
 
Caused by: org.codehaus.jackson.map.JsonMappingException: Invalid type id 'org.ovirt.engine.core.common.businessentities.DiskImage' (for id type 'Id.class'): no such class found (through reference chain: org.ovirt.engine.core.common.action.AddVmFromSnapshotParameters["vm"]->org.ovirt.engine.core.common.businessentities.VM["diskList"])
 
After some discussion on the mailing list, it seems the issue was identified:
 
"CommandBase tries to execute LoginCommand, but before the command execution it loads content of commands cache and here comes the issue:  here's stored AddVmFromSnapshotCommand which contains DiskImage as a parameter and DiskImage implementation has changed between 3.5 and 3.6."
 
and
 
"The problem is in the command_entities table that is not cleaned up and has two records after taskcleaner.sh is invoked by engine-setup.
Since engine tries to deserialize classes defined in this table and as Martin noted changes between 3.5 and 3.6 changed some classes, we got this exception.
We should not allow leftovers to exist in this table during the upgrade process, and the taskcleaner utility should handle that as well."
 
The workaround seems to be:
 
Run the following before the upgrade:
 psql -U engine -c "DELETE FROM command_entities;" <database name>
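
The workaround above could be wrapped in a small script that first saves the rows and reports how many exist, so nothing is lost if they are needed for later analysis. This is only a sketch under stated assumptions: the function names, the DRY_RUN switch, and the backup file name are hypothetical, the database user `engine` is taken from the command above, and the database name must be replaced with your own. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: back up, count, and then empty command_entities before engine-setup.
# The default user/database/backup names below are assumptions, not engine defaults.

# Print the command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
# Note: the printed form is for review only; shell quoting is not preserved.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: clean_command_entities [db_user] [db_name] [backup_file]
clean_command_entities() {
    db_user="${1:-engine}"
    db_name="${2:-engine}"
    backup="${3:-command_entities.backup.sql}"

    # Keep a copy of the table in case the rows are needed later.
    run pg_dump -U "$db_user" -t command_entities -f "$backup" "$db_name" || return 1
    # Show how many leftover rows exist before deleting them.
    run psql -U "$db_user" -c "SELECT count(*) FROM command_entities;" "$db_name" || return 1
    # The workaround itself.
    run psql -U "$db_user" -c "DELETE FROM command_entities;" "$db_name"
}
```

Calling `clean_command_entities engine <database name>` prints the three commands for review; setting `DRY_RUN=0` before the call executes them. Depending on the PostgreSQL authentication setup, psql and pg_dump may prompt for the database password.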
 
Additional question:
 
I have already upgraded to 3.6, so I would like to know whether there is a post-upgrade workaround.

Comment 1 Yedidyah Bar David 2016-01-31 07:52:20 UTC
Moving to Eli as per the discussion on the mailing list. Thanks for the report.

Comment 2 Red Hat Bugzilla Rules Engine 2016-01-31 22:50:09 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 3 Gonza 2016-02-18 10:49:30 UTC
Verified with upgrade from:
rhevm-3.5.8-0.1.el6ev.noarch
to: 
rhevm-3.6.3.1-0.1.el6.noarch

Started a template import and, once there were entries in the command_entities table, stopped the ovirt-engine service.
Ran engine-setup and came across the following:
[ INFO  ] Cleaning async tasks and compensations
          The following system tasks have been found running in the system:
          Task ID:           ac0462e5-f18e-4492-a2d1-4e794c01ebf3
          Task Name:         ImportVmTemplateCommand       
          Task Description:  Importing a temaplte from an export domain
          Started at:        30
          DC Name:           Default                       
          The following commands have been found running in the system:
          The following compensations have been found running in the system:
          Would you like to try to wait for that?
          (Answering "no" will stop the upgrade (Yes, No) Yes

Execution of setup completed successfully
Web Admin is accessible and I am able to log in.