Bug 873775

Summary: rhevm-upgrade: Goes into infinite loop when it fails to kill tasks found in 'business_entity_snapshot' table.
Product: Red Hat Enterprise Virtualization Manager Reporter: Tareq Alayan <talayan>
Component: ovirt-engine-setup Assignee: Moran Goldboim <mgoldboi>
Status: CLOSED NOTABUG QA Contact: Pavel Stehlik <pstehlik>
Severity: unspecified Docs Contact:
Priority: urgent    
Version: 3.1.0 CC: bazulay, dyasny, iheim, Rhev-m-bugs, sgrinber, ykaul
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra integration
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-11-19 10:48:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                  Flags
upgrade_log                  none
vdsm.log.28.xz from 6.Nov    none
vdsm.log.29.x from 6.Nov     none
engine.log                   none

Description Tareq Alayan 2012-11-06 16:49:44 UTC
Description of problem:
Unable to upgrade from si22 to si24


Steps to Reproduce:
1. rhevm-upgrade 
2. rhevm-upgrade asks to stop the following tasks, as listed by this SQL query:
psql -U engine -c 'select command_type,entity_type from business_entity_snapshot';

 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map

(a) I say YES. 

(b) I wait several minutes while rhevm-upgrade tries to clear these tasks.

(c) rhevm-upgrade fails to stop the tasks and again says that there are tasks that need to be stopped.

(d) back to (a) [infinite loop]. 

Expected results:
Stop and exit with appropriate message.
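
One way the expected behavior could look is a retry loop with an upper bound. The following is a minimal sketch in Python 3, not the rhevm-upgrade code: the function name and the three callables (list_tasks, stop_tasks, ask_yes_no) are hypothetical stand-ins, and the limit of three attempts is an arbitrary assumption.

```python
def stop_tasks_with_limit(list_tasks, stop_tasks, ask_yes_no, max_attempts=3):
    """Try to clear running tasks, giving up after max_attempts rounds.

    All three callables are hypothetical stand-ins, not rhevm-upgrade functions.
    Returns True when no tasks remain and False when the user declines.
    Raises RuntimeError when the attempts are exhausted.
    """
    for _ in range(max_attempts):
        tasks = list_tasks()
        if not tasks:
            return True
        if not ask_yes_no("Try to stop %d task(s) automatically?" % len(tasks)):
            return False
        stop_tasks(tasks)

    # Final check after the last attempt before giving up.
    remaining = list_tasks()
    if not remaining:
        return True
    raise RuntimeError(
        "%d task(s) still present after %d attempts; aborting upgrade"
        % (len(remaining), max_attempts)
    )
```

With a bound like this, the caller can catch the RuntimeError, print its message, and exit instead of returning to the prompt again.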

Additional info:
I manually truncated the business_entity_snapshot table,
then ran rhevm-upgrade successfully.

One more note: I have no idea where these tasks come from, but I guess they are there because I run automated tests on the machine.

Comment 1 Tareq Alayan 2012-11-06 16:50:59 UTC
Created attachment 639447 [details]
upgrade_log

Comment 2 Simon Grinberg 2012-11-08 11:02:39 UTC
Tareq, this is by design. To break the loop and exit with an appropriate error message you should answer no. Did you try that and it failed?

Looking at your log, it looks like it. So the problem is that selecting no does not exit cleanly with an error message directing the customer to Red Hat support.

 Nov 06 17:32:48 ] Would you like to proceed and try to stop tasks automatically?
(Answering 'no' will stop the upgrade)? (yes|no): 
2012-11-06 17:52:26::ERROR::rhevm-upgrade::1001::root:: Traceback (most recent call last):
  File "/usr/bin/rhevm-upgrade", line 986, in checkRunningTasks
    answerYes = utils.askYesNo(stopTasksQuestion)
  File "/usr/share/ovirt-engine/scripts/common_utils.py", line 867, in askYesNo
    rawAnswer = raw_input(message.read())
EOFError

2012-11-06 17:52:26::DEBUG::rhevm-upgrade::1007::root:: Stopping ovirt-engine service...
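
The traceback above shows raw_input() raising EOFError inside askYesNo. As a minimal sketch only, written in Python 3 (so input() rather than raw_input()) and not taken from common_utils.py, a prompt helper could treat a closed stdin as a 'no' answer. The name ask_yes_no and the read_line parameter are hypothetical.

```python
def ask_yes_no(question, read_line=input):
    """Return True for 'yes' and False for 'no'.

    Hypothetical helper, not the askYesNo from common_utils.py.
    A closed stdin (EOFError) is treated as 'no' so the caller can exit cleanly.
    """
    while True:
        try:
            answer = read_line("%s (yes|no): " % question)
        except EOFError:
            return False
        answer = answer.strip().lower()
        if answer == "yes":
            return True
        if answer == "no":
            return False
        # Any other input: ask again.
```

The read_line parameter exists only so the prompt can be exercised without a terminal.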

Comment 3 Tareq Alayan 2012-11-12 17:14:32 UTC
When I said no, it broke the loop and exited.

Comment 6 Tareq Alayan 2012-11-13 11:48:55 UTC
Created attachment 644069 [details]
vdsm.log.28.xz from 6.Nov

Comment 7 Tareq Alayan 2012-11-13 11:49:53 UTC
Created attachment 644070 [details]
vdsm.log.29.x from 6.Nov

Comment 8 Tareq Alayan 2012-11-13 11:51:34 UTC
Created attachment 644071 [details]
engine.log

Comment 13 Itamar Heim 2012-11-19 10:48:55 UTC
Closing this issue.
Simon, if you want to change the exit message, please open a new bug specifically for that.