Description of problem:
Running the rhevm upgrade from si26.4 to si27.4 fails when there are leftovers in the async tasks table. The engine tries to stop the tasks, waits for 3 minutes, and then shows the following prompt again (in a loop):

[ Mar 13 18:36:11 ] Would you like to proceed and try to stop tasks automatically?

The full upgrade log is attached.

[root@rhevm-3 yum.repos.d]# rhevm-upgrade
Loaded plugins: product-id, rhnplugin
This system is receiving updates from RHN Classic or RHN Satellite.
Checking for updates... (This may take several minutes)
11 Updates available:
 * rhevm-3.1.0-50.el6ev.noarch
 * rhevm-backend-3.1.0-50.el6ev.noarch
 * rhevm-config-3.1.0-50.el6ev.noarch
 * rhevm-dbscripts-3.1.0-50.el6ev.noarch
 * rhevm-genericapi-3.1.0-50.el6ev.noarch
 * rhevm-notification-service-3.1.0-50.el6ev.noarch
 * rhevm-restapi-3.1.0-50.el6ev.noarch
 * rhevm-tools-common-3.1.0-50.el6ev.noarch
 * rhevm-userportal-3.1.0-50.el6ev.noarch
 * rhevm-webadmin-portal-3.1.0-50.el6ev.noarch
 * vdsm-bootstrap-4.10.2-1.8.el6ev.noarch

During the upgrade process, RHEV Manager will not be accessible.
All existing running virtual machines will continue but you will not be able to start or stop any new virtual machines during the process.

Would you like to proceed? (yes|no): yes
Stopping ovirt-engine service... [ DONE ]
Stopping DB related services... [ DONE ]
Cleaning async tasks... [ DONE ]

Info: The following tasks have been found running in the system:
System Tasks:
                        command_type                        |                             entity_type
------------------------------------------------------------+---------------------------------------------------------------------
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
(4 rows)

[ Mar 13 18:36:11 ] Would you like to proceed and try to stop tasks automatically?
(Answering 'no' will stop the upgrade)? (yes|no): yes
[ Mar 13 18:37:06 ] System will try to clear tasks during the next 3 minutes.

Info: The following tasks have been found running in the system:
System Tasks:
                        command_type                        |                             entity_type
------------------------------------------------------------+---------------------------------------------------------------------
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
(4 rows)

[ Mar 13 18:40:15 ] Would you like to proceed and try to stop tasks automatically?
(Answering 'no' will stop the upgrade)? (yes|no): no
Starting DB related services... [ DONE ]
Starting ovirt-engine... [ DONE ]

There are still running tasks:
System Tasks:
                        command_type                        |                             entity_type
------------------------------------------------------------+---------------------------------------------------------------------
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.Network
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_pool_iso_map
 org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand | org.ovirt.engine.core.common.businessentities.storage_domain_static
(4 rows)

Please make sure that there are no running system tasks before you continue. Please contact GSS for assistance. Stopping upgrade.
Error: Upgrade failed.
please check log at /var/log/ovirt-engine/ovirt-engine-upgrade_2013_03_13_18_33_42.log
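The prompt loop seen in the log above can be modelled roughly as follows. This is only a sketch of the observed behaviour, not the upgrade script's actual code: `run_task_cleanup_loop`, its three callbacks, and the `max_rounds` cap are hypothetical names introduced for illustration (the real upgrade kept prompting for as long as the user answered yes).

```python
from typing import Callable, List


def run_task_cleanup_loop(
    find_tasks: Callable[[], List[str]],
    ask_user: Callable[[], bool],
    try_clear: Callable[[], None],
    max_rounds: int = 10,
) -> bool:
    """Model of the observed prompt loop (hypothetical, for illustration).

    Returns True once no tasks remain, and False when the user answers
    'no' or when max_rounds is exhausted with tasks still present.
    """
    for _ in range(max_rounds):
        if not find_tasks():
            return True       # nothing left over, the upgrade can continue
        if not ask_user():
            return False      # answering 'no' stops the upgrade
        try_clear()           # in the log: a 3 minute attempt to clear tasks
    # The cap is an assumption of this sketch so that it always terminates.
    return not find_tasks()
```

With leftovers that the clean-up step never removes, `find_tasks` keeps returning the same four rows, so the only exit is answering 'no', which matches the log above.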
Created attachment 709718
ovirt engine upgrade log
taskcleaner has a -C flag to clean compensation data, is the installer using it:

taskcleaner.sh [-h] [-s server] [-p PORT]] [-d DATABASE] [-u USERNAME] [-l LOGFILE] [-t taskId] [-c commandId] [-z] [-R] [-C] [-J] [-q] [-v]

    -s SERVERNAME - The database servername for the database (def. localhost)
    -p PORT       - The database port for the database (def. 5432)
    -d DATABASE   - The database name (def. engine)
    -u USERNAME   - The admin username for the database.
    -l LOGFILE    - The logfile for capturing output (def. taskcleaner.sh.log)
    -t TASK_ID    - Removes a task by its Task ID.
    -c COMMAND_ID - Removes all tasks related to the given Command Id.
    -z            - Removes/Displays a Zombie task.
    -R            - Removes all Zombie tasks.
    -C            - Clear related compensation entries.
    -J            - Clear related Job Steps.
    -q            - Quite mode, do not prompt for confirmation.
    -v            - Turn on verbosity (WARNING: lots of output)
    -h            - This help text.
(In reply to comment #4)
> taskcleaner has a -C flag to clean compensation data, is the installer using
> it :
>

Yes. We run the upgrade with the -zRCJq flags.
An -A flag is being added to taskcleaner as part of the fix for bug 921202.
Please use -zAJq in the installer instead of -zRCJq.
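As an illustration of the flag change, here is a minimal sketch of how an installer could assemble the taskcleaner command line. The helper `build_taskcleaner_argv` and the bare script name are hypothetical; the defaults for server, port and database are taken from the help text quoted earlier in this bug, and the -A flag is known here only from the comment above.

```python
from typing import List, Optional

OLD_FLAGS = "-zRCJq"  # flags the installer used before the fix
NEW_FLAGS = "-zAJq"   # flags requested in this bug (-A comes from bug 921202)


def build_taskcleaner_argv(
    flags: str = NEW_FLAGS,
    server: str = "localhost",
    port: int = 5432,
    database: str = "engine",
    username: Optional[str] = None,
    script: str = "taskcleaner.sh",  # assumption: the real install path is not given here
) -> List[str]:
    """Build an argument list for taskcleaner.sh (hypothetical helper)."""
    argv = [script, "-s", server, "-p", str(port), "-d", database]
    if username is not None:
        argv += ["-u", username]
    argv.append(flags)
    return argv
```

The resulting list can be handed to `subprocess.run` without a shell; nothing is executed by the helper itself.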
Verified on sf17. Before the upgrade I had entries in business_entity_snapshot; after the upgrade the table was cleared.

before:

engine=# SELECT * from business_entity_snapshot ;
 id | command_id | command_type | entity_id | entity_type | entity_snapshot | snapshot_class | snapshot_type | insertion_order | started_at
--------------------------------------+--------------------------------------+------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------+-------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------+---------------+-----------------+-------------------------------
 d7e2e524-c154-11e2-8a0c-001a4a169783 | e1cb8d0d-f6ae-4223-aa43-1337e9bbdd8b | org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand | b31db687-35e8-403c-b61a-ac55bc09d7f8 | org.ovirt.engine.core.common.businessentities.storage_domain_static | { | org.ovirt.engine.core.common.businessentities.storage_domain_static | 0 | 2 | 2013-05-20 16:54:51.776459+03
 : "id" : [ "org.ovirt.engine.core.compat.Guid", {
 : "uuid" : "b31db687-35e8-403c-b61a-ac55bc09d7f8"
 : } ],
 : "storage" : "5jBDV4-MbBY-ZGoz-vOgA-Z26H-fogt-2zwh2e",

after:

engine=# SELECT * from business_entity_snapshot ;
 id | command_id | command_type | entity_id | entity_type | entity_snapshot | snapshot_class | snapshot_type | insertion_order | started_at
----+------------+--------------+-----------+-------------+-----------------+----------------+---------------+-----------------+------------
(0 rows)

engine=#
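The same before/after check could be scripted by reading the row-count footer that psql prints after a result set. The sketch below assumes the captured output ends with psql's standard "(N rows)" footer; `psql_row_count` and `snapshot_table_is_clean` are hypothetical helper names, not part of any oVirt tooling.

```python
import re

# psql prints "(1 row)" or "(N rows)" on its own line after a result set.
_ROW_FOOTER = re.compile(r"^\((\d+) rows?\)\s*$", re.MULTILINE)


def psql_row_count(output: str) -> int:
    """Return the row count from the last psql footer found in `output`."""
    matches = _ROW_FOOTER.findall(output)
    if not matches:
        raise ValueError("no psql row-count footer found")
    return int(matches[-1])


def snapshot_table_is_clean(output: str) -> bool:
    """True when the captured SELECT output reports zero rows."""
    return psql_row_count(output) == 0
```

Feeding it the "after" output above would report a clean table, while any output ending in a non-zero footer would not.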
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0888.html