Hi, please provide the steps to test it. Will users run this utility manually before the upgrade, or is it run automatically during the upgrade? Thanks, Petr
The utility will be called automatically by engine-setup during the upgrade.

How to test it:
- install the engine
- create a zombie (start creating a template and then suddenly disconnect the involved host)
- launch engine-setup again to upgrade
I got error 201 while upgrading from vt13.15 to vt14.1 with a zombie task present. It looks like the same problem as https://bugzilla.redhat.com/show_bug.cgi?id=1161012

Please confirm installation settings (OK, Cancel) [OK]:
[ INFO ] Cleaning async tasks and compensations
[ ERROR ] Failed to execute stage 'Setup validation': 201
[ INFO ] Stage: Clean up
Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20150325133936-qk8a4w.log
[ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20150325134011-setup.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Execution of setup failed

In the log I found this:

2015-03-25 13:40:11 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.upgrade.asynctasks asynctasks._validateAsyncTasks:457 Cleaning async tasks and compensations
2015-03-25 13:40:11 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:164 Database: 'None', Statement: '
    select
    async_tasks.action_type,
    async_tasks.task_id,
    async_tasks.started_at,
    storage_pool.name
    from async_tasks, storage_pool
    where async_tasks.storage_pool_id = storage_pool.id
', args: {}
2015-03-25 13:40:11 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:169 Creating own connection
2015-03-25 13:40:11 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:214 Result: [{'started_at': datetime.datetime(2015, 3, 25, 13, 35, 26, 206000, tzinfo=<psycopg2.tz.FixedOffsetTimezone object at 0x3414710>), 'task_id': 'c9473a00-5bb2-4a2b-81a3-4ae04131502d', 'action_type': 201, 'name': 'Default'}]
2015-03-25 13:40:11 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 464, in _validateAsyncTasks
    ) = self._checkRunningTasks()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 357, in _checkRunningTasks
    self._getRunningTasks(dbstatement),
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 218, in _getRunningTasks
    for entry in tasks
KeyError: 201
2015-03-25 13:40:11 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Setup validation': 201

Logs from the engine are attached.
Yes, it's really the same. It has already been addressed by https://gerrit.ovirt.org/#/c/38895/
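For readers following the KeyError: 201 traceback above, here is a minimal sketch of the failure mode and one defensive way around it. It is not the code from the gerrit change; the lookup table, its values, and the function name describe_task are hypothetical and only illustrate what happens when an action_type returned by the query is missing from a dictionary.

```python
# Hypothetical, incomplete lookup table (illustrative values only;
# the real mapping used by asynctasks.py is not shown in this report).
KNOWN_ACTION_TYPES = {
    1: 'ExampleCommandA',
    2: 'ExampleCommandB',
}


def describe_task(row, known=KNOWN_ACTION_TYPES):
    """Return a printable name for a task row without raising on unknown types."""
    action_type = row['action_type']
    # A bare known[action_type] raises KeyError for unmapped types such as 201;
    # dict.get with a default keeps the validation stage running instead.
    return known.get(action_type, 'Unknown action type %s' % action_type)


# Row shaped like the query result quoted in the log above.
row = {
    'task_id': 'c9473a00-5bb2-4a2b-81a3-4ae04131502d',
    'action_type': 201,
    'name': 'Default',
}
print(describe_task(row))
```

With the bare lookup, the same row would abort the 'Setup validation' stage exactly as reported; with the fallback, the task is still listed, just without a friendly name.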
I tried to verify this bug during upgrades from latest_av and from vt13.11. I created a zombie as described in comment #12 (template).

Upgrading the engine from latest_av to vt14.2:

Please confirm installation settings (OK, Cancel) [OK]:
[ INFO ] Cleaning async tasks and compensations
The following system tasks have been found running in the system:
Task ID: 5d58e691-da49-422e-b27e-122adebf6e38
Task Name: AddVmTemplateCommand
Task Description: Adding a template
Started at: 30
DC Name: Default
[ ERROR ] Failed to execute stage 'Setup validation':
[ INFO ] Stage: Clean up
Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20150407092843-wk5tdd.log
[ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20150407092913-setup.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Execution of setup failed

In the log:

2015-04-07 09:29:13 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 454, in _validateAsyncTasks
    compensations,
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 305, in _askUserToWaitForTasks
    commands='\n'.join(runningCommands),
TypeError
2015-04-07 09:29:13 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Setup validation':

Upgrading the engine from vt13.11 to vt14.2, the behaviour was a little different:

Please confirm installation settings (OK, Cancel) [OK]:
[ INFO ] Cleaning async tasks and compensations
The following system tasks have been found running in the system:
Task ID: cd472cc8-2dff-47d8-b54a-6f6e829898f3
Task Name: AddVmTemplateCommand
Task Description: Adding a template
Started at: 30
DC Name: Default
The following commands have been found running in the system:
The following compensations have been found running in the system:
Would you like to try to wait for that? (Answering "no" will stop the upgrade (Yes, No)

But it's a zombie, and I have only two options:
- stop the upgrade
- wait forever

I think it doesn't recognize that the task is a zombie.

Logs from both "latest_av" and "vt13.11" are attached.
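The traceback above ends in a bare TypeError on the '\n'.join(runningCommands) call, and the report does not say what runningCommands held at that point. As a hedged illustration only: str.join raises TypeError when its argument is None or contains non-string items, and a small helper (format_items is a hypothetical name, not part of engine-setup) can tolerate both cases.

```python
def format_items(items):
    """Join items one per line, tolerating None and non-string entries."""
    # 'items or []' turns None into an empty list;
    # str(item) converts entries that are not already strings.
    return '\n'.join(str(item) for item in (items or []))


# Both of these would raise TypeError with a plain '\n'.join(...):
print(repr(format_items(None)))
print(format_items([201, 'AddVmTemplateCommand']))
```

Whether either condition is what actually happened in asynctasks.py is an assumption; the attached logs would be needed to confirm it.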
Created attachment 1011639: latest_av
Created attachment 1011640: vt13.11
Please test with VT14.2. VT13.11 is known to be affected.
I tested two upgrades, latest_av >> vt14.2 and vt13.11 >> vt14.2, so I am moving this to ASSIGNED.
A running task should be identified as a zombie only after the timeout defined by the AsyncTaskZombieTaskLifeInMinutes vdc option has elapsed. Petr, could you retry after waiting the required time, so that the task can be identified as a zombie?
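To make the timeout rule concrete, here is a minimal sketch of the check, assuming a task counts as a zombie once it has been running longer than AsyncTaskZombieTaskLifeInMinutes. The function name is_zombie is hypothetical, and this is not the engine's actual implementation; the start time is taken from the log earlier in this report, and the 45-minute gap is made up for the example.

```python
from datetime import datetime, timedelta


def is_zombie(started_at, now, zombie_life_minutes):
    """Return True once the task has run longer than the configured lifetime."""
    return now - started_at > timedelta(minutes=zombie_life_minutes)


# Start time from the async_tasks row quoted earlier in this report.
started_at = datetime(2015, 3, 25, 13, 35, 26)
# Hypothetical: engine-setup is run 45 minutes after the task started.
now = started_at + timedelta(minutes=45)

print(is_zombie(started_at, now, 3000))  # False
print(is_zombie(started_at, now, 30))    # True
```

Under this assumption, a task that is only a few minutes old is still treated as a normal running task, which would explain why setup offers to wait rather than clean it up.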
Verified with a template zombie in vt14.2 (rhevm 3.5.1-0.3.el6ev). For testing, I changed the value of AsyncTaskZombieTaskLifeInMinutes in the vdc_options table from 3000 to 30 minutes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0888.html