Bug 1196136 - Engine-setup should support cleaning of zombie commands before upgrade
Summary: Engine-setup should support cleaning of zombie commands before upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-setup
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: 3.5.1
Assignee: Simone Tiraboschi
QA Contact: Petr Kubica
URL:
Whiteboard: integration
Depends On:
Blocks: oVirt_3.5.2_tracker 1193058 1197441
Reported: 2015-02-25 11:06 UTC by rhev-integ
Modified: 2022-07-09 07:08 UTC

Fixed In Version: org.ovirt.engine-root-3.5.1-3
Doc Type: Bug Fix
Doc Text:
With this update, zombie commands are cleaned so that engine-setup does not get stuck waiting for tasks and commands to complete.
Clone Of: 1164771
Environment:
Last Closed: 2015-04-28 18:48:44 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:
ylavi: Triaged+


Attachments
latest_av (872.98 KB, application/x-gzip)
2015-04-07 07:57 UTC, Petr Kubica
vt13.11 (443.06 KB, application/x-gzip)
2015-04-07 07:57 UTC, Petr Kubica


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:0888 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Virtualization Manager 3.5.1 update 2015-04-28 22:40:04 UTC
oVirt gerrit 36302 0 master MERGED packaging: setup: clearing only zombie tasks Never
oVirt gerrit 37847 0 ovirt-engine-3.5 MERGED setup: checking if command_entities table exist Never
oVirt gerrit 38147 0 master MERGED setup: Splitting taskcleaner_sp into two scripts Never
oVirt gerrit 38157 0 ovirt-engine-3.5.2 MERGED setup: checking if command_entities table exist Never
oVirt gerrit 38158 0 ovirt-engine-3.5 MERGED packaging: setup: clearing only zombie tasks Never
oVirt gerrit 38159 0 ovirt-engine-3.5.2 MERGED packaging: setup: clearing only zombie tasks Never
oVirt gerrit 38895 0 None None None Never

Comment 2 Petr Kubica 2015-03-25 08:51:17 UTC
Hi,
Please provide steps to test this. Will users run this utility manually before the upgrade, or is it run automatically during the upgrade?

Thanks 
Petr

Comment 3 Simone Tiraboschi 2015-03-25 09:24:55 UTC
The utility will be automatically called by engine-setup during upgrade.

How to test it:
- install the engine
- create a zombie (start creating a template and then suddenly disconnect the involved host)
- launch engine-setup again to upgrade

Comment 4 Petr Kubica 2015-03-25 13:14:19 UTC
I got error 201 while upgrading from vt13.15 to vt14.1 with a zombie present.
It looks like the same problem as https://bugzilla.redhat.com/show_bug.cgi?id=1161012

          Please confirm installation settings (OK, Cancel) [OK]: 
[ INFO  ] Cleaning async tasks and compensations
[ ERROR ] Failed to execute stage 'Setup validation': 201
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20150325133936-qk8a4w.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20150325134011-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed

In the log I found this:
2015-03-25 13:40:11 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.upgrade.asynctasks asynctasks._validateAsyncTasks:457 Cleaning async tasks and compensations
2015-03-25 13:40:11 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:164 Database: 'None', Statement: '
                select
                async_tasks.action_type,
                async_tasks.task_id,
                async_tasks.started_at,
                storage_pool.name
                from async_tasks, storage_pool
                where async_tasks.storage_pool_id = storage_pool.id
            ', args: {}
2015-03-25 13:40:11 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:169 Creating own connection
2015-03-25 13:40:11 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:214 Result: [{'started_at': datetime.datetime(2015, 3, 25, 13, 35, 26, 206000, tzinfo=<psycopg2.tz.FixedOffsetTimezone object at 0x3414710>), 'task_id': 'c9473a00-5bb2-4a2b-81a3-4ae04131502d', 'action_type': 201, 'name': 'Default'}]
2015-03-25 13:40:11 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 464, in _validateAsyncTasks
    ) = self._checkRunningTasks()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 357, in _checkRunningTasks
    self._getRunningTasks(dbstatement),
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 218, in _getRunningTasks
    for entry in tasks
KeyError: 201
2015-03-25 13:40:11 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Setup validation': 201

Logs from the engine are attached.
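
The traceback ends in a lookup keyed by the task's action_type inside _getRunningTasks. Below is a minimal sketch of how an unknown action type could be tolerated instead of raising KeyError. ACTION_NAMES and describe_tasks are hypothetical names used for illustration only; the actual mapping in asynctasks.py and the actual fix are not shown in this report.

```python
# Hypothetical mapping for illustration; the real table in asynctasks.py
# is not shown in this bug report.
ACTION_NAMES = {
    1: 'ExampleCommand',
}


def describe_tasks(tasks):
    """Return (task_id, name) pairs, tolerating unknown action types."""
    return [
        (
            entry['task_id'],
            # dict.get returns the fallback instead of raising KeyError
            ACTION_NAMES.get(
                entry['action_type'],
                'Unknown (%s)' % entry['action_type'],
            ),
        )
        for entry in tasks
    ]


# Row shaped like the query result in the log above (fields trimmed).
rows = [
    {'task_id': 'c9473a00-5bb2-4a2b-81a3-4ae04131502d', 'action_type': 201},
]
result = describe_tasks(rows)
```

With a plain ACTION_NAMES[entry['action_type']] lookup, the same row would raise KeyError: 201, which matches the failure seen in the log.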

Comment 6 Simone Tiraboschi 2015-03-25 16:22:40 UTC
Yes, it's really the same.
It has already been addressed by https://gerrit.ovirt.org/#/c/38895/

Comment 7 Petr Kubica 2015-04-07 07:56:52 UTC
I tried to verify this bug during upgrades from latest_av and from vt13.11.
I created a zombie (template) as described in comment #12. During the upgrade of the engine from latest_av to vt14.2:

          Please confirm installation settings (OK, Cancel) [OK]: 
[ INFO  ] Cleaning async tasks and compensations
          The following system tasks have been found running in the system:
          Task ID:           5d58e691-da49-422e-b27e-122adebf6e38
          Task Name:         AddVmTemplateCommand          
          Task Description:  Adding a template             
          Started at:        30
          DC Name:           Default                       
[ ERROR ] Failed to execute stage 'Setup validation': 
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20150407092843-wk5tdd.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20150407092913-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed

In the log:
2015-04-07 09:29:13 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 454, in _validateAsyncTasks
    compensations,
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 305, in _askUserToWaitForTasks
    commands='\n'.join(runningCommands),
TypeError
2015-04-07 09:29:13 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Setup validation': 
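
The log does not say why the join raised TypeError. In Python, str.join raises TypeError when its argument is not iterable (for example None) or when it contains non-string items, so one of those is a plausible cause here, but that is an assumption. A minimal defensive sketch, with join_lines as a hypothetical helper name that is not part of engine-setup:

```python
def join_lines(items):
    """Join items with newlines, accepting None and non-string entries."""
    # "items or []" turns None into an empty list; str() converts
    # non-string entries so that join never sees a non-str item.
    return '\n'.join(str(item) for item in (items or []))


empty = join_lines(None)
mixed = join_lines(['AddVmTemplateCommand', 30])
```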

During the upgrade of the engine from vt13.11 to vt14.2, the behaviour was slightly different:

          Please confirm installation settings (OK, Cancel) [OK]: 
[ INFO  ] Cleaning async tasks and compensations
          The following system tasks have been found running in the system:
          Task ID:           cd472cc8-2dff-47d8-b54a-6f6e829898f3
          Task Name:         AddVmTemplateCommand          
          Task Description:  Adding a template             
          Started at:        30
          DC Name:           Default                       
          The following commands have been found running in the system:
          The following compensations have been found running in the system:
          Would you like to try to wait for that?
          (Answering "no" will stop the upgrade (Yes, No)

But it is a zombie, and I have only two options:
- stop the upgrade
- wait forever

I think it does not recognize that the task is a zombie.

Logs from both "latest_av" and "vt13.11" are attached.

Comment 8 Petr Kubica 2015-04-07 07:57:35 UTC
Created attachment 1011639 [details]
latest_av

Comment 9 Petr Kubica 2015-04-07 07:57:53 UTC
Created attachment 1011640 [details]
vt13.11

Comment 10 Sandro Bonazzola 2015-04-07 08:10:55 UTC
Please test with VT14.2. VT13.11 is known to be affected.

Comment 11 Petr Kubica 2015-04-07 08:29:27 UTC
I tried two upgrades: latest_av >> vt14.2 and vt13.11 >> vt14.2. Moving to ASSIGNED.

Comment 12 Simone Tiraboschi 2015-04-07 08:56:52 UTC
A running task should be identified as a zombie only after a certain timeout, defined by the AsyncTaskZombieTaskLifeInMinutes vdc option.
Petr, could you retry after waiting the required time, so that the task can be identified as a zombie?
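
As a rough illustration of the timeout rule described above: a task counts as a zombie once it has been running longer than the configured lifetime. The function name, the strict comparison, and the sample timestamps below are assumptions made for this sketch; the actual check performed by engine-setup is not shown in this report.

```python
from datetime import datetime, timedelta


def is_zombie(started_at, now, zombie_life_minutes):
    """Return True if the task has run longer than the zombie lifetime."""
    return now - started_at > timedelta(minutes=zombie_life_minutes)


# Illustrative timestamps: the task has been running for 49 minutes.
started = datetime(2015, 4, 7, 8, 40)
now = datetime(2015, 4, 7, 9, 29)

with_long_lifetime = is_zombie(started, now, 3000)
with_short_lifetime = is_zombie(started, now, 30)
```

With a lifetime of 3000 minutes the 49-minute-old task is still treated as running, while with a lifetime of 30 minutes it is treated as a zombie.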

Comment 13 Petr Kubica 2015-04-07 09:23:41 UTC
Verified with template zombie in vt14.2 (rhevm 3.5.1-0.3.el6ev) 

For testing I changed the value of AsyncTaskZombieTaskLifeInMinutes in table vdc_options from 3000 to 30 minutes.

Comment 14 errata-xmlrpc 2015-04-28 18:48:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0888.html

