Bug 1161012 - task cleaning utility should erase commands that have running tasks
Summary: task cleaning utility should erase commands that have running tasks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-core
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.5.2
Assignee: Simone Tiraboschi
QA Contact: Petr Kubica
URL:
Whiteboard: integration
Duplicates: 1190228
Depends On:
Blocks: 1164771 1180867 oVirt_3.5.2_tracker 1193058 1196662 1197441
Reported: 2014-11-06 07:19 UTC by Yair Zaslavsky
Modified: 2015-04-29 06:15 UTC (History)

Fixed In Version: org.ovirt.engine-root-3.5.1-3
Doc Type: Bug Fix
Doc Text:
With oVirt 3.5 we also have tables for async commands. The task cleaning utility should not only erase tasks, but should also erase commands that are associated with tasks.
Clone Of:
Environment:
Last Closed: 2015-04-29 06:15:45 UTC
oVirt Team: ---
Embargoed:


Attachments
logs from engine (320.55 KB, application/x-gzip), 2015-03-11 12:59 UTC, Petr Kubica
latest_av (872.98 KB, application/x-gzip), 2015-04-07 07:59 UTC, Petr Kubica
vt13.11 (443.06 KB, application/x-gzip), 2015-04-07 08:00 UTC, Petr Kubica


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 34874 0 master MERGED core: Add foreign keys from async_tasks to command_entities Never
oVirt gerrit 34927 0 master MERGED setup: Changing task cleaner utility to also handle removal of commands Never
oVirt gerrit 35219 0 ovirt-engine-3.5 MERGED core: Add foreign keys from async_tasks to command_entities Never
oVirt gerrit 35220 0 ovirt-engine-3.5 MERGED setup: Changing task cleaner utility to also handle removal of commands Never
oVirt gerrit 36057 0 master MERGED setup: Changing task cleaner utility to also handle removal of commands Never
oVirt gerrit 36880 0 ovirt-engine-3.5 MERGED setup: Changing task cleaner utility to also handle removal of commands Never
oVirt gerrit 37421 0 master MERGED engine: Adding a helper function to check if table exists Never
oVirt gerrit 37422 0 master MERGED setup: checking if command_entities table exist Never
oVirt gerrit 37846 0 ovirt-engine-3.5 MERGED engine: Adding a helper function to check if table exists Never
oVirt gerrit 37847 0 ovirt-engine-3.5 MERGED setup: checking if command_entities table exist Never
oVirt gerrit 38147 0 master MERGED setup: Splitting taskcleaner_sp into two scripts Never
oVirt gerrit 38157 0 ovirt-engine-3.5.2 MERGED setup: checking if command_entities table exist Never
oVirt gerrit 38583 0 master MERGED packaging: setup: fixing action_type type Never
oVirt gerrit 38596 0 master MERGED packaging: setup: using the right type to detect zombie tasks Never
oVirt gerrit 38602 0 master MERGED packaging: setup: avoid setting AsyncTaskZombieTaskLifeInMinutes to 0 Never
oVirt gerrit 38846 0 ovirt-engine-3.5 MERGED packaging: setup: using the right type to detect zombie tasks Never
oVirt gerrit 38847 0 ovirt-engine-3.5.2 MERGED packaging: setup: using the right type to detect zombie tasks Never
oVirt gerrit 38848 0 ovirt-engine-3.5 ABANDONED packaging: setup: avoid setting AsyncTaskZombieTaskLifeInMinutes to 0 Never
oVirt gerrit 38849 0 ovirt-engine-3.5.2 ABANDONED packaging: setup: avoid setting AsyncTaskZombieTaskLifeInMinutes to 0 Never
oVirt gerrit 38894 0 master MERGED packaging: setup: avoid setting AsyncTaskZombieTaskLifeInMinutes to 0 Never
oVirt gerrit 38895 0 ovirt-engine-3.5 MERGED packaging: setup: fixing action_type type Never
oVirt gerrit 38896 0 ovirt-engine-3.5.2 MERGED packaging: setup: fixing action_type type Never
oVirt gerrit 38987 0 ovirt-engine-3.5 MERGED packaging: setup: avoid setting AsyncTaskZombieTaskLifeInMinutes to 0 Never
oVirt gerrit 38991 0 ovirt-engine-3.5.2 MERGED packaging: setup: avoid setting AsyncTaskZombieTaskLifeInMinutes to 0 Never

Description Yair Zaslavsky 2014-11-06 07:19:02 UTC
Description of problem:


The task cleaning utility should not only erase tasks, but should also erase commands that are associated with tasks.



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Yedidyah Bar David 2014-11-25 13:31:52 UTC
Moving back to assigned, as http://gerrit.ovirt.org/35220 broke upgrade from 3.4. see bug 1167769.

Comment 2 Yedidyah Bar David 2014-11-25 13:40:53 UTC
To know which version you upgrade from, you can use self.environment[osetupcons.CoreEnv.ORIGINAL_GENERATED_BY_VERSION] - should contain a string, e.g. '3.4.4' or '3.5.0'.
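
A minimal sketch of how such a version string could be compared, assuming it is a plain dotted numeric string like the examples above. The helper names parse_version and upgrading_from_before are hypothetical and are not part of ovirt-engine-setup; the environment lookup itself is only mentioned in a comment.

```python
def parse_version(version):
    """Turn a dotted version string such as '3.4.4' into a tuple of ints.

    Assumes purely numeric components; a suffix such as '-0.0.master'
    would need extra handling that is not shown here.
    """
    return tuple(int(part) for part in version.split('.'))


def upgrading_from_before(original_version, threshold):
    """Return True if original_version is strictly older than threshold."""
    return parse_version(original_version) < parse_version(threshold)


# Hypothetical usage: in the setup plugin the first argument would come from
# self.environment[osetupcons.CoreEnv.ORIGINAL_GENERATED_BY_VERSION].
print(upgrading_from_before('3.4.4', '3.5.0'))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as '3.10' sorting before '3.9'.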

Comment 3 Yedidyah Bar David 2014-11-25 13:43:31 UTC
but perhaps better to condition this on existence of the table 'command_entities'.

Comment 4 Yair Zaslavsky 2014-12-09 02:23:23 UTC
(In reply to Yedidyah Bar David from comment #3)
> but perhaps better to condition this on existence of the table
> 'command_entities'.

Indeed, and this is what I plan to do.
Thanks!
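
A rough sketch of what conditioning on the table's existence could look like from Python, assuming a DB-API cursor that uses the pyformat parameter style (as psycopg2 does) and assuming the table lives in the public schema. The function name table_exists is hypothetical; this is not the helper that was eventually merged.

```python
def table_exists(cursor, table_name, schema='public'):
    """Return True if the table is listed in information_schema.tables.

    cursor: any DB-API cursor using the 'pyformat' parameter style.
    schema: defaults to 'public', which is an assumption made here.
    """
    cursor.execute(
        """
        select 1
        from information_schema.tables
        where table_schema = %(schema)s
          and table_name = %(table)s
        """,
        {'schema': schema, 'table': table_name},
    )
    # No row means the table is absent, e.g. on a 3.4 database.
    return cursor.fetchone() is not None
```

With a check like this, the command cleanup could be skipped when table_exists(cursor, 'command_entities') is false, which is the situation when upgrading from 3.4.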

Comment 5 Sandro Bonazzola 2015-01-21 16:12:48 UTC
oVirt 3.5.1 has been released and since this bug is targeted 3.5.1 and in modified state, it should be included in this release.
Please re-target and move back to modified if this assumption is not valid for this bug.

Comment 6 Simone Tiraboschi 2015-02-09 12:32:00 UTC
*** Bug 1190228 has been marked as a duplicate of this bug. ***

Comment 7 Eyal Edri 2015-02-10 16:53:29 UTC
any update on this bug? it's currently failing a ci job:
http://jenkins.ovirt.org/job/ovirt-engine_3.5_upgrade-from-3.4_merged/

Comment 8 Eyal Edri 2015-02-26 12:31:22 UTC
this ovirt bug was fixed during 3.5.1 cycle and is included in the build, and therefore should be verified.

Comment 11 Simone Tiraboschi 2015-03-10 17:07:50 UTC
*** Bug 1200454 has been marked as a duplicate of this bug. ***

Comment 12 Petr Kubica 2015-03-11 12:54:05 UTC
I installed oVirt version ovirt-engine-3.5.1.1-1.el6.noarch (which does not have the patch yet).
I successfully added a host and a storage domain and created one VM.
After I started making a template from the VM, I took down the host's network to create a zombie task.
I also removed the host from the engine to be sure.
Then I tried to upgrade the engine to version
ovirt-engine-setup-3.5.2-0.0.master.20150226114525.el6.noarch.rpm (patched),
but while running the engine setup I got this error:
        Please confirm installation settings (OK, Cancel) [OK]: 
[ INFO  ] Cleaning async tasks and compensations
[ ERROR ] Failed to execute stage 'Setup validation': 201
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20150311132601-nkrll6.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20150311132820-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed

The log contains this error:

2015-03-11 13:28:20 DEBUG otopi.context context._executeMethod:138 Stage validation METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.upgrade.asynctasks.Plugin._validateAsyncTasks
2015-03-11 13:28:20 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.upgrade.asynctasks asynctasks._validateAsyncTasks:457 Cleaning async tasks and compensations
2015-03-11 13:28:20 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:164 Database: 'None', Statement: '
                select
                async_tasks.action_type,
                async_tasks.task_id,
                async_tasks.started_at,
                storage_pool.name
                from async_tasks, storage_pool
                where async_tasks.storage_pool_id = storage_pool.id
            ', args: {}
2015-03-11 13:28:20 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:169 Creating own connection
2015-03-11 13:28:20 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:214 Result: [{'started_at': datetime.datetime(2015, 3, 11, 13, 11, 2, 801000, tzinfo=<psycopg2.tz.FixedOffsetTimezone object at 0x1703750>), 'task_id': '8e43abf3-75a1-411e-b5ab-d7df01eebb1b', 'action_type': 201, 'name': 'Default'}]
2015-03-11 13:28:20 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 464, in _validateAsyncTasks
    ) = self._checkRunningTasks()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 357, in _checkRunningTasks
    self._getRunningTasks(dbstatement),
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 218, in _getRunningTasks
    for entry in tasks
KeyError: 201
2015-03-11 13:28:20 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Setup validation': 201

attached logs from engine.
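
For illustration, the KeyError: 201 above is raised while a value from the action_type column is used as a dictionary key. The following is a sketch of a lookup that tolerates both integer and string keys and unknown action types. The ACTION_NAMES table and the describe_task function are hypothetical; the real mapping in asynctasks.py is not shown in this report, and the pairing of 201 with AddVmTemplateCommand is inferred from the template task in this test, so treat it as an assumption.

```python
# Hypothetical lookup table; the single entry is inferred from this report.
ACTION_NAMES = {
    201: 'AddVmTemplateCommand',
}


def describe_task(row, action_names=ACTION_NAMES):
    """Build a printable record for one async_tasks result row.

    Accepts action_type as an int or a numeric string and falls back to a
    placeholder name instead of raising KeyError for unknown types.
    """
    raw = row['action_type']
    try:
        key = int(raw)
    except (TypeError, ValueError):
        key = raw
    name = action_names.get(key, 'Unknown action type {0}'.format(raw))
    return {
        'task_id': row['task_id'],
        'name': name,
        'dc': row['name'],
    }


# Row shaped like the query result logged above.
print(describe_task({
    'action_type': 201,
    'task_id': '8e43abf3-75a1-411e-b5ab-d7df01eebb1b',
    'name': 'Default',
}))
```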

Comment 13 Petr Kubica 2015-03-11 12:59:50 UTC
Created attachment 1000417 [details]
logs from engine

Comment 14 Simone Tiraboschi 2015-03-12 12:00:54 UTC
We found other issues handling AsyncTaskZombieTaskLifeInMinutes to identify the zombies. Now it should work.

Comment 15 Oved Ourfali 2015-03-19 13:38:44 UTC
Moving to integration, and assigning to you, Simone.

Comment 16 Yaniv Lavi 2015-03-24 13:16:39 UTC
Oved, can you please describe the exact flow that you want QE to test here?
There are a lot of code changes related to this bug and we do not want to miss regressions on this.

Comment 17 Oved Ourfali 2015-03-24 13:22:29 UTC
(In reply to Yaniv Dary from comment #16)
> Oved, can you please describe the exact flow that you want QE to test here?
> There are a lot of code changes related to this bug and we do not want to
> miss regressions on this.

Exactly what Petr already did when he tried to verify it in the past. See Comment 12.

Comment 18 Yaniv Lavi 2015-03-24 16:12:05 UTC
(In reply to Oved Ourfali from comment #17)
> (In reply to Yaniv Dary from comment #16)
> > Oved, can you please describe the exact flow that you want QE to test here?
> > There are a lot of code changes related to this bug and we do not want to
> > miss regressions on this.
> 
> Exactly what Petr already did when he tried to verify it in the past. See
> Comment 12.

I guess this change goes deeper than just that flow. Please provide a list of the affected areas that changed, so that they can be tested.

Comment 19 Oved Ourfali 2015-03-24 18:05:17 UTC
It doesn't go deeper. Petr can contact us if he needs to.

Comment 20 Petr Kubica 2015-04-07 07:58:51 UTC
I tried to verify this bug during the upgrade from latest_av and from vt13.11.
I created a zombie task as described in comment #12 (template). When I upgraded the engine from latest_av to vt14.2:

          Please confirm installation settings (OK, Cancel) [OK]: 
[ INFO  ] Cleaning async tasks and compensations
          The following system tasks have been found running in the system:
          Task ID:           5d58e691-da49-422e-b27e-122adebf6e38
          Task Name:         AddVmTemplateCommand          
          Task Description:  Adding a template             
          Started at:        30
          DC Name:           Default                       
[ ERROR ] Failed to execute stage 'Setup validation': 
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20150407092843-wk5tdd.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20150407092913-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed

In log:
2015-04-07 09:29:13 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 454, in _validateAsyncTasks
    compensations,
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/upgrade/asynctasks.py", line 305, in _askUserToWaitForTasks
    commands='\n'.join(runningCommands),
TypeError
2015-04-07 09:29:13 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Setup validation': 

When I upgraded the engine from vt13.11 to vt14.2, the behaviour was a little different:

          Please confirm installation settings (OK, Cancel) [OK]: 
[ INFO  ] Cleaning async tasks and compensations
          The following system tasks have been found running in the system:
          Task ID:           cd472cc8-2dff-47d8-b54a-6f6e829898f3
          Task Name:         AddVmTemplateCommand          
          Task Description:  Adding a template             
          Started at:        30
          DC Name:           Default                       
          The following commands have been found running in the system:
          The following compensations have been found running in the system:
          Would you like to try to wait for that?
          (Answering "no" will stop the upgrade (Yes, No)

But it's a zombie, and I have only two options:
- stop the upgrade
- wait forever

I think it doesn't recognize that it is a zombie.

logs from both "latest_av" and "vt13.11" attached
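
For illustration, the TypeError in the first traceback above is raised while evaluating '\n'.join(runningCommands). In Python, str.join raises TypeError when its argument is not iterable (for example None) or when any item is not a string. The logs here do not show which case applied, so the following is only a defensive sketch; format_lines is a hypothetical helper, not code from asynctasks.py.

```python
def format_lines(items):
    """Join items one per line, tolerating None and non-string entries."""
    if items is None:
        return ''
    # str() guards against entries such as ints or UUID objects.
    return '\n'.join(str(item) for item in items)


print(format_lines(['AddVmTemplateCommand', 201]))
```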

Comment 21 Petr Kubica 2015-04-07 07:59:18 UTC
Created attachment 1011641 [details]
latest_av

Comment 22 Petr Kubica 2015-04-07 08:00:04 UTC
Created attachment 1011642 [details]
vt13.11

Comment 23 Sandro Bonazzola 2015-04-07 08:08:36 UTC
Please, verify on VT14.2. VT 13.11 is known to be affected.

Comment 24 Petr Kubica 2015-04-07 08:30:13 UTC
I tried two upgrades: latest_av >> vt14.2 and vt13.11 >> vt14.2. So moving to ASSIGNED

Comment 25 Petr Kubica 2015-04-07 09:22:47 UTC
Verified with template zombie in vt14.2 (rhevm 3.5.1-0.3.el6ev) 

There is a timeout after which a task is marked as a zombie.
For testing I changed the value of AsyncTaskZombieTaskLifeInMinutes in table vdc_options from 3000 to 30 minutes.
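
A minimal sketch of the timeout rule described here, assuming a task counts as a zombie once it has been running for longer than AsyncTaskZombieTaskLifeInMinutes. The function is_zombie and its parameters are hypothetical; the exact comparison used by the setup plugin is not shown in this report.

```python
from datetime import datetime, timedelta


def is_zombie(started_at, now, zombie_life_minutes):
    """Return True if the task has been running longer than the limit.

    started_at and now must both be naive or both be timezone-aware,
    otherwise the subtraction raises TypeError.
    """
    return now - started_at > timedelta(minutes=zombie_life_minutes)


# Illustrative timestamps only, not taken from the attached logs.
started = datetime(2015, 4, 7, 8, 0)
checked = datetime(2015, 4, 7, 9, 22)
print(is_zombie(started, checked, 3000))  # default value mentioned above
print(is_zombie(started, checked, 30))    # value used for testing
```

With the default of 3000 minutes a task started the same morning is still treated as running, which matches the "wait or stop" prompt seen in comment 20; lowering the value to 30 lets it be classified as a zombie.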

Comment 26 Eyal Edri 2015-04-29 06:15:45 UTC
ovirt 3.5.2 was GA'd. closing current release.

