Bug 1379820 - invalid state transition stopped >> paused on foreman-tasks restart
Summary: invalid state transition stopped >> paused on foreman-tasks restart
Keywords:
Status: CLOSED DUPLICATE of bug 1390933
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Tasks Plugin
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: Unspecified
Assignee: Ivan Necas
QA Contact: Renzo Nuccitelli
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-27 18:08 UTC by Ivan Necas
Modified: 2019-12-16 06:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-01 22:37:26 UTC
Target Upstream Version:


Description Ivan Necas 2016-09-27 18:08:48 UTC
Description of problem:
On some occasions, the task data can get into a state that prevents the system from recovering properly from an inconsistent state.

How reproducible:
Rarely

Steps to Reproduce:
1. kill -9 dynflow executor while some task is running
2. foreman-tasks restart

Actual results:
The task still acts as if it were being run by the killed executor.

Expected results:
The task is marked as paused or running by the new executor.

Comment 1 Ivan Necas 2016-09-27 18:11:28 UTC
Bug fixed upstream with https://github.com/Dynflow/dynflow/pull/197

In case the system is in the invalid state, the foreman-tasks service might need to be restarted twice after applying the fix
           
service foreman-tasks restart
echo "we need to restart it twice to converge to better state, waiting a minute for tasks to start"
sleep 60
service foreman-tasks restart

Comment 7 Adam Ruzicka 2016-11-28 13:42:07 UTC
How reproducible:

It is easier to reproduce this on a weaker machine (1 or 2 cores and 4 GB of RAM seem reasonable for this)

1) Start a long running task (e.g. repository synchronization for a new repository)

2) As soon as possible, run the following command to kill all foreman-tasks processes. The key to reproducing this issue is killing the dynflow executor while the task is in its run phase

for pid in $(ps -eo pid,args | grep dynflow_executor | grep -v grep | awk ' { print $1 } '); do
    kill -9 $pid
done

3) Navigate to the details of the task, go to the raw tab and copy the external id

4) Run the following command to put the task into a wrong state, replacing << EXTERNAL_TASK_ID_HERE >> with the id retrieved in step 3

export EXTERNAL_TASK_ID="<< EXTERNAL_TASK_ID_HERE >>"
foreman-rake console <<END
task_id="$EXTERNAL_TASK_ID"
plan = ForemanTasks.dynflow.world.persistence.load_execution_plan(task_id)
plan.state = :stopped
plan.save
END

5) Restart the foreman-tasks service

systemctl restart foreman-tasks

6) Watch production.log for lines looking like

[foreman-tasks/dynflow] [E] invalid worlds found {"d18522e8-943e-4486-8ca4-38befa405d30"=>"invalid state transition stopped >> paused in #<Dynflow::ExecutionPlan:0x00000004a16fb8>"

Note: The "invalid worlds found" error will show up even when using the patched version, but all the values will be either :valid or :invalidated and none of them will mention an invalid state transition
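
To tell the two cases apart without reading the log by eye, the check can be scripted. This is only a rough sketch: count_invalid_transitions is a hypothetical helper name, and the default path /var/log/foreman/production.log is an assumption about where production.log lives on the system, so pass the real path as the first argument if it differs.

```shell
#!/bin/sh
# Hypothetical helper (not part of foreman-tasks): count the log lines that
# report the "stopped >> paused" invalid state transition.
# The default log path below is an assumption; override it with $1.
count_invalid_transitions() {
    log_file="${1:-/var/log/foreman/production.log}"
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c 'invalid state transition stopped >> paused' "$log_file"
}
```

Calling count_invalid_transitions after the restart and getting 0 would suggest that only the harmless :valid / :invalidated entries are present; a non-zero count would suggest the bug was hit. Older matching lines from before the restart are counted too, so the number is best compared before and after.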

Comment 9 Ivan Necas 2016-12-01 22:37:26 UTC
Yes, as the fix for bug 1390933 should cover both cases

*** This bug has been marked as a duplicate of bug 1390933 ***
