Bug 1687771 - restarting dynflowd with a task in planning phase can leave the task "planning" forever
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Tasks Plugin
Version: 6.4.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: 6.7.0
Assignee: satellite6-bugs
QA Contact: Peter Ondrejka
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-12 10:50 UTC by Pavel Moravec
Modified: 2020-04-14 13:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-14 13:24:10 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Foreman Issue Tracker 26666 Normal Closed restarting dynflowd with a task in planning phase can leave the task "planning" forever 2020-09-29 14:35:50 UTC
Red Hat Product Errata RHSA-2020:1454 None None None 2020-04-14 13:24:19 UTC

Description Pavel Moravec 2019-03-12 10:50:20 UTC
Description of problem:
In some scenarios (caused by a race condition, or perhaps by the call flow used to trigger the task?), a task stays in the planning phase for a while, and if dynflowd is restarted during that time, the task remains in the "planning" state forever.

A particular example (very visible without the fix for bz1673447): see the reproducer steps.

It is assumed that https://github.com/Dynflow/dynflow/pull/303 fixes this.


Version-Release number of selected component (if applicable):
6.4.2 (or anything older)


How reproducible:
Very likely (scale the test up for a better chance of hitting it)


Steps to Reproduce:
1. Have several repos, several LEs and a few Capsules

2. Create and publish many CVs; their content can even be identical (e.g. one or two small repos)

3. Promote many CVs to the next LE, e.g. via:

for i in $(seq 1 20); do 
  hammer content-view version promote --content-view CV_${i} --organization-id 1 --from-lifecycle-environment-id 1 --to-lifecycle-environment-id 2 --async &
  sleep 1
done

4. Monitor the task status summary, e.g. via:

sudo su - postgres -c "psql -d foreman -c 'select label,count(label),state,result from foreman_tasks_tasks where state <> '\''stopped'\'' group by label,state,result ORDER BY label;'"

5. Once several Actions::Katello::CapsuleContent::Sync tasks are in the planning state, restart dynflowd:

service dynflowd restart

6. Monitor the tasks status summary until all Capsule Sync tasks terminate
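
Step 6 can be automated with a small polling loop. The following is only a sketch, not part of the original reproducer: the function names (query_unfinished, wait_for_sync_tasks) and the QUERY_CMD override are hypothetical, and it assumes the same foreman_tasks_tasks table, label and state columns used by the query in step 4.

```shell
# Hypothetical helper for step 6: poll until no Capsule Sync task is left
# unfinished, or give up after a timeout. Names here are illustrative only.

# Prints the number of Capsule Sync tasks whose state is not 'stopped'.
# -A (unaligned) and -t (tuples only) make psql print just the bare number.
query_unfinished() {
  sudo su - postgres -c "psql -d foreman -At -c \"select count(*) from foreman_tasks_tasks where label = 'Actions::Katello::CapsuleContent::Sync' and state <> 'stopped';\""
}

# The counting command can be overridden, e.g. to try the loop without a database.
QUERY_CMD=${QUERY_CMD:-query_unfinished}

# Usage: wait_for_sync_tasks TIMEOUT_SECONDS INTERVAL_SECONDS
# Returns 0 when all tasks have stopped, 1 on timeout, 2 if the query fails.
wait_for_sync_tasks() {
  timeout_s=$1
  interval_s=$2
  elapsed=0
  while :; do
    remaining=$($QUERY_CMD) || return 2
    if [ "$remaining" -eq 0 ]; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "still $remaining unfinished Capsule Sync task(s) after ${elapsed}s" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}
```

For example, "wait_for_sync_tasks 3600 30" would check every 30 seconds for up to an hour; a return code of 1 then corresponds to the "Actual results" below.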


Actual results:
6. is like waiting for Godot: the Capsule Sync tasks stay in the "planning" state forever


Expected results:
6. all Sync tasks successfully complete after a reasonable time


Additional info:

Comment 4 Mike McCune 2019-04-23 21:07:31 UTC
Created redmine issue https://projects.theforeman.org/issues/26666 from this bug

Comment 8 Peter Ondrejka 2020-01-03 15:38:03 UTC
Verified on Satellite 6.7 snap 7 using the reproduction steps from the problem description. The planned tasks are cleaned up properly after the service restart.

Comment 11 errata-xmlrpc 2020-04-14 13:24:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1454

