Bug 1583894

Summary: Add removal of orphaned dynflow tasks to the foreman_tasks:cleanup script.
Product: Red Hat Satellite Reporter: Hao Chang Yu <hyu>
Component: Tasks Plugin Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.3.1 CC: aruzicka, inecas
Target Milestone: Unspecified   
Target Release: Unused   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-30 07:27:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Hao Chang Yu 2018-05-30 02:16:48 UTC
Description of problem:
Due to an earlier misunderstanding, users were deleting Foreman tasks directly from the foreman-rake console using the ".destroy" method. This removes the Foreman task record but leaves its dynflow execution plan behind as an orphan. Once the orphaned dynflow tasks accumulate into the thousands, they cause a huge performance impact on the Satellite.

I have seen 5 or 6 cases where a Satellite suffered from dynflow performance problems because of this, and I believe there will be more. That is why I decided to file this bug.

A large number of unfinished orphaned dynflow tasks usually causes the following issues:
- Host registration times out.
- Content view publishing is very slow, running for a day or more.


A private note in the KCS article "https://access.redhat.com/solutions/2755731" suggests a way to clean up the orphaned dynflow tasks. I suggest adding this code to the foreman_tasks:cleanup script so that the orphan cleanup can run from the cron job. This would prevent many cases from being opened and would automatically fix the performance issue for many users.

-----------------
# Needs the Foreman Rails environment, e.g. the foreman-rake console.
batch_size = 1000
persistence = ForemanTasks.dynflow.world.persistence
adapter = persistence.adapter
# Select execution plans that have no matching Foreman task (the orphans).
plans_without_tasks = adapter.db.fetch("select dynflow_execution_plans.uuid from dynflow_execution_plans left join foreman_tasks_tasks on (dynflow_execution_plans.uuid = foreman_tasks_tasks.external_id) where foreman_tasks_tasks.id IS NULL")
deleted = 0
total = plans_without_tasks.count
# Delete the orphaned plans in batches and report progress after each batch.
plans_without_tasks.all.map{|x| x[:uuid]}.in_groups_of(batch_size, false).each do |uuids|
  delete_count = persistence.delete_execution_plans({ 'uuid' => uuids }, batch_size)
  deleted += delete_count
  puts "deleted #{deleted} out of #{total}"
end
----------------

Deleting the orphaned dynflow tasks fixes the performance issue; for example, host registration that previously timed out now finishes in seconds.
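
For anyone who wants to reason about the cleanup logic without a Satellite at hand, here is a minimal plain-Ruby sketch of the two steps the snippet above performs: find execution plans that no task references, then delete them in batches. It is only an illustration under stated assumptions. The helper names find_orphan_plan_uuids and delete_in_batches and the in-memory sample data are hypothetical, and the block passed to delete_in_batches stands in for the real persistence.delete_execution_plans call.

```ruby
require 'set'

# Hypothetical helper: return the plan UUIDs that no task references.
# This mirrors the LEFT JOIN ... WHERE foreman_tasks_tasks.id IS NULL query.
def find_orphan_plan_uuids(plan_uuids, task_external_ids)
  referenced = task_external_ids.compact.to_set
  plan_uuids.reject { |uuid| referenced.include?(uuid) }
end

# Hypothetical helper: hand the UUIDs to the given block in batches.
# The block must return how many records it deleted; the total is returned.
def delete_in_batches(uuids, batch_size)
  deleted = 0
  uuids.each_slice(batch_size) do |batch|
    deleted += yield(batch)
  end
  deleted
end

# In-memory stand-ins for the two database tables (sample data only).
plans = %w[p1 p2 p3 p4 p5]   # dynflow_execution_plans.uuid
tasks = ['p2', nil, 'p4']    # foreman_tasks_tasks.external_id
store = plans.dup            # plays the role of the plan table

orphans = find_orphan_plan_uuids(plans, tasks)
batches = []
deleted = delete_in_batches(orphans, 2) do |batch|
  batches << batch
  # Array#delete returns the removed element, or nil if it was absent.
  batch.count { |uuid| store.delete(uuid) }
end

puts "deleted #{deleted} out of #{orphans.size}"
```

The sketch uses each_slice from the Ruby core instead of ActiveSupport's in_groups_of, so it runs outside the Rails environment; inside foreman-rake either would work.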

Comment 1 Adam Ruzicka 2018-05-30 07:27:02 UTC

*** This bug has been marked as a duplicate of bug 1557067 ***