Description of problem:
`foreman-rake foreman_tasks:cleanup ...` goes on and on for orphaned task locks

Version-Release number of selected component (if applicable):
tfm-rubygem-foreman_remote_execution-1.6.7-8.el7sat.noarch

How reproducible:
always

Steps to Reproduce:
1. # foreman-rake foreman_tasks:cleanup TASK_SEARCH='label ~ *' AFTER='1000h' VERBOSE=true

Actual results:
It says "About to remove 3517 orphaned task locks", but after 4 batches of 1000 it keeps going. Also, when I kill it with Ctrl+C, it still continues in the background (see below).

Expected results:
It should do only what it needs to do, and Ctrl+C should be able to kill it completely.

Additional info:
[root@satellite ~]# foreman-rake foreman_tasks:cleanup TASK_SEARCH='label ~ *' AFTER='1000h' VERBOSE=true
/usr/share/foreman/lib/foreman.rb:8: warning: already initialized constant Foreman::UUID_REGEXP
/usr/share/foreman/lib/foreman.rb:8: warning: previous definition of UUID_REGEXP was here
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/content_facet_host_extensions.rb:7: warning: already initialized constant Katello::Concerns::ContentFacetHostExtensions::ERRATA_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/content_facet_host_extensions.rb:7: warning: previous definition of ERRATA_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/content_facet_host_extensions.rb:14: warning: already initialized constant Katello::Concerns::ContentFacetHostExtensions::TRACE_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/content_facet_host_extensions.rb:14: warning: previous definition of TRACE_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:13: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::SUBSCRIPTION_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:13: warning: previous definition of SUBSCRIPTION_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:21: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::SLA_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:21: warning: previous definition of SLA_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:26: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::USAGE_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:26: warning: previous definition of USAGE_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:31: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::ROLE_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:31: warning: previous definition of ROLE_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:36: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::ADDONS_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:36: warning: previous definition of ADDONS_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:41: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::PURPOSE_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.10.0.21/app/models/katello/concerns/subscription_facet_host_extensions.rb:41: warning: previous definition of PURPOSE_STATUS_MAP was here
No tasks matching filter found, skipping.
About to remove 3517 orphaned task locks
0/3517 1000/3517 2000/3517 3000/3517 4000/3517 5000/3517 6000/3517 7000/3517 8000/3517 9000/3517 10000/3517 11000/3517 12000/3517 13000/3517 14000/3517 15000/3517 16000/3517 17000/3517 18000/3517 19000/3517 20000/3517
^C
Session terminated, killing shell... ...killed.
[root@satellite ~]# 21000/3517 22000/3517 23000/3517 24000/3517 25000/3517 26000/3517 27000/3517 ...
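Side note on the runaway counter above: it is the signature of a batch-delete loop whose progress counter keeps advancing while the matched set never actually shrinks. A minimal Ruby sketch of that failure mode (hypothetical names, not the actual foreman-tasks code):

```ruby
# Simulated batch-cleanup loop. If the delete step fails to shrink the
# matched set, the reported progress overruns the initial total, exactly
# like the 20000/3517 counter in the log above.
def cleanup_in_batches(rows, batch_size:, max_batches:, delete_works:)
  total    = rows.size
  deleted  = 0
  progress = []
  max_batches.times do
    batch = rows.first(batch_size)
    break if batch.empty?
    rows -= batch if delete_works # buggy path: rows never actually shrink
    deleted += batch.size
    progress << "#{deleted}/#{total}"
  end
  progress
end

# Healthy run stops once everything is gone:
ok  = cleanup_in_batches(Array(1..3517), batch_size: 1000, max_batches: 50, delete_works: true)
# Buggy run climbs past the total and only stops at the batch cap:
bad = cleanup_in_batches(Array(1..3517), batch_size: 1000, max_batches: 25, delete_works: false)
```

The healthy run ends at "3517/3517" after four batches; the buggy run keeps printing until some external limit stops it.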
I've never seen anyone use the '1000h' form for the cleanup time window; everyone just uses days. I ran it on your Satellite with the 780,000 tasks, using a 7-day window, and got:

[root@satellite ~]# foreman-rake foreman_tasks:cleanup TASK_SEARCH='label ~ *' AFTER='7d' VERBOSE=true
About to remove 738041 tasks matching filter
0/738041 1000/738041 2000/738041 3000/738041 4000/738041 5000/738041 6000/738041 7000/738041 8000/738041 .... 738000/738041 738041/738041
Deleted 738041 tasks matching filter
About to remove 2526939 orphaned task locks
0/2526939 1000/2526939 2000/2526939 3000/2526939 4000/2526939 5000/2526939 6000/2526939 .... 2525000/2526939 2526000/2526939 2526939/2526939
Deleted 2526939 orphaned task locks
About to remove 1002 orphaned execution plans
0/1002 1000/1002 1002/1002
Deleted 1002 orphaned execution plans
No oprhaned job invocations found, skipping.
[root@satellite ~]#

It worked just fine for me.
Oh, I see what you did: you put the time window *past* any tasks that existed. What you are telling the cleanup to do is to delete only tasks that are *OLDER* than 1000h (~41 days), not all tasks up to 1000h. Since your Satellite was new, there were no tasks older than 41 days, but it still tried to delete orphaned task locks and got confused. This is still a bug, but IMO not severe enough to block 6.5, because the use case isn't common.
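To illustrate the AFTER semantics described above, here is a hedged sketch with a hypothetical helper (not the actual foreman-tasks option parser): AFTER='1000h' means "created before now minus 1000 hours", so on a Satellite younger than ~41 days the filter matches nothing.

```ruby
# Hypothetical helper mirroring the AFTER semantics: AFTER='1000h' selects
# tasks OLDER than 1000 hours, i.e. created before (now - 1000h), not
# "everything from the last 1000 hours".
UNITS = { 'h' => 3_600, 'd' => 86_400 }.freeze

def cutoff_for(after, now:)
  m = after.match(/\A(\d+)([hd])\z/) or raise ArgumentError, "bad AFTER: #{after}"
  now - m[1].to_i * UNITS[m[2]]
end

now    = Time.utc(2019, 3, 17)
cutoff = cutoff_for('1000h', now: now) # ~41 days back
# A task on a 2-week-old Satellite is newer than the cutoff, so the
# filter matches nothing on a fresh install:
task_created = now - 14 * 86_400
task_matches = task_created < cutoff   # false
```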
Actually, I'm going to revise my comment above. This should still block 6.5, because if anyone sets up the automatic cron-based cleanup configuration on a freshly installed 6.5 server, the rake task will run forever, since it will be deleting tasks older than 30 days even when none exist. Flagging as a regression, as this does not occur on 6.4 or 6.3.
Thank you for your investigation!

comment #3: Yes, I know I can use days, but I used hours, not sure why :)

comments #4 and #5: I was preparing an automated job to do this, and first wanted to try running it in a way that would not delete anything, since there is no dry-run mode.
I have tried again (Satellite updated to snap 20) and it is taking more than 2 hours without deleting anything, which sounds wrong:

[root@satellite ~]# time foreman-rake foreman_tasks:cleanup TASK_SEARCH='label ~ *' AFTER='10d' VERBOSE=true
No tasks matching filter found, skipping.
No orphaned task locks found, skipping.
No orphaned execution plans found, skipping.
No oprhaned job invocations found, skipping.

real 137m42.629s
user 0m29.073s
sys 0m3.902s
So I have run:

# katello-service stop
# systemctl start postgresql
# su - postgres
$ vacuumdb --all --verbose --echo

and again, foreman_tasks:cleanup took almost 3 hours. I'm now trying with:

$ vacuumdb --all --full --analyze

Note to myself - is this the optimal way to get a count?

2019-03-17 18:28:25 EDT LOG: duration: 1039.402 ms statement: SELECT COUNT(*) FROM (SELECT 1 AS one FROM "foreman_tasks_locks" LEFT OUTER JOIN "foreman_tasks_tasks" ON "foreman_tasks_tasks"."id" = "foreman_tasks_locks"."task_id" WHERE "foreman_tasks_tasks"."id" IS NULL LIMIT 1000) subquery_for_count;
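About the logged statement: this appears to be the SQL shape ActiveRecord produces when counting over a relation with a LIMIT, so the result is capped at 1000 regardless of how many orphans exist; the expensive part is the anti-join itself, not the count. A small Ruby sketch of that capped-count shape, simulated with in-memory collections rather than real SQL:

```ruby
require 'set'

# Locks whose task_id has no matching task are "orphaned". Counting over
# a LIMITed subquery (as in the logged SQL) caps the answer at the batch
# size, here 1000, however many orphans actually exist.
def capped_orphan_count(locks, task_ids, limit: 1000)
  ids     = task_ids.to_set
  orphans = locks.reject { |l| ids.include?(l[:task_id]) }
  [orphans.size, limit].min # COUNT(*) over "... LIMIT 1000"
end

task_ids = (1..10).to_a
locks    = (1..5_000).map { |i| { id: i, task_id: i } } # 4990 are orphaned
capped_orphan_count(locks, task_ids) # => 1000, not 4990
```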
Done with the full vacuum (DB dir size went from 16 GB to 4.7 GB) and reran the tasks cleanup; again, it seems quite slow. After 30 min it had printed the following and was still running:

[root@satellite ~]# time foreman-rake foreman_tasks:cleanup TASK_SEARCH='label ~ *' AFTER='10d' VERBOSE=true
About to remove 6460 tasks matching filter
0/6460 1000/6460 2000/6460 3000/6460 4000/6460 5000/6460 6000/6460 6460/6460
Deleted 6460 tasks matching filter
About to remove 11298 orphaned task locks
0/11298 1000/11298 2000/11298 3000/11298 4000/11298 5000/11298 6000/11298 7000/11298 8000/11298 9000/11298 10000/11298 11000/11298 11298/11298
Deleted 11298 orphaned task locks
The performance issue seems to be related to BZ https://bugzilla.redhat.com/show_bug.cgi?id=1557067, and in particular to the fact that there are too many orphaned execution plans in the reproducing system. I was not able to reproduce the issue from https://bugzilla.redhat.com/show_bug.cgi?id=1684051#c5. Mike, were you able to reproduce it on your setup?
I've found the reason why the cleanup job was running slowly: it seems to be a regression of https://bugzilla.redhat.com/show_bug.cgi?id=1557067. When there is a large number of tasks, the query for the orphaned execution plans runs forever. I will send a patch with an optimized query. I was not able to reproduce the issue with a cleanup window older than the Satellite itself.

Jan: 10.16.31.67 is patched to fix the perf issue; could you try reproducing the other stuff?
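For context, a common way to optimize this kind of orphan cleanup on large tables (a hedged sketch of the general technique, NOT the actual patch) is to walk the table in primary-key order, so each batch only examines ids beyond the last one processed instead of re-running the full anti-join from the start every pass:

```ruby
# Sketch of keyset-paginated orphan deletion: plans whose task_id is not
# in live_task_ids are orphans. Assumes `plans` is ordered by :id, as a
# primary-key scan would be.
def delete_orphans_in_batches(plans, live_task_ids, batch_size: 1000)
  deleted = 0
  last_id = 0
  loop do
    batch = plans.select { |p| p[:id] > last_id }.first(batch_size)
    break if batch.empty?
    last_id = batch.last[:id]                 # keyset pagination cursor
    orphans = batch.reject { |p| live_task_ids.include?(p[:task_id]) }
    plans  -= orphans
    deleted += orphans.size
  end
  deleted
end

plans = (1..25).map { |i| { id: i, task_id: i } }
delete_orphans_in_batches(plans, [1, 2, 3], batch_size: 10) # => 22
```

The cursor (`last_id`) is what keeps later batches cheap: progress is monotone even when some rows in a batch are kept.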
Created redmine issue https://projects.theforeman.org/issues/26511 from this bug
I have tried on a different Satellite (installed about 2 weeks back, i.e. 336 hours ago, currently on snap 22) and was not able to reproduce:

[root@satellite ~]# foreman-rake foreman_tasks:cleanup TASK_SEARCH='label ~ *' AFTER='1000h' VERBOSE=true
No tasks matching filter found, skipping.
No orphaned task locks found, skipping.
No orphaned execution plans found, skipping.
No oprhaned job invocations found, skipping.

So I think we can stop tracking that issue.
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/26511 has been resolved.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:1222