Red Hat Bugzilla – Bug 1390931
World invalidation can fail, when execution plans are missing
Last modified: 2017-11-30 09:14:25 EST
Description of problem:
Under some circumstances (such as manually deleting data from dynflow_execution_plans), world invalidation can fail.

Version-Release number of selected component (if applicable):

How reproducible:
under special circumstances

Steps to Reproduce:
1. trigger a task
2. while the task is running, delete data from dynflow manually (CAUTION: THIS IS BY NO MEANS A RECOMMENDED WAY OF DEALING WITH TASKS - FOR REPRODUCER PURPOSES ONLY):

psql foreman
delete from foreman_tasks_tasks;
delete from foreman_tasks_locks;
delete from dynflow_steps;
delete from dynflow_actions;
delete from dynflow_execution_plans;
exit

3. force kill the dynflow executor process
4. restart the foreman-tasks service

Actual results:
in the logs, there is an "invalid worlds found" message where, at the terminated world's uuid, there is "searching: 'execution_plan by: {:uuid=>\"'..."
the worlds page /foreman_tasks/dynflow/worlds still shows the world in the list

Expected results:
dynflow is able to handle this situation by skipping the deleted plans
Created redmine issue http://projects.theforeman.org/issues/17177 from this bug
Upstream bug assigned to inecas@redhat.com
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/17177 has been resolved.
*** Bug 1416177 has been marked as a duplicate of this bug. ***
Please add verification steps for this bug to help QE verify.
The verification steps are in https://bugzilla.redhat.com/show_bug.cgi?id=1390931#c0
If this situation happens before it's possible to upgrade to a version that has the fix, one can run:

cat <<EOF | foreman-rake console
w = ForemanTasks.dynflow.world
# go through all execution locks and drop those whose plan no longer exists
w.coordinator.find_locks(class: Dynflow::Coordinator::ExecutionLock.name).each do |l|
  # a plan that fails to load is treated as missing
  exists = w.persistence.load_execution_plan(l.execution_plan_id) rescue nil
  unless exists
    puts "#{l.execution_plan_id} doesn't exist: deleting the lock"
    w.coordinator.delete_record(l)
  end
end; puts "finished"
EOF

After doing so, the invalid locks should be removed from the system, and the next time the foreman-tasks service is started, this issue should be resolved.
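For anyone who wants to check the selection logic of the workaround before deleting anything, here is a minimal sketch of the same idea that runs without Foreman or Dynflow. The helper name orphaned_locks, the Lock struct, PLANS and LOAD_PLAN are all made up for illustration; only the "a plan that fails to load counts as missing" rule is taken from the script above.

```ruby
# Hypothetical helper: returns the locks whose execution plan can no longer be loaded.
# `locks` is any enumerable of objects responding to #execution_plan_id;
# `load_plan` is any callable that returns the plan or raises when it is missing.
def orphaned_locks(locks, load_plan)
  locks.reject do |lock|
    begin
      load_plan.call(lock.execution_plan_id)
    rescue StandardError
      nil # any load failure is treated as "plan is missing"
    end
  end
end

# In-memory stand-ins (assumptions) so the sketch runs outside foreman-rake console.
Lock = Struct.new(:execution_plan_id)
PLANS = { "plan-1" => :some_plan }.freeze
LOAD_PLAN = ->(id) { PLANS.fetch(id) } # raises KeyError for unknown ids

locks = [Lock.new("plan-1"), Lock.new("plan-2")]
orphaned_locks(locks, LOAD_PLAN).each do |lock|
  puts "#{lock.execution_plan_id} doesn't exist: would delete the lock"
end
```

Inside foreman-rake console, the same helper could presumably be fed the result of w.coordinator.find_locks(...) and w.persistence.method(:load_execution_plan) to get a dry-run list first; that combination has not been tested here.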
I followed the steps and found error messages in the logs:

grep worlds production.log

2017-04-03 12:42:02 [foreman-tasks/dynflow] [E] invalid worlds found {"a97cd2c2-a86b-4309-aa0c-edd7ed1c6c9f"=>"Value (NilClass) '' is not any of: Dynflow::ExecutionPlan::Steps::Abstract.", "22cf8a95-f734-4145-9af9-f2dd0baf93e7"=>:valid}
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.4/lib/dynflow/world.rb:328:in `block in worlds_validity_check'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.4/lib/dynflow/world.rb:322:in `worlds_validity_check'
2017-04-03 12:44:37 [foreman-tasks/dynflow] [E] invalid worlds found {"a97cd2c2-a86b-4309-aa0c-edd7ed1c6c9f"=>"Value (NilClass) '' is not any of: Dynflow::ExecutionPlan::Steps::Abstract.", "22cf8a95-f734-4145-9af9-f2dd0baf93e7"=>:valid}
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.4/lib/dynflow/world.rb:328:in `block in worlds_validity_check'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.4/lib/dynflow/world.rb:322:in `worlds_validity_check'
2017-04-03 12:45:15 [foreman-tasks/dynflow] [E] invalid worlds found {"a97cd2c2-a86b-4309-aa0c-edd7ed1c6c9f"=>"Value (NilClass) '' is not any of: Dynflow::ExecutionPlan::Steps::Abstract.", "22cf8a95-f734-4145-9af9-f2dd0baf93e7"=>:valid}

I have tested it on Satellite 6.2.9 snap 2. Maybe the fix has not been cherry-picked, so I am moving this back to ASSIGNED.
Providing more info on https://bugzilla.redhat.com/show_bug.cgi?id=1390931#c20.

I started a repo sync to start the task. After the testing steps I tried to start the synchronization again, with no success. The same error in the logs seems to prevent it:

2017-04-03 13:03:51 [foreman-tasks/dynflow] [E] invalid worlds found {"a97cd2c2-a86b-4309-aa0c-edd7ed1c6c9f"=>"Value (NilClass) '' is not any of: Dynflow::ExecutionPlan::Steps::Abstract.", "8b55723d-6b00-4b41-acdf-f7f90bfe0b48"=>:valid, "f9fac667-d073-4ace-a576-face1d515626"=>:valid, "e5dbdb21-8133-41a6-9941-70af16fd95ed"=>:valid}
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.4/lib/dynflow/world.rb:328:in `block in worlds_validity_check'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.4/lib/dynflow/world.rb:322:in `worlds_validity_check'
2017-04-03 13:04:28 [foreman-tasks/dynflow] [E] invalid worlds found {"a97cd2c2-a86b-4309-aa0c-edd7ed1c6c9f"=>"Value (NilClass) '' is not any of: Dynflow::ExecutionPlan::Steps::Abstract.", "8b55723d-6b00-4b41-acdf-f7f90bfe0b48"=>:valid, "f9fac667-d073-4ace-a576-face1d515626"=>:valid, "e5dbdb21-8133-41a6-9941-70af16fd95ed"=>:valid}
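When checking production.log during verification, it can help to pull out just the world UUIDs that are not reported as :valid. The following is a rough sketch that assumes the log line keeps the format shown in the excerpts above (a Ruby hash printed after "invalid worlds found"); the names WORLD_ENTRY and invalid_world_ids are made up for illustration and are not part of Dynflow.

```ruby
# Matches one "uuid"=>value pair inside the hash printed after "invalid worlds found".
# Assumption: the value is either the symbol :valid or a double-quoted error string.
WORLD_ENTRY = /"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"=>(:valid|"(?:[^"\\]|\\.)*")/

# Returns the UUIDs of worlds whose status is anything other than :valid.
def invalid_world_ids(log_line)
  return [] unless log_line.include?("invalid worlds found")
  log_line.scan(WORLD_ENTRY)
          .reject { |_uuid, status| status == ":valid" }
          .map(&:first)
end

# Sample line copied from the log excerpt in this bug.
sample = '2017-04-03 12:42:02 [foreman-tasks/dynflow] [E] invalid worlds found ' \
         '{"a97cd2c2-a86b-4309-aa0c-edd7ed1c6c9f"=>"Value (NilClass) \'\' is not any of: ' \
         'Dynflow::ExecutionPlan::Steps::Abstract.", ' \
         '"22cf8a95-f734-4145-9af9-f2dd0baf93e7"=>:valid}'

puts invalid_world_ids(sample)
```

An empty result for every "invalid worlds found" line (or no such lines at all) after restarting foreman-tasks would be consistent with the fix being in place.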
I was able to reproduce (see http://projects.theforeman.org/issues/19146) when I deleted data from dynflow_steps, but not from dynflow_execution_plans. Although I believe this is a slightly different situation than before, I'm OK with keeping it as part of this BZ, while tracking it as a different issue upstream. The proposed patch is available at https://github.com/Dynflow/dynflow/pull/227
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1191
No more error messages were found after running the steps from comment #21, so I am moving this bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1553
Connecting redmine issue http://projects.theforeman.org/issues/20002 from this bug