Red Hat Bugzilla – Bug 1372708
dynflow may not start with many tasks in pending state, may segfault
Last modified: 2017-05-24 06:43:25 EDT
Description of problem:
This bug is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1335105 and https://bugzilla.redhat.com/show_bug.cgi?id=1320794.

If a user has many pending tasks and tries to start foreman-tasks, they may encounter the following stack trace:

/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/dataset/actions.rb:139: [BUG] Stack consistency error (sp: 325, bp: 324)
ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux-gnu]

-- Control frame information -----------------------------------------------
c:0075 p:0011 s:0325 e:000323 BLOCK /opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/dataset/actions.rb:139
c:0074 p:0008 s:0321 e:000320 BLOCK /opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/adapters/postgres.rb:655
c:0073 p:0021 s:0318 e:000317 BLOCK /opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/adapters/postgres.rb:878 [FINISH]

Additionally, they may encounter this error:

E, [2016-09-01T16:50:22.444036 #27074] ERROR -- /connector-database-core: No executor available (Dynflow::Error)
E, [2016-09-01T16:50:22.444318 #27074] ERROR -- /client-dispatcher: No executor available (Dynflow::Error)

Version-Release number of selected component (if applicable):
6.2.1 (tfm-rubygem-dynflow-0.8.11-1.el6sat.noarch)

How reproducible:
Not every time, race condition

Steps to Reproduce:
1. Get a large number of tasks into pending state (not sure on best way to do this)
2. Stop foreman-tasks
3. Start foreman-tasks

NOTE: This bz is for a race condition; you may need to test multiple times to reproduce it.
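Because the race does not trigger on every start, the stop/start cycle can be repeated and the log checked for the two messages quoted above. The following is only a rough sketch under assumptions: the function names, the label strings, the 60-second wait, and the idea of passing the log location as an argument are illustrative and do not come from this report.

```shell
# Print one label per failure signature found in the log text on stdin.
# The two patterns are the messages quoted in the description above;
# the function and label names are illustrative only.
scan_for_failures() {
  log="$(cat)"
  if printf '%s\n' "$log" | grep -qF 'Stack consistency error'; then
    echo "stack_consistency_error"
  fi
  if printf '%s\n' "$log" | grep -qF 'No executor available (Dynflow::Error)'; then
    echo "no_executor_available"
  fi
}

# Restart foreman-tasks up to N times and stop at the first attempt whose
# newly written log output shows a signature. The log path is an argument
# because its location is an assumption, not something stated in this report.
# Assumes the log file is appended to and not rotated during the loop.
repro_loop() {
  attempts="$1"
  log_file="$2"
  i=1
  while [ "$i" -le "$attempts" ]; do
    before="$(wc -c < "$log_file")"   # size before this attempt
    systemctl stop foreman-tasks
    systemctl start foreman-tasks
    sleep 60                          # arbitrary settle time; tune as needed
    found="$(tail -c "+$((before + 1))" "$log_file" | scan_for_failures)"
    if [ -n "$found" ]; then
      echo "attempt $i: $found"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure signature in $attempts attempts"
}
```

For example, `repro_loop 10 /path/to/executor.log` would try ten restarts, where the path is a placeholder for wherever the executor output is written on the system under test.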
Proposed fix: https://github.com/Dynflow/dynflow/pull/198
Created redmine issue http://projects.theforeman.org/issues/16486 from this bug
Verified in Satellite 6.2.7 Snap 3.

I used 328 simultaneous paused repo syncs to test this bug. Verification steps are below. The tasks immediately started being handled when the foreman-tasks service was restarted. Due to the number of repo syncs, it took approximately 6 hours to process them all.

1. systemctl stop pulp_workers
2. Start a large number of repository sync tasks.
   - You can use this URL for repo discovery: http://pubmirror1.math.uh.edu/fedora-buffet/fedora/linux/
3. Wait for all tasks to move into pending state
4. Query the number of pending tasks (see attached screenshot 1)
5. systemctl stop foreman-tasks
6. systemctl start pulp_workers && systemctl start foreman-tasks
7. Watch the pending sync tasks drop off the query (see attached screenshot 2)
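The service ordering in steps 1, 5 and 6 can be wrapped in a small script so the sequence is repeatable. This is a sketch only: the `run` helper, the `DRY_RUN` variable and the two function names are hypothetical, and steps 2 to 4 and 7 (starting the syncs and querying pending tasks) are left manual because the exact query is only shown in the screenshots.

```shell
# Print the command instead of running it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: stop the Pulp workers so newly started sync tasks stay pending.
hold_tasks_pending() {
  run systemctl stop pulp_workers
}

# Steps 5-6: stop foreman-tasks, then bring both services back with the
# backlog of pending tasks still in place.
restart_with_backlog() {
  run systemctl stop foreman-tasks
  run systemctl start pulp_workers && run systemctl start foreman-tasks
}
```

Setting `DRY_RUN=1` before calling either function prints the commands without touching the services, which is a cheap way to check the order before running it on a real system.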
Created attachment 1244141: screenshot 1
Created attachment 1244142: screenshot 2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0197