
Bug 1372708

Summary: dynflow may not start with many tasks in pending state, may segfault
Product: Red Hat Satellite
Reporter: Chris Duryee <cduryee>
Component: Tasks Plugin
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: jcallaha
Severity: medium
Docs Contact:
Priority: high
Version: 6.2.0
CC: bbuckingham, bkearney, chrobert, daniele, inecas, jcallaha, kdixon, mmccune, oshtaier, pdwyer, rplevka, unwosu, zhunting
Target Milestone: Unspecified
Keywords: PrioBumpField, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: tfm-rubygem-dynflow-0.8.13.3-2
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1406080 (view as bug list)
Environment:
Last Closed: 2017-01-26 10:42:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1406080    
Attachments:
Description     Flags
screenshot 1    none
screenshot 2    none

Description Chris Duryee 2016-09-02 12:45:40 UTC
Description of problem:

This bug is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1335105 and https://bugzilla.redhat.com/show_bug.cgi?id=1320794.

If a user has many pending tasks and tries to start foreman-tasks, they may encounter the following stack trace:


/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/dataset/actions.rb:139: [BUG] Stack consistency error (sp: 325, bp: 324)
ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux-gnu]

-- Control frame information -----------------------------------------------
c:0075 p:0011 s:0325 e:000323 BLOCK  /opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/dataset/actions.rb:139
c:0074 p:0008 s:0321 e:000320 BLOCK  /opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/adapters/postgres.rb:655
c:0073 p:0021 s:0318 e:000317 BLOCK  /opt/theforeman/tfm/root/usr/share/gems/gems/sequel-4.20.0/lib/sequel/adapters/postgres.rb:878 [FINISH]

Additionally, they may encounter this error:


E, [2016-09-01T16:50:22.444036 #27074] ERROR -- /connector-database-core: No executor available (Dynflow::Error)

E, [2016-09-01T16:50:22.444318 #27074] ERROR -- /client-dispatcher: No executor available (Dynflow::Error)


Version-Release number of selected component (if applicable): 6.2.1 (tfm-rubygem-dynflow-0.8.11-1.el6sat.noarch)

How reproducible: Intermittent; the failure depends on a race condition.


Steps to Reproduce:
1. Get a large number of tasks into the pending state (the best way to do this is not known).
2. Stop foreman-tasks.
3. Start foreman-tasks.

NOTE: This bug is caused by a race condition, so you may need to repeat the steps several times to reproduce it.
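
Because the failure is intermittent, the stop/start cycle could be repeated by a small script that checks the log output after each attempt. The Python sketch below is only an illustration and is not part of the proposed fix. The two signature strings are taken from the output quoted in this report; the function names, the attempt count, and the `read_log` callable are assumptions, and the `systemctl` commands follow the verification steps in Comment 8 (other init systems would need a different command).

```python
import subprocess
from typing import Callable, List

# Substrings taken from the errors quoted in this report.
FAILURE_SIGNATURES = (
    "[BUG] Stack consistency error",
    "No executor available (Dynflow::Error)",
)


def find_failure_signatures(log_text: str) -> List[str]:
    """Return the known failure signatures that occur in log_text."""
    return [sig for sig in FAILURE_SIGNATURES if sig in log_text]


def restart_foreman_tasks() -> None:
    """Stop and then start foreman-tasks (steps 2 and 3). Requires root."""
    subprocess.run(["systemctl", "stop", "foreman-tasks"], check=True)
    subprocess.run(["systemctl", "start", "foreman-tasks"], check=True)


def reproduce(read_log: Callable[[], str],
              restart: Callable[[], None] = restart_foreman_tasks,
              attempts: int = 10) -> List[str]:
    """Restart up to `attempts` times and stop at the first failing attempt.

    read_log is a hypothetical callable that returns the log output written
    since the last restart; this report does not say where that output lives.
    Returns the signatures found, or an empty list if no attempt failed.
    """
    for _ in range(attempts):
        restart()
        found = find_failure_signatures(read_log())
        if found:
            return found
    return []
```

An empty result after several attempts does not prove the race is gone; it only means none of the attempts hit it.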

Comment 1 Chris Duryee 2016-09-02 12:48:26 UTC
Proposed fix: https://github.com/Dynflow/dynflow/pull/198

Comment 3 Ivan Necas 2016-09-08 11:56:54 UTC
Created Redmine issue http://projects.theforeman.org/issues/16486 from this bug

Comment 8 jcallaha 2017-01-25 03:44:45 UTC
Verified in Satellite 6.2.7 Snap 3. 
I used 328 simultaneous paused repo syncs to test this bug. Verification steps are below. The tasks started being processed as soon as the foreman-tasks service was restarted. Because of the number of repo syncs, it took approximately 6 hours to process them all.

1. systemctl stop pulp_workers
2. Start a large number of repository sync tasks.
   - You can use this URL for repo discovery: http://pubmirror1.math.uh.edu/fedora-buffet/fedora/linux/
3. Wait for all tasks to move into the pending state.
4. Query the number of pending tasks (see attached screenshot 1).
5. systemctl stop foreman-tasks
6. systemctl start pulp_workers && systemctl start foreman-tasks
7. Watch the pending sync tasks drop off the query (see attached screenshot 2).
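
Steps 4 and 7 amount to polling the pending-task count until it reaches zero. A minimal Python sketch of that loop follows. The actual query appears only in the screenshots, so `count_pending` is a hypothetical callable supplied by the caller, and the polling interval and timeout are arbitrary assumptions.

```python
import time
from typing import Callable, List


def watch_pending(count_pending: Callable[[], int],
                  interval_s: float = 60.0,
                  timeout_s: float = 8 * 3600.0,
                  sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Poll count_pending() until it returns 0 or timeout_s has been waited.

    Returns every observed count so the drop-off can be inspected afterwards.
    """
    observed: List[int] = []
    waited = 0.0
    while True:
        observed.append(count_pending())
        # Stop when the queue is drained or the waiting budget is used up.
        if observed[-1] == 0 or waited >= timeout_s:
            return observed
        sleep(interval_s)
        waited += interval_s
```

A count that stays flat after the restart would suggest that the pending tasks are not being picked up, which is the symptom this bug describes; a steadily falling count matches the verified behavior.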

Comment 9 jcallaha 2017-01-25 03:45:16 UTC
Created attachment 1244141
screenshot 1

Comment 10 jcallaha 2017-01-25 03:45:50 UTC
Created attachment 1244142
screenshot 2

Comment 12 errata-xmlrpc 2017-01-26 10:42:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0197