Created attachment 1233935 [details]
Status of remote execution jobs before attempting re-run

Description of problem:
Remote execution job service rhsmcertd ran on 835 hosts. When the customer tried to rerun the failed jobs, the error message "Oops, we're sorry but something went wrong stack level too deep" appeared.

Version-Release number of selected component (if applicable):
Satellite 6.2.6. Was occurring with 6.2.4 as well.

How reproducible:
Attempt to re-run failed remote execution jobs with more than a handful of hosts.

Actual results:
Error page

Expected results:
Jobs re-run

Additional info:
Additional info from customer - error seems to happen when there are more than 240 failed jobs that need to be re-run.
Couldn't reproduce this bug. Is it possible to attach foreman-debug output?
What build are you trying to reproduce on? One of the errata notes in 6.2.7 led me to believe that this may be fixed already.
On latest snap. What did you see there?
From the 6.2.7 notes:

* Remote Execution against many hosts was causing errors to appear. This case is now handled correctly. (BZ#1367606, BZ#1372708)

The customer is upgrading to 6.2.7 this Friday. I'm curious to see if the errors go away.
Sounds promising. Let's wait for Friday and see if it helps. If it doesn't, please attach foreman-debug to this BZ. Thanks!
Yep, that was my Evil Plan. ;-)
Connecting redmine issue http://projects.theforeman.org/issues/18316 from this bug
*** Bug 1441119 has been marked as a duplicate of this bug. ***
Verification steps:

1. Prepare a large number of fake hosts:

cat <<END | bundle exec rails console
User.current = User.first
group = Hostgroup.unscoped.find_or_create_by(:name => 'fakes')
group.save
location = Location.first
organization = Organization.first
group.locations << location
group.organizations << organization
1000.times do |i|
  i = i+11
  puts i
  h = Host.new(:name => "host-#{i+10}.sat.test")
  h.hostgroup_id = group.id
  h.organization = organization
  h.location = location
  h.save!
end
END

2. Run any job against the `~ test` hosts.
3. Wait until it fails.
4. Use 'Rerun failed'.
5. The form is displayed properly and running the job works (after selecting the type of query, which we track in https://bugzilla.redhat.com/show_bug.cgi?id=1481981).
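
For anyone repeating step 1, it may help to preview which host names the console script creates before touching the database. The following is only a sketch in plain Ruby: `fake_host_names` is a hypothetical helper (not part of Foreman or Satellite) that mirrors the naming arithmetic of the script above and needs no Rails environment.

```ruby
# Hypothetical helper, not part of Foreman: reproduces the naming
# arithmetic of the verification script without creating any hosts.
def fake_host_names(count)
  count.times.map do |i|
    i = i + 11                 # same offset the script applies before printing
    "host-#{i + 10}.sat.test"  # same name pattern the script passes to Host.new
  end
end

names = fake_host_names(1000)
puts names.first
puts names.last
puts names.size
```

With a count of 1000, as in the script, the names run from host-21.sat.test to host-1020.sat.test, which is well above the roughly 240 failed hosts reported to trigger the error.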
Created attachment 1314000 [details] Screenshot after verification
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336