Bug 1406489 - 500 Error: Remote Execution fails upon re-run of an existing job
Summary: 500 Error: Remote Execution fails upon re-run of an existing job
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Remote Execution
Version: 6.2.6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: Unspecified
Assignee: Shimon Shtein
QA Contact: Ivan Necas
URL:
Whiteboard:
Duplicates: 1441119
Depends On:
Blocks:
Reported: 2016-12-20 16:55 UTC by Marc Richter
Modified: 2021-03-11 14:52 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-02-21 16:54:37 UTC
Target Upstream Version:
Embargoed:


Attachments
Screenshot after verification (40.67 KB, image/png)
2017-08-16 08:42 UTC, Ivan Necas


Links
Foreman Issue Tracker 18316 (priority: Normal, status: Closed, last updated: 2020-11-17 13:38:21 UTC)
Summary: Rerun job with failed hosts fails with "Stack level too deep" in the log.

Description Marc Richter 2016-12-20 16:55:36 UTC
Created attachment 1233935
Status of remote execution jobs before attempting re-run

Description of problem: A remote execution job (service rhsmcertd) ran on 835 hosts. When the customer tried to re-run the failed jobs, an error page appeared with the message "Oops, we're sorry but something went wrong" and the error "stack level too deep".


Version-Release number of selected component (if applicable): Satellite 6.2.6. Was occurring with 6.2.4 as well.


How reproducible: Attempt to re-run failed remote execution jobs with more than a handful of hosts



Actual results:
Error page

Expected results:
Jobs re-run

Additional info:

Comment 1 Marc Richter 2016-12-20 16:58:56 UTC
Additional info from customer - error seems to happen when there are more than 240 failed jobs that need to be re-run.
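
Illustrative note: this report does not describe the root cause, so the following is only a minimal sketch of one common way Ruby raises "stack level too deep" once an input grows past a certain size. It assumes, hypothetically, that a structure is built with one nesting level per failed host and then walked recursively. None of the names below (`nested_or`, `count_terms_recursive`, `count_terms_iterative`, `recursion_overflows?`) come from Foreman or its libraries.

```ruby
# Hypothetical illustration only; this is not Foreman code.

# Build a right-nested "or" tree from a list of terms, for example
# ["a", "b", "c"] becomes [:or, "a", [:or, "b", "c"]].
def nested_or(terms)
  terms.reverse.reduce(nil) do |tree, term|
    tree.nil? ? term : [:or, term, tree]
  end
end

# Recursive walk: uses one stack frame per nesting level, so the depth of
# the call stack grows linearly with the number of terms.
def count_terms_recursive(node)
  return 1 unless node.is_a?(Array)
  count_terms_recursive(node[1]) + count_terms_recursive(node[2])
end

# Iterative walk: keeps pending nodes in an explicit array, so the call
# stack stays flat regardless of the number of terms.
def count_terms_iterative(node)
  pending = [node]
  count = 0
  until pending.empty?
    current = pending.pop
    if current.is_a?(Array)
      pending.push(current[1], current[2])
    else
      count += 1
    end
  end
  count
end

# Returns true if the recursive walk exhausts the stack for this many terms.
# The threshold depends on the Ruby build and its stack size settings.
def recursion_overflows?(term_count)
  tree = nested_or(Array.new(term_count) { |i| "name = host-#{i}.sat.test" })
  count_terms_recursive(tree)
  false
rescue SystemStackError
  true
end
```

In a sketch like this the failure threshold depends on how many stack frames each nesting level costs, which would be consistent with an error that only shows up above a certain number of failed hosts. Whether that matches the actual defect is an assumption.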

Comment 3 Shimon Shtein 2017-01-31 09:29:08 UTC
Couldn't reproduce this bug.
Is it possible to attach foreman-debug output?

Comment 4 Marc Richter 2017-01-31 14:59:30 UTC
What build are you trying to reproduce on? One of the errata notes in 6.2.7 led me to believe that this may be fixed already.

Comment 5 Shimon Shtein 2017-01-31 16:16:37 UTC
On latest snap. What did you see there?

Comment 6 Marc Richter 2017-01-31 16:20:09 UTC
From the 6.2.7 notes:

* Remote Execution against many hosts was causing errors to appear. This 
case is now handled correctly. (BZ#1367606, BZ#1372708) 

Customer is upgrading to 6.2.7 this Friday. I'm curious to see if the errors go away.

Comment 7 Shimon Shtein 2017-01-31 16:30:47 UTC
Sounds promising.
Let's wait for Friday and see if it helps.
If it doesn't, please attach foreman-debug to this BZ.

Thanks!

Comment 8 Marc Richter 2017-01-31 16:32:32 UTC
Yep, that was my Evil Plan. ;-)

Comment 12 Shimon Shtein 2017-04-06 06:25:19 UTC
Connecting redmine issue http://projects.theforeman.org/issues/18316 from this bug

Comment 13 Adam Ruzicka 2017-04-11 10:01:06 UTC
*** Bug 1441119 has been marked as a duplicate of this bug. ***

Comment 19 Ivan Necas 2017-08-16 08:41:54 UTC

Verification steps:

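1. Prepare a large number of fake hosts:
cat <<END | bundle exec rails console
# Act as the first user so the records can be saved from the console
User.current = User.first
# Host group for the fake hosts, assigned to the first location and organization
group = Hostgroup.unscoped.find_or_create_by(:name => 'fakes')
group.save
location = Location.first
organization = Organization.first
group.locations << location
group.organizations << organization
# Creates host-21.sat.test through host-1020.sat.test
# (i is offset by 11 here and by a further 10 in the name)
1000.times do |i|
  i = i+11
  puts i
  h = Host.new(:name => "host-#{i+10}.sat.test")
  h.hostgroup_id = group.id
  h.organization = organization
  h.location = location
  h.save!
end
END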
2. run any job against the `~ test` hosts
3. wait until it fails
4. use 'Rerun failed'
5. The form is displayed properly and running the job works (after selecting the type of query, which is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1481981).
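
For anyone repeating these steps, it may help to know exactly which host names the script in step 1 produces, for example to search for them or remove them afterwards. The following is a small sketch that mirrors the loop's arithmetic in plain Ruby; `fake_host_names` and `FAKE_HOST_COUNT` are hypothetical names introduced here, and the cleanup line in the trailing comment is an untested assumption.

```ruby
# Hypothetical helper that mirrors the naming logic of the loop in step 1.
FAKE_HOST_COUNT = 1000

def fake_host_names(count = FAKE_HOST_COUNT)
  (0...count).map do |i|
    i += 11                      # same offset as "i = i+11" in the script
    "host-#{i + 10}.sat.test"    # same name pattern as in the script
  end
end

names = fake_host_names
puts names.first
puts names.last
puts names.size

# Assumption, untested: in a Rails console on a disposable test system,
# the fake hosts could be removed again with something like
#   Host.where(:name => fake_host_names).destroy_all
```

Because of the two offsets, the names start at host-21.sat.test rather than host-0 or host-1.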

Comment 20 Ivan Necas 2017-08-16 08:42:22 UTC
Created attachment 1314000
Screenshot after verification

Comment 21 Satellite Program 2018-02-21 16:54:37 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336

