Bug 1665470
| Summary: | Dynflow executor termination may hang if there is an action which keeps the executor occupied | | |
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Ivan Necas <inecas> |
| Component: | Tasks Plugin | Assignee: | satellite6-bugs <satellite6-bugs> |
| Status: | CLOSED ERRATA | QA Contact: | jcallaha |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.3.0 | CC: | aruzicka, inecas, jcallaha, zhunting |
| Target Milestone: | 6.4.2 | Keywords: | Triaged |
| Target Release: | Unused | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | tfm-rubygem-dynflow-1.0.5.3-1 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1654975 | Environment: | |
| Last Closed: | 2019-02-13 19:08:21 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1654975 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Comment 10
jcallaha
2019-01-27 02:04:38 UTC
Created attachment 1524247
ongoing create
Totally forgot to attach the screenshot!
No, I see no evidence of any attempt to restart the executor.
However, I only know the dynflow_executor log location. If you have a more relevant one, I can check that.
As of now, the task is still "going".
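To illustrate the failure mode named in the bug summary, here is a minimal Ruby sketch of a worker that stays occupied and a shutdown that waits for it with a time limit. It is not Dynflow code: `terminate_with_timeout` and the `:paused` / `:finished` labels are hypothetical names chosen for this example, and the actual fix in Dynflow may work differently.

```ruby
# Hypothetical helper, not part of Dynflow: wait for a worker thread for at
# most `timeout` seconds, then give up on it instead of blocking forever.
def terminate_with_timeout(worker, timeout)
  if worker.join(timeout)   # Thread#join returns nil when the limit expires
    :finished
  else
    worker.kill             # stop the occupied worker
    worker.join             # returns promptly once the thread is dead
    :paused                 # illustrative label only
  end
end

stuck = Thread.new { sleep }   # sleep without an argument never returns
quick = Thread.new { :done }

puts terminate_with_timeout(stuck, 0.2)
puts terminate_with_timeout(quick, 0.2)
```

A plain `worker.join` with no limit would block indefinitely on the `stuck` thread, which is the kind of hang the summary describes.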
With `sleep` in place, the tasks will not restart on their own: the memory limit needs to be set accordingly and the threshold needs to be reached. So while doing https://bugzilla.redhat.com/show_bug.cgi?id=1665470#c9, the reproducer steps from https://bugzilla.redhat.com/show_bug.cgi?id=1654217#c0 need to be performed as well to see the behavior when the memory recycler restarts the executor.
So the reproducer steps should be:
1. set up the memory limit
2. follow https://bugzilla.redhat.com/show_bug.cgi?id=1665470#c9 to simulate the stuck task
3. finish the reproducer from https://bugzilla.redhat.com/show_bug.cgi?id=1654217#c0 to hit the memory limit
Expectation: the dynflowd service would get restarted, and the stuck task would eventually end up in a paused state.
Verified in Satellite 6.4.2 Snap 1.
Followed the revised steps outlined in #13.
The memory limit was reached after publishing 10 content views and performing validation syncs on 6 RHEL repositories.
In the log, I can see that the executor reaches its limit and is then restarted after the error.
The product create task was then moved to a paused state.
E, [2019-02-06T15:36:09.507961 #2769] ERROR -- /parallel-executor-core: cannot accept event: Dynflow::Director::Event[execution_plan_id: 03e8929b-e814-4175-9f8e-08c7e8876351, step_id: 159, event: Dynflow::Action::Polling::Poll, result: <#Concurrent::Edge::CompletableFuture:0x7f7aa63285e8 pending>] core is terminating (Dynflow::Error)
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.0.5.5/lib/dynflow/executors/parallel/core.rb:40:in `handle_event'
...
/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
World has been terminatedExiting
Starting Rails environment
Starting dynflow with the following options: {:rails_root=>"/usr/share/foreman", :process_name=>"dynflow_executor", :pid_dir=>"/usr/share/foreman/tmp/pids", :log_dir=>"/usr/share/foreman/log", :wait_attempts=>300, :wait_sleep=>1, :executors_count=>1, :memory_limit=>419430400.0, :memory_init_delay=>60, :memory_polling_interval=>60}
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/concerns/content_facet_host_extensions.rb:7: warning: already initialized constant Katello::Concerns::ContentFacetHostExtensions::ERRATA_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/concerns/content_facet_host_extensions.rb:7: warning: previous definition of ERRATA_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/concerns/content_facet_host_extensions.rb:14: warning: already initialized constant Katello::Concerns::ContentFacetHostExtensions::TRACE_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/concerns/content_facet_host_extensions.rb:14: warning: previous definition of TRACE_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/concerns/subscription_facet_host_extensions.rb:13: warning: already initialized constant Katello::Concerns::SubscriptionFacetHostExtensions::SUBSCRIPTION_STATUS_MAP
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/concerns/subscription_facet_host_extensions.rb:13: warning: previous definition of SUBSCRIPTION_STATUS_MAP was here
/opt/theforeman/tfm/root/usr/share/gems/gems/foreman_docker-4.1.0/app/controllers/api/v2/containers_controller.rb:107: warning: constant ::Fixnum is deprecated
Everything ready for world: cd0889f8-aff2-4d01-b98d-6510c25c6e7c
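For readers checking the restart options in the log above, the following Ruby sketch converts the logged `:memory_limit` to MiB and shows a simple threshold comparison. Only the option values come from the log; `limit_in_mib` and `restart_due?` are hypothetical helpers, and the assumptions that the limit is in bytes, that the delay and interval are in seconds, and that a restart is due once resident memory exceeds the limit are illustrative, not confirmed by this report.

```ruby
# Option values copied from the log; units are assumptions (bytes, seconds).
OPTIONS = {
  memory_limit: 419_430_400.0,
  memory_init_delay: 60,
  memory_polling_interval: 60
}.freeze

BYTES_PER_MIB = 1024 * 1024

# Hypothetical helper: express the configured limit in MiB.
def limit_in_mib(options)
  options[:memory_limit] / BYTES_PER_MIB
end

# Hypothetical helper: assumed rule that a restart is due once the
# resident memory (in bytes) exceeds the configured limit.
def restart_due?(rss_bytes, options)
  rss_bytes > options[:memory_limit]
end

puts limit_in_mib(OPTIONS)                       # 400.0
puts restart_due?(450 * BYTES_PER_MIB, OPTIONS)  # true
puts restart_due?(300 * BYTES_PER_MIB, OPTIONS)  # false
```

Under these assumptions the configured limit corresponds to 400 MiB.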
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0345