Bug 2154917

Summary: foreman-maintain restart does not always ensure all services are started
Product: Red Hat Satellite
Component: Satellite Maintain
Version: 6.11.4
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: 6.14.0
Target Release: Unused
Reporter: Pavel Moravec <pmoravec>
Assignee: Eric Helms <ehelms>
QA Contact: Griffin Sullivan <gsulliva>
CC: aruzicka, egolov, ehelms, gsulliva, pcreech
Keywords: Triaged
Fixed In Version: foreman-maintain-1.3.2
Last Closed: 2023-11-08 14:18:10 UTC
Type: Bug

Description Pavel Moravec 2022-12-19 15:39:55 UTC
Description of problem:
When only the foreman service is restarted, any new task invoked after the restart stays stuck in the planned/pending state forever.


Version-Release number of selected component (if applicable):
6.11.4
6.12


How reproducible:
100%


Steps to Reproduce:
1. Run a REX job (say, execute the "date" command) just to see it works well.
2. satellite-maintain service restart --only=foreman.service
3. Repeat 1
4. satellite-maintain service restart --only=dynflow-sidekiq@orchestrator,dynflow-sidekiq@worker-1,dynflow-sidekiq@worker-2,dynflow-sidekiq@worker-hosts-queue-1
5. Repeat 1


Actual results:
3. The job gets stuck forever.
5. Jobs and tasks start to work again (including those invoked in step 3).


Expected results:
3. tasks are completed as usual


Additional info:
Observation: Restarting only the dynflow orchestrator service is not sufficient.
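
To see which services are actually down after step 2, the state of each unit can be queried with `systemctl is-active`. The following is a minimal sketch in Python, not part of satellite-maintain; the helper names `systemctl_state` and `units_not_active` are made up for illustration, and the unit list is simply copied from step 4, so it may need adjusting on systems with a different worker layout.

```python
import subprocess

# Unit names copied from step 4 of the reproducer (assumption: default worker layout).
DYNFLOW_UNITS = [
    "dynflow-sidekiq@orchestrator",
    "dynflow-sidekiq@worker-1",
    "dynflow-sidekiq@worker-2",
    "dynflow-sidekiq@worker-hosts-queue-1",
]


def systemctl_state(unit):
    """Return the state printed by `systemctl is-active UNIT`, e.g. 'active' or 'inactive'."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,  # is-active exits non-zero when the unit is not active
    )
    return result.stdout.strip()


def units_not_active(units, get_state=systemctl_state):
    """Return {unit: state} for every unit whose reported state is not 'active'.

    `get_state` is injectable so the logic can be exercised without systemd.
    """
    states = {unit: get_state(unit) for unit in units}
    return {unit: state for unit, state in states.items() if state != "active"}
```

Calling `units_not_active(["foreman"] + DYNFLOW_UNITS)` on the Satellite host after step 2 should return an empty dict on a healthy system; on an affected system it would be expected to list the dynflow-sidekiq units.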

Comment 1 Adam Ruzicka 2023-01-09 09:24:42 UTC
dynflow-sidekiq@* services are tied to the foreman service using the PartOf stanza on the systemd service definition level. For some reason, satellite-maintain service restart --only foreman.service brings down the entire group, but then starts only foreman. Is it possible that satellite-maintain does a stop followed by a restart instead of a direct restart?

Native systemctl restart foreman does not suffer from this issue.
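
The difference between the two sequences can be illustrated with a toy model. This is only a sketch in Python of the propagation rule documented for PartOf= (stop and restart of the listed unit propagate to the dependent unit, start does not); it is not how systemd or satellite-maintain is implemented, `apply_action` is a hypothetical helper, and only two of the dynflow units are modelled. It also assumes that a propagated restart leaves already stopped dependents stopped.

```python
# Toy model of PartOf= propagation; not systemd itself.
# Maps each dependent unit to the unit it is PartOf.
PART_OF = {
    "dynflow-sidekiq@orchestrator": "foreman",
    "dynflow-sidekiq@worker-1": "foreman",
}


def apply_action(active, action, unit):
    """Return the set of active units after performing `action` on `unit`."""
    active = set(active)
    dependents = {dep for dep, parent in PART_OF.items() if parent == unit}
    if action == "stop":
        # stop propagates: the unit and everything that is PartOf it go down
        return active - ({unit} | dependents)
    if action == "start":
        # start does NOT propagate through PartOf=: only the unit itself comes up
        return active | {unit}
    if action == "restart":
        # restart propagates: running dependents are restarted and stay active
        # (assumption: dependents that were already stopped stay stopped)
        return active | {unit}
    raise ValueError(f"unknown action: {action}")


ALL_UNITS = {"foreman"} | set(PART_OF)

# Hypothesised satellite-maintain behaviour: stop, then start
after_stop_start = apply_action(apply_action(ALL_UNITS, "stop", "foreman"), "start", "foreman")
# Native behaviour: a single restart
after_restart = apply_action(ALL_UNITS, "restart", "foreman")

print(sorted(after_stop_start))  # ['foreman']
print(sorted(after_restart))     # ['dynflow-sidekiq@orchestrator', 'dynflow-sidekiq@worker-1', 'foreman']
```

In this model the stop-then-start sequence leaves only foreman running, which matches the symptom in the description, while a direct restart keeps the whole group up.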

Comment 2 Adam Ruzicka 2023-01-09 09:48:32 UTC
Alternatively, setting WantedBy=foreman.service on dynflow-sidekiq@* seems to work too, although the services have to be re-enabled in order for the changes to propagate.

Comment 3 Evgeni Golov 2023-01-10 12:44:49 UTC
IMHO `maintain service restart` should use `systemctl restart`

Comment 5 Eric Helms 2023-01-26 13:50:07 UTC
*** Bug 2067120 has been marked as a duplicate of this bug. ***

Comment 7 Eric Helms 2023-06-05 13:20:53 UTC
Created redmine issue https://projects.theforeman.org/issues/36467 from this bug

Comment 8 Bryan Kearney 2023-06-05 16:02:48 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/36467 has been resolved.

Comment 9 Griffin Sullivan 2023-06-21 16:11:10 UTC
Verified on 6.14 snap 4

satellite-maintain service restart --only=foreman.service is correctly restarting services

Steps:
1. Run a REX job (say, execute the "date" command) just to see it works well.
2. satellite-maintain service restart --only=foreman.service
3. Repeat 1

Results:
Both executions are successful.

Comment 12 errata-xmlrpc 2023-11-08 14:18:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.14 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6818