Bug 2154917

Summary:	foreman-maintain restart does not always ensure all services are started
Product:	Red Hat Satellite	Reporter:	Pavel Moravec <pmoravec>
Component:	Satellite Maintain	Assignee:	Eric Helms <ehelms>
Status:	CLOSED ERRATA	QA Contact:	Griffin Sullivan <gsulliva>
Severity:	high	Docs Contact:
Priority:	high
Version:	6.11.4	CC:	aruzicka, egolov, ehelms, gsulliva, pcreech
Target Milestone:	6.14.0	Keywords:	Triaged
Target Release:	Unused
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	foreman-maintain-1.3.2	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2023-11-08 14:18:10 UTC	Type:	Bug

Description Pavel Moravec 2022-12-19 15:39:55 UTC

Description of problem:
When only the foreman service is restarted with satellite-maintain, any new task invoked after the restart stays in the planned/pending state forever.


Version-Release number of selected component (if applicable):
6.11.4
6.12


How reproducible:
100%


Steps to Reproduce:
1. Run a REX job (say, execute the "date" command) just to confirm it works.
2. satellite-maintain service restart --only=foreman.service
3. Repeat 1
4. satellite-maintain service restart --only=dynflow-sidekiq@orchestrator,dynflow-sidekiq@worker-1,dynflow-sidekiq@worker-2,dynflow-sidekiq@worker-hosts-queue-1
5. Repeat 1


Actual results:
3. The job gets stuck forever.
5. Jobs and tasks start to work again (including those invoked in step 3).


Expected results:
3. Tasks complete as usual.


Additional info:
Observation: Restarting only the dynflow orchestrator service is not sufficient; the worker services listed in step 4 have to be restarted as well.
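
To see which services were left down after step 2, a small check along these lines may help. This is only a sketch: the function name inactive_units and the SYSTEMCTL override variable are hypothetical and are not part of satellite-maintain; the only real call used is `systemctl is-active --quiet`.

```shell
#!/bin/sh
# Hypothetical helper, not part of satellite-maintain.
# SYSTEMCTL is an assumed override hook for dry runs; it defaults to systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Print every unit from the argument list that is not currently active.
inactive_units() {
    for unit in "$@"; do
        # `systemctl is-active --quiet UNIT` exits 0 only when UNIT is active.
        if ! "$SYSTEMCTL" is-active --quiet "$unit"; then
            printf '%s\n' "$unit"
        fi
    done
}

# Example call (unit names taken from step 4 of this report):
#   inactive_units foreman.service dynflow-sidekiq@orchestrator \
#       dynflow-sidekiq@worker-1 dynflow-sidekiq@worker-2 \
#       dynflow-sidekiq@worker-hosts-queue-1
```

On an affected system, running the example call right after step 2 would be expected to list the dynflow-sidekiq@ units; empty output means every listed unit is active.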

Comment 1 Adam Ruzicka 2023-01-09 09:24:42 UTC

dynflow-sidekiq@* services are tied to the foreman service using the PartOf stanza on the systemd service definition level. For some reason, satellite-maintain service restart --only foreman.service brings down the entire group, but then starts only foreman. Is it possible that satellite-maintain does a stop followed by a restart instead of a direct restart?

Native systemctl restart foreman does not suffer from this issue.
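
For context: per the systemd.unit documentation, PartOf= propagates only stop and restart actions from the listed unit, not start. So if satellite-maintain really does stop and then start foreman (an unconfirmed guess at this point), the dynflow-sidekiq@* units would be stopped along with it and never started again. The relationship can be inspected with something like the sketch below; the function names part_of and follows and the SYSTEMCTL override are hypothetical, while `systemctl show -p PartOf` is the real call.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting PartOf= relationships.
# SYSTEMCTL is an assumed override hook for dry runs; it defaults to systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Print the units that the given unit declares itself PartOf (may be empty).
part_of() {
    # `systemctl show -p PartOf UNIT` prints a line of the form "PartOf=<units>".
    line="$("$SYSTEMCTL" show -p PartOf "$1")"
    printf '%s\n' "${line#PartOf=}"
}

# Succeed if a stop or restart of unit $2 would propagate to unit $1 via PartOf=.
follows() {
    case " $(part_of "$1") " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call:
#   follows dynflow-sidekiq@orchestrator.service foreman.service && echo "tied to foreman"
```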

Comment 2 Adam Ruzicka 2023-01-09 09:48:32 UTC

Alternatively, setting WantedBy=foreman.service on dynflow-sidekiq@* seems to work too, although the services have to be re-enabled in order for the changes to propagate.

Comment 3 Evgeni Golov 2023-01-10 12:44:49 UTC

IMHO `maintain service restart` should use `systemctl restart`

Comment 5 Eric Helms 2023-01-26 13:50:07 UTC

*** Bug 2067120 has been marked as a duplicate of this bug. ***

Comment 7 Eric Helms 2023-06-05 13:20:53 UTC

Created redmine issue https://projects.theforeman.org/issues/36467 from this bug

Comment 8 Bryan Kearney 2023-06-05 16:02:48 UTC

Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/36467 has been resolved.

Comment 9 Griffin Sullivan 2023-06-21 16:11:10 UTC

Verified on 6.14 snap 4

satellite-maintain service restart --only=foreman.service is correctly restarting services

Steps:
1. Run a REX job (say, execute the "date" command) just to confirm it works.
2. satellite-maintain service restart --only=foreman.service
3. Repeat 1

Results:
Both executions are successful.

Comment 12 errata-xmlrpc 2023-11-08 14:18:10 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.14 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6818