Bug 2234512

Summary: Postfix unit start/restart times out when the queue_directory parameter is changed
Product: Red Hat Enterprise Linux 8 Reporter: Shreyas Mahangade <smahanga>
Component: postfix Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED MIGRATED QA Contact: František Hrdina <fhrdina>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.7 CC: fhrdina
Target Milestone: rc Keywords: MigratedToJIRA
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-09-21 20:24:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Shreyas Mahangade 2023-08-24 17:07:56 UTC
* Description of problem:
 =======================

If the queue_directory parameter is set to a non-default path, starting or restarting Postfix through systemd (`systemctl start postfix` or `systemctl restart postfix`) fails with a timeout.

~~~
[root@localhost postfix]# systemctl start postfix
Job for postfix.service failed because a timeout was exceeded.
See "systemctl status postfix.service" and "journalctl -xe" for details.

[root@localhost postfix]# systemctl status postfix.service
● postfix.service - Postfix Mail Transport Agent
   Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled; vendor preset: disabled)
   Active: failed (Result: timeout) since Fri 2023-08-18 05:40:23 BST; 1min 9s ago
  Process: 62423 ExecStart=/usr/sbin/postfix start (code=exited, status=0/SUCCESS)
  Process: 62421 ExecStartPre=/usr/libexec/postfix/chroot-update (code=exited, status=0/SUCCESS)
  Process: 62418 ExecStartPre=/usr/libexec/postfix/aliasesdb (code=exited, status=0/SUCCESS)
  Process: 62416 ExecStartPre=/usr/sbin/restorecon -R /var/spool/postfix/pid/master.pid (code=exited, status=0/SUCCESS)

Aug 18 05:40:23 localhost systemd[1]: postfix.service: start operation timed out. Terminating.
Aug 18 05:40:23 localhost systemd[1]: postfix.service: Failed with result 'timeout'.
Aug 18 05:40:23 localhost systemd[1]: Failed to start Postfix Mail Transport Agent.
~~~

* Version-Release number of selected component (if applicable):
 =============================================================

Reproducible on RHEL 8 and RHEL 9, but not on RHEL 7.

* How reproducible:
 =================

Always

* Steps to Reproduce:
  ==================

1. Create a custom queue directory and make sure the correct permissions and SELinux context are set.

~~~
# mkdir /mailqueue
~~~

2. Change the parameter in /etc/postfix/main.cf

~~~
#queue_directory = /var/spool/postfix
queue_directory = /mailqueue
~~~

3. Restart postfix
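
The preparation in steps 1 and 2 can be sketched as a dry run that only prints the commands, so they can be reviewed before being run as root. The helper name `prep_queue_dir` is hypothetical, and using an SELinux equivalence rule (`semanage fcontext -a -e`) to label the new directory like the default queue is an assumption; the report does not say how the context was set.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands that prepare a
# relocated Postfix queue directory.
# prep_queue_dir NEW_DIR [OLD_DIR]   (hypothetical helper, not part of postfix)
prep_queue_dir() {
    new_dir=${1%/}                      # strip one trailing slash, if any
    old_dir=${2:-/var/spool/postfix}    # default queue directory
    echo "mkdir -p $new_dir"
    # Assumption: label the new path like the default queue directory
    echo "semanage fcontext -a -e $old_dir $new_dir"
    echo "restorecon -Rv $new_dir"
    # Same change as editing queue_directory in /etc/postfix/main.cf by hand
    echo "postconf -e queue_directory=$new_dir"
}

prep_queue_dir /mailqueue
```

Running `postfix check` afterwards reports remaining ownership or permission problems in the new directory.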

* Actual results:
  ==============

- The unit fails with result 'timeout'.


* Expected results:
  ================

- Postfix starts without issue.

* Additional info:
 ================

=> Not reproducible on RHEL 7; only on RHEL 8 and RHEL 9.

=> After queue_directory is changed, postfix writes its PID file under $queue_directory/pid/, but systemd still waits for /var/spool/postfix/pid/master.pid. The unit therefore stays in the activating state and fails once the start timeout (90 seconds) is reached.

=> Postfix can still be started manually with the `postfix start` command.

=> Workaround:

1. Copy the unit file to /etc/systemd/system/, which is the recommended location for locally modified unit files:

~~~
# cp /usr/lib/systemd/system/postfix.service /etc/systemd/system/postfix.service
# vi /etc/systemd/system/postfix.service
~~~

2. Change the following lines:

~~~
PIDFile=/mailqueue/pid/master.pid
ExecStartPre=-/usr/sbin/restorecon -R /mailqueue/pid/master.pid
~~~

3. Save the file.

4. Reload the systemd daemon and restart the service:

# systemctl daemon-reload
# systemctl restart postfix
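
To confirm the mismatch described above, or to check that the workaround took effect, the PID file path derived from queue_directory can be compared with the one systemd is watching. This is a minimal sketch; the function names `expected_pidfile` and `pidfile_matches` are hypothetical helpers, and the live-system commands are left as comments.

```shell
#!/bin/sh
# expected_pidfile QUEUE_DIR
# Print the path where postfix writes master.pid for this queue directory.
expected_pidfile() {
    printf '%s/pid/master.pid\n' "${1%/}"
}

# pidfile_matches QUEUE_DIR UNIT_PIDFILE
# Succeed only if the unit's PIDFile= points at that same path.
pidfile_matches() {
    [ "$(expected_pidfile "$1")" = "$2" ]
}

# On a live system (as root):
#   queue_dir=$(postconf -h queue_directory)
#   unit_pid=$(systemctl show -p PIDFile --value postfix.service)
#   pidfile_matches "$queue_dir" "$unit_pid" || echo "PIDFile mismatch: $unit_pid"
```

With the values from this report, `pidfile_matches /mailqueue /var/spool/postfix/pid/master.pid` fails before the workaround, and `pidfile_matches /mailqueue /mailqueue/pid/master.pid` succeeds after it.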

Comment 1 Jaroslav Škarvada 2023-08-31 12:37:40 UTC
Thanks for the information. I am afraid there is not much we can do about it, especially if we don't want to diverge from postfix upstream. Such a custom setup already requires some manual tuning (at least setting the SELinux labels), so we could simply document that a custom systemd service file or a systemd override is required in this case.

Another possible solution would be to have systemd start postfix in the foreground and ignore the PIDs. This could solve other problems that come from using a forking service, but it could also introduce new ones, so I wouldn't recommend going this way in stable RHEL. We could change this in future RHEL releases.
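
For reference, the systemd override variant could look like the sketch below: a drop-in that changes only the PID file location instead of copying the whole unit file. The helper `make_dropin` and the file name `queue-dir.conf` are hypothetical. The extra `ExecStartPre=` line is appended to the ones already in the packaged unit rather than replacing them, so the original restorecon call on the old path still runs (it is prefixed with `-`, so a failure there is ignored).

```shell
#!/bin/sh
# make_dropin QUEUE_DIR
# Print a systemd drop-in that points postfix.service at the relocated PID file.
make_dropin() {
    pidfile="${1%/}/pid/master.pid"
    printf '[Service]\n'
    printf 'PIDFile=%s\n' "$pidfile"
    # Appended to the unit's existing ExecStartPre= entries
    printf 'ExecStartPre=-/usr/sbin/restorecon -R %s\n' "$pidfile"
}

# Possible usage as root (the file name is an assumption):
#   mkdir -p /etc/systemd/system/postfix.service.d
#   make_dropin /mailqueue > /etc/systemd/system/postfix.service.d/queue-dir.conf
#   systemctl daemon-reload
#   systemctl restart postfix
```

`systemctl edit postfix` opens an equivalent override file in an editor, so the three generated lines can also be pasted in there by hand.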

Comment 2 RHEL Program Management 2023-09-21 20:21:49 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 3 RHEL Program Management 2023-09-21 20:24:54 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.
