Bug 1058654
| Summary: | systemd kills and restarts postgrey repeatedly | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Joel Uckelman <uckelman> |
| Component: | postgrey | Assignee: | Christopher Meng <i> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 20 | CC: | i, mark, matthias, scarlet, vg.aetera |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | postgrey-1.34-12.fc20 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1055638 | Environment: | |
| Last Closed: | 2014-02-15 06:57:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Joel Uckelman
2014-01-28 09:44:38 UTC
I do change the service file with more strict rules for f20+. One of the modification in the service file: +TimeoutSec=5 This will try to restart the service if timed out. I'm not sure why you failed to start the service. Looking into the log I saw that it was trying to setuid and gid, yes I added these in f20, I thought it was useful but now it seems that there probably exists a problem... Can you do these in the service file and report again? 1. Comment out the restart timeout line mentioned above, then try starting the service. 2. Comment out --group=postgrey \ --user=postgrey \ And try again (In reply to Christopher Meng from comment #1) > I do change the service file with more strict rules for f20+. > > One of the modification in the service file: > > +TimeoutSec=5 > > This will try to restart the service if timed out. I'm not sure why you > failed to start the service. > > Looking into the log I saw that it was trying to setuid and gid, yes I added > these in f20, I thought it was useful but now it seems that there probably > exists a problem... > > Can you do these in the service file and report again? > > 1. Comment out the restart timeout line mentioned above, then try starting > the service. I commented out TimeoutSec in /usr/lib/systemd/system/postgrey.service. Then: systemctl dameon-reload systemctl stop postgrey systemctl start postgrey 'systemctl start postgrey' waits a long time and then reports failure: "Job for postgrey.service failed. See 'systemctl status postgrey.service' and 'journalctl -xn' for details." [root@one ~]# journalctl -xn -- Logs begin at Sat 2014-01-25 13:51:44 CET, end at Tue 2014-01-28 11:39:28 CET Jan 28 11:39:27 one postgrey[28130]: 2014/01/28-11:39:27 Server closing! Jan 28 11:39:27 one systemd[1]: Failed to start Postfix Greylisting Service. -- Subject: Unit postgrey.service has failed -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit postgrey.service has failed. -- -- The result is failed. Jan 28 11:39:27 one systemd[1]: Unit postgrey.service entered failed state. Jan 28 11:39:28 one systemd[1]: postgrey.service holdoff time over, scheduling r Jan 28 11:39:28 one systemd[1]: Stopping Postfix Greylisting Service... -- Subject: Unit postgrey.service has begun shutting down -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit postgrey.service has begun shutting down. Jan 28 11:39:28 one systemd[1]: Starting Postfix Greylisting Service... -- Subject: Unit postgrey.service has begun with start-up -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit postgrey.service has begun starting up. Jan 28 11:39:28 one postgrey[28137]: 2014/01/28-11:39:28 postgrey (type Net::Ser Jan 28 11:39:28 one postgrey[28137]: Binding to UNIX socket file "/var/spool/pos Jan 28 11:39:28 one postgrey[28137]: Setting gid to "991 991" Jan 28 11:39:28 one postgrey[28137]: Setting uid to "993" This appears just to make the kill-restart cycle longer. > 2. Comment out > > --group=postgrey \ --user=postgrey \ > > And try again I wasn't sure if you meant #1 and #2 to be applied independently or cumulatively. With #1 and #2 together, I get the same result as with #1 alone. With #2 alone, I get the same kill-restart cycle as I do with no modifications to postgrey.service. Also: In no case do I see a pidfile being written to /var/run. If I start the command from the command line, I get a /var/run/postgrey.pid. But, if I ask systemd to start postgrey, it does not generate /var/run/postgrey.pid, and it seems likely that systemd is believing that postgrey didn't properly start as a result? It looks like it is the --greylist-text option that is causing the problem. In particular, this seems to work: --greylist-text="Greylisted_for_%%s_seconds" \ But, this seems to fail: --greylist-text="Greylisted_for_%%s_seconds" \ I think there is an incorrect expectation around what type of content is allowed to be contained in ExecStartPre=. In particular, this seems to also fail: --greylist-text='Greylisted for %%s seconds' \ But, this WORKS: '--greylist-text=Greylisted for %%s seconds' \ I'm not a systemd expert, but I would say that: --parameter="value" Is only valid as /bin/sh syntax and NOT valid as systemd ExecStartPre= syntax. So, I suggest using this: '--greylist-text=Greylisted for %%s seconds' \ Err typo above, sorry... the second failure should have been: --greylist-text="Greylisted for %%s seconds" \ That is, the original syntax in the current RPM version that fails... And, it's ExecStart=, not ExecStartPre=. Eyes bleary. Sorry for the noise, but didn't want to leave it incorrect. (In reply to Mark Mielke from comment #5) > > So, I suggest using this: > > '--greylist-text=Greylisted for %%s seconds' \ I concur, this does appear to work. The pidfile is created and systemd doesn't get into a loop of killing and restarting postgrey. *** Bug 1059738 has been marked as a duplicate of this bug. *** Hmm... Is it a flaw in systemd? I think that I need to report a new bug against systemd. postgrey-1.34-12.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/postgrey-1.34-12.fc20 Package postgrey-1.34-12.fc20: * should fix your issue, * was pushed to the Fedora 20 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing postgrey-1.34-12.fc20' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2014-2114/postgrey-1.34-12.fc20 then log in and leave karma (feedback). The fix works for me. You can increase this update karma to force pushing to stable if you log in to admin.fedoraproject.org before leaving a feedback. postgrey-1.34-12.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report. |