Bug 2028015
| Summary: | postfix master.pid file not deleted at postfix stop if postfix had been started using postfix start command | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Roger Sewell <roger.sewell> |
| Component: | postfix | Assignee: | Jaroslav Škarvada <jskarvad> |
| Status: | CLOSED ERRATA | QA Contact: | František Hrdina <fhrdina> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 8.4 | CC: | fhrdina, fs3000, jskarvad, lvrabec, mmalik, psklenar, ssekidde, webmaster, zpytela |
| Target Milestone: | rc | Keywords: | AutoVerified, Patch, Reopened, Triaged |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | postfix-3.5.8-4.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-05-10 15:29:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Roger Sewell
2021-12-01 10:16:09 UTC
It seems to be caused by the SELinux policy; try with 'setenforce 0'.

Unfortunately, postfix-script doesn't remove the pid file on stop. It uses flock to test for an exclusive lock and so decide whether the pid file is valid. This is an upstream design decision. The following happens here:
- 'postfix start' creates and locks /var/spool/postfix/pid/master.pid
- 'postfix stop' unlocks /var/spool/postfix/pid/master.pid
- 'systemctl start postfix' runs 'postfix start'
- it runs '/usr/libexec/postfix/postfix-script start'
- it runs '/usr/libexec/postfix/master -t', which tries to exclusively lock /var/spool/postfix/pid/master.pid
- the flock call fails because it's not allowed by the SELinux policy
- '/usr/libexec/postfix/master -t' thinks that the flock fails because the daemon is already running
- the condition propagates back to systemd, which reports the service as failed

Caught AVCs and syscalls:

type=AVC msg=audit(1638995374.054:368): avc: denied { read write } for pid=10520 comm="master" name="master.pid" dev="vda1" ino=14680387 scontext=system_u:system_r:postfix_master_t:s0 tcontext=unconfined_u:object_r:var_run_t:s0 tclass=file permissive=1

type=AVC msg=audit(1638995374.054:368): avc: denied { open } for pid=10520 comm="master" path="/var/spool/postfix/pid/master.pid" dev="vda1" ino=14680387 scontext=system_u:system_r:postfix_master_t:s0 tcontext=unconfined_u:object_r:var_run_t:s0 tclass=file permissive=1

type=SYSCALL msg=audit(1638995374.054:368): arch=c000003e syscall=257 success=yes exit=11 a0=ffffff9c a1=5583463b8200 a2=2 a3=0 items=0 ppid=10513 pid=10520 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="master" exe="/usr/libexec/postfix/master" subj=system_u:system_r:postfix_master_t:s0 key=(null) ARCH=x86_64 SYSCALL=openat AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"

type=AVC msg=audit(1638995374.054:369): avc: denied { getattr } for pid=10520 comm="master" path="/var/spool/postfix/pid/master.pid" dev="vda1" ino=14680387 scontext=system_u:system_r:postfix_master_t:s0 tcontext=unconfined_u:object_r:var_run_t:s0 tclass=file permissive=1

type=SYSCALL msg=audit(1638995374.054:369): arch=c000003e syscall=5 success=yes exit=0 a0=b a1=7fff8d35b030 a2=7fff8d35b030 a3=0 items=0 ppid=10513 pid=10520 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="master" exe="/usr/libexec/postfix/master" subj=system_u:system_r:postfix_master_t:s0 key=(null) ARCH=x86_64 SYSCALL=fstat AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"

type=AVC msg=audit(1638995374.054:370): avc: denied { lock } for pid=10520 comm="master" path="/var/spool/postfix/pid/master.pid" dev="vda1" ino=14680387 scontext=system_u:system_r:postfix_master_t:s0 tcontext=unconfined_u:object_r:var_run_t:s0 tclass=file permissive=1

type=SYSCALL msg=audit(1638995374.054:370): arch=c000003e syscall=73 success=yes exit=0 a0=b a1=6 a2=7ff4c7d914e0 a3=0 items=0 ppid=10513 pid=10520 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="master" exe="/usr/libexec/postfix/master" subj=system_u:system_r:postfix_master_t:s0 key=(null) ARCH=x86_64 SYSCALL=flock AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"

Without SELinux it works as expected, i.e. 'master -t' detects that the daemon is not running and systemd starts the postfix service despite the existing pid file.

Reassigning to the selinux-policy component.

From the SELinux point of view, this is correct behaviour: the /usr/sbin/postfix command is executed in the caller's domain, which in this case is probably unconfined_t, so no file transitions apply to the pid file. When the postfix service is handled by systemctl start and stop, no problem is expected.
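
The lock-based validity check described in this comment can be sketched roughly as follows. This is an illustration in Python, not Postfix's C code: the function name `pid_file_state` and its return values are hypothetical, and the only assumption taken from the thread is that an exclusive, non-blocking flock on the pid file is used to decide whether a daemon still holds it. The sketch also keeps "lock held by someone else" apart from "not permitted", the two cases that are confused in the failure sequence above.

```python
import errno
import fcntl
import os


def pid_file_state(path):
    """Classify a lock-validated PID file (hypothetical helper, not Postfix code).

    Returns "missing", "stale", "running", or "denied".
    """
    try:
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "denied"
    try:
        # Non-blocking exclusive lock: succeeds only if no other
        # open file description currently holds a lock on the file.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        if exc.errno in (errno.EWOULDBLOCK, errno.EAGAIN):
            return "running"   # someone else holds the lock
        if exc.errno in (errno.EACCES, errno.EPERM):
            return "denied"    # e.g. blocked by an access-control policy
        raise
    else:
        # We got the lock, so nobody owns the file: it is a leftover.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return "stale"
    finally:
        os.close(fd)
```

With a check of this shape, a leftover pid file is harmless as long as the lock attempt itself is permitted; a caller that treats every lock failure as "running" would report a live daemon when the attempt was merely denied.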
First, I confirm that stopping and starting postfix using
    systemctl stop postfix.service
    systemctl start postfix.service
works fine, and the pid file is then handled correctly.

Second, however, I would note that man postfix gives no suggestion that postfix stop and postfix start won't work properly. Given the above, it should contain at least a warning to use the systemctl commands for this purpose. Moreover, ideally postfix stop and postfix start should either fail with an error message telling root to use the systemctl commands, or work properly.

Ok, reassigning back to postfix. After discussion with the SELinux guys, let's first try a workaround in the postfix service file that runs restorecon on the master.pid.

Created attachment 1846159
Proposed fix
Jaroslav,

Thank you for posting this proposed fix.

It does partially work -- i.e. it removes the worst effect of this bug, which is that the postfix system didn't restart after a system reboot if postfix start and postfix stop had been used before. But it doesn't cause the tidy removal of the master.pid file after postfix stop. It may be that this is the effect you intended, and it is certainly a huge improvement on the previous situation, which left an obscure bug not fixed by a reboot.

What I did to test it: I inserted the line
    ExecStartPre=-/usr/sbin/restorecon -R /var/spool/postfix/pid/master.pid
just before the other ExecStartPre lines in
    /usr/lib/systemd/system/postfix.service
- i.e. I didn't recompile the package, just tested the effect of the final change to the system.

I then started postfix.service by
    systemctl stop postfix.service
    systemctl daemon-reload
    systemctl restart postfix.service

and looked at the pid file (ll is my alias for ls -l):
    ll -Z /var/spool/postfix/pid/master.pid
    -rw-------. 1 root root system_u:object_r:postfix_var_run_t:s0 33 Dec 14 09:36 /var/spool/postfix/pid/master.pid

Then I ran postfix stop -- so far so good:
    ll -Z /var/spool/postfix/pid/master.pid
    ls: cannot access '/var/spool/postfix/pid/master.pid': No such file or directory

Then I ran postfix start, followed by postfix stop (i.e. starting and stopping not via systemctl), and got
    ll -Z /var/spool/postfix/pid/master.pid
    -rw-------. 1 root root unconfined_u:object_r:var_run_t:s0 33 Dec 14 09:36 /var/spool/postfix/pid/master.pid
i.e. the file still exists and has the wrong SELinux context, when it should have been removed.

But it does then restart on a system reboot or even just systemctl start postfix.service -- perhaps that was your only intention. I.e. it is a workaround which is certainly a huge improvement (and which I will leave in place), but it is not a clean fix.

Thank you,
Roger.

(In reply to Roger Sewell from comment #10)

Is it a problem that the PID file isn't removed?
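
The context comparison done by eye in the test above (postfix_var_run_t after a systemctl start, var_run_t after a manual postfix start) could be automated along these lines. This is only a sketch: the names `SELinuxContext`, `parse_context`, and `needs_relabel` are hypothetical, and treating postfix_var_run_t as the expected type is an assumption read off the ls -Z output in this thread, not taken from the policy itself.

```python
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    """Fields of a label such as system_u:object_r:postfix_var_run_t:s0."""
    user: str
    role: str
    setype: str
    level: str


def parse_context(label: str) -> SELinuxContext:
    # The level part may itself contain colons (e.g. "s0:c0.c1023"),
    # so split at most three times.
    user, role, setype, level = label.split(":", 3)
    return SELinuxContext(user, role, setype, level)


def needs_relabel(actual_label: str, expected_type: str) -> bool:
    """True when the file's type differs from the expected one."""
    return parse_context(actual_label).setype != expected_type


# Labels copied from the ls -Z output in this comment.
after_systemctl = "system_u:object_r:postfix_var_run_t:s0"
after_manual_start = "unconfined_u:object_r:var_run_t:s0"

# Expected type is an assumption based on the systemctl-started case.
print(needs_relabel(after_systemctl, "postfix_var_run_t"))
print(needs_relabel(after_manual_start, "postfix_var_run_t"))
```

Only the type field is compared here, since that is the part that differs between the two listings and the part the lock denials refer to.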
The problem is that there are two different approaches here:
1) postfix upstream: they do not delete the PID file, but check its validity through locking
2) systemd: they delete the invalid PID file

It's usually not a good approach to mix controlling services through the systemd and non-systemd interfaces, and I think it's also unsupported in RHEL. For 1) we would have to patch postfix to delete the PID file, which would be a divergence from the upstream behaviour, which we are trying to avoid. Maybe an upstream discussion could be opened.

Jaroslav,

No, it isn't a problem for me that the PID file isn't removed - it just didn't seem a clean solution. You have explained your dilemma, which I understand - I'm therefore happy with your fix.

Thank you for the explanation - I hadn't realised about the two different upstreams.

Roger.

Switching the component to postfix, as that seems to be where the work continues now, but I will keep monitoring it in case selinux-policy needs updating, too.

Cloning to RHEL-9.

RHEL-9 lightweight clone bug 2055915.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (postfix bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:2091

Just hit this bug, postfix-3.5.8-4.el8.x86_6 on 8.7.

Not sure if this matters, but I did "systemctl enable --now postfix" instead of just "systemctl start postfix".

(In reply to FS3000 from comment #24)

Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=2162659#c3.