Bug 1275078

Summary:	postfix mbox default size of 50MB causes mail loss without clear logging
Product:	Red Hat Enterprise Linux 7	Reporter:	Paul Wouters <pwouters>
Component:	postfix	Assignee:	Jaroslav Škarvada <jskarvad>
Status:	CLOSED WONTFIX	QA Contact:	qe-baseos-daemons
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	7.3	CC:	extras-qa, jskarvad, thozza
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1275076	Environment:
Last Closed:	2019-12-06 16:09:28 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1275076
Bug Blocks:	1400961, 1472751

Description Paul Wouters 2015-10-25 15:43:07 UTC

+++ This bug was initially created as a clone of Bug #1275076 +++

Description of problem:

When using mbox, postfix by default limits files in /var/mail to 50MB. procmail, as the LDA, is prevented from writing and returns EX_CANTCREAT, which causes the mail to fail with a permanent 5.x.x error. The mail is lost and delivery is never retried. This all happened while my /var/mail had 300GB of free disk space.

Either the default maximum should be significantly increased (e.g. to 500MB), or it should take into account the number of users and the amount of free spool disk space, or it should be disabled completely by default. The latter can be done using mailbox_size_limit=0.
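As a rough illustration of the arithmetic involved, the following shell sketch checks whether appending a message would push an mbox file past the limit. The 51200000-byte figure is the documented Postfix default for mailbox_size_limit; the helper name fits_in_mailbox and the sample sizes are made up for this example, and the real enforcement happens inside Postfix, not in a script like this.

```shell
#!/bin/sh
# Documented Postfix default for mailbox_size_limit, in bytes (about 50MB).
DEFAULT_MAILBOX_SIZE_LIMIT=51200000

# fits_in_mailbox LIMIT CURRENT_SIZE MESSAGE_SIZE  (hypothetical helper)
# Exit status 0 if the mailbox may grow by MESSAGE_SIZE bytes, 1 otherwise.
# A LIMIT of 0 means "no limit", matching mailbox_size_limit=0.
fits_in_mailbox() {
    if [ "$1" -eq 0 ]; then
        return 0
    fi
    [ $(( $2 + $3 )) -le "$1" ]
}

# A 51199000-byte mailbox cannot take a 2000-byte message under the default...
if fits_in_mailbox "$DEFAULT_MAILBOX_SIZE_LIMIT" 51199000 2000; then
    echo "delivery fits"
else
    echo "delivery would exceed the limit"
fi

# ...but can once the limit is disabled.
fits_in_mailbox 0 51199000 2000 && echo "no limit: delivery fits"
```

On a real system, `postconf mailbox_size_limit` prints the active value, and `postconf -e mailbox_size_limit=0` followed by `postfix reload` changes it.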

This bug took me 6 months to hunt down because of the poor logging between procmail and postfix, which gave no clue as to why procmail dies with "can't create user output file."

I've filed a bug for procmail to change its return value to a temporary failure code; see #1275071 and #1275072.
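To make the difference between the two exit codes concrete, here is a small shell sketch of how a delivery agent's exit status could be sorted into outcomes. The numeric values come from sysexits.h; the helper name classify_lda_exit and the rule that only EX_TEMPFAIL is retried are simplifying assumptions for illustration, not a description of Postfix's actual mapping.

```shell
#!/bin/sh
# Exit codes from <sysexits.h>.
EX_CANTCREAT=73   # can't create (user) output file
EX_TEMPFAIL=75    # temporary failure, retry later

# classify_lda_exit CODE  (hypothetical helper)
# Simplifying assumption: only EX_TEMPFAIL is treated as retryable;
# every other non-zero status is treated as a permanent failure.
classify_lda_exit() {
    case "$1" in
        0)              echo "delivered" ;;
        "$EX_TEMPFAIL") echo "deferred" ;;
        *)              echo "bounced" ;;
    esac
}

classify_lda_exit "$EX_CANTCREAT"   # prints "bounced"
classify_lda_exit "$EX_TEMPFAIL"    # prints "deferred"
```

Under this model, a full mailbox reported as EX_CANTCREAT is bounced at once, whereas the same condition reported as EX_TEMPFAIL would stay in the queue for a later retry.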

At a minimum, I think we need to bump the 50MB limit significantly to avoid breaking simple, small postfix installs with a handful of users and a modern 1TB disk.

Comment 3 Jaroslav Škarvada 2016-10-25 12:51:28 UTC

Regarding the logging, I agree that it should be improved, but I think we shouldn't diverge from the upstream defaults, especially for a stable product.

I also think that 'mailbox limit reached' is a permanent error, not a transient one (like a DNS failure, connectivity error, etc.). It's a fatal error that will not go away without intervention by the admin or mailbox owner on the SMTP server. There are various opinions about it, but according to RFC 2821:

"It is difficult to assign a meaning to "transient" when two different sites (receiver- and sender-SMTP agents) must agree on the interpretation.  Each reply
in this category might have a different time value, but the SMTP
client is encouraged to try again.  A rule of thumb to determine
whether a reply fits into the 4yz (i.e. transient) or the 5yz (i.e. permanent) category (see below)
is that replies are 4yz if they can be successful if repeated
without any change in command form or in properties of the sender
or receiver (that is, the command is repeated identically and the
receiver does not put up a new implementation.)"

(I added the transient/permanent explanatory notes.) I think the important part is "change in command form or in properties of the sender or receiver": increasing the mailbox limit or deleting the mailbox's content changes the properties of the receiver. Moreover, most of the mailbox-full or similar errors I have ever seen on the internet were 5yz, i.e. permanent. Also, I don't think it's a good idea to change this for a stable product.
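For readers less familiar with the 4yz/5yz split, a minimal shell sketch of classifying an SMTP reply code by its first digit follows. The helper name reply_class is hypothetical; 452 (insufficient system storage) and 552 (exceeded storage allocation) are the two storage-related codes from RFC 2821 used as sample inputs.

```shell
#!/bin/sh
# reply_class CODE  (hypothetical helper)
# Classifies a three-digit SMTP reply code by its first digit.
reply_class() {
    case "$1" in
        2[0-9][0-9]) echo "success" ;;
        3[0-9][0-9]) echo "intermediate" ;;
        4[0-9][0-9]) echo "transient" ;;
        5[0-9][0-9]) echo "permanent" ;;
        *)           echo "unknown" ;;
    esac
}

reply_class 452   # prints "transient"
reply_class 552   # prints "permanent"
```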

Comment 4 Paul Wouters 2016-10-25 15:11:01 UTC

The important thing is that we solve the problem of mail being lost. Regardless of where in the mail processing stack we do that, the end result should be that if the mailbox is full, the sender should receive an error back so it knows the mail has not been delivered.

I had to deal with this issue on and off for two years before figuring out what was going on and why some people often had their mail lost when sending it to me. They just happened to send me a lot of PDFs, which would cause the incoming project folder to hit the maximum.

Comment 5 Jaroslav Škarvada 2016-10-25 16:15:33 UTC

(In reply to Paul Wouters from comment #4)
> The important thing is that we solve the problem of mail being lost.
> Regardless of where in the mail processing stack we do that, the end result
> should be that if the mailbox is full, the sender should receive an error
> back so it knows the mail has not been delivered.
> 
> I had to deal with this issue on and off for two years before figuring out
> what was going on and why some people often had their mail lost when sending
> it to me. They just happened to send me PDFs a lot which would trigger the
> incoming project folder to hit the maximum

I will take a look, this may be a bug. It should bounce something sensible if the message is undeliverable.

Comment 8 Jaroslav Škarvada 2018-11-16 15:55:31 UTC

Sorry for the delay. I tried to reproduce the problem, but it works for me:
# rpm -q postfix
postfix-2.10.1-6.el7.x86_64
# rpm -q procmail
procmail-3.22-36.el7_4.1.x86_64
# postconf mailbox_command=/usr/bin/procmail
# systemctl restart postfix
# ll /var/spool/mail/testmail
-rw-rw----. 1 testmail mail 100000000 Nov 16 16:41 /var/spool/mail/testmail
# ll /var/spool/mail/yarda
-rw-rw----. 1 yarda mail 3115 Nov 16 16:41 /var/spool/mail/yarda
# telnet localhost 25
Trying ::1...
Connected to localhost.
Escape character is '^]'.
220 <HOST> ESMTP Postfix
ehlo localhost
250-<HOST>
250-PIPELINING
250-SIZE 10240000
250-VRFY
250-ETRN
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 DSN
mail from: yarda
250 2.1.0 Ok
rcpt to: testmail
250 2.1.5 Ok
data
354 End data with <CR><LF>.<CR><LF>
hi
.
250 2.0.0 Ok: queued as 41F54235679F
quit
221 2.0.0 Bye
Connection closed by foreign host.
# cat /var/spool/mail/yarda
...
Original-Recipient: rfc822;testmail
Action: failed
Status: 5.2.0
Diagnostic-Code: x-unix; procmail: Error while writing to
    "/var/spool/mail/testmail"
...

So it seems to work: it bounced back a 5.2.0 error. Well, 5.2.2 would be nicer, but 5.2.0 is also OK.
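As a small aid for checking bounces like the one above, this shell sketch pulls the Status field out of a delivery status notification and describes it. The helper names dsn_status and describe_status are hypothetical, the sample text is copied from the bounce shown in this comment, and the descriptions cover only the two codes mentioned here, as defined in RFC 3463.

```shell
#!/bin/sh
# dsn_status  (hypothetical helper)
# Reads a delivery status notification on stdin and prints the value of
# the first "Status:" field.
dsn_status() {
    sed -n 's/^Status:[[:space:]]*//p' | head -n 1
}

# describe_status CODE  (hypothetical helper)
# Covers only the two codes discussed here, per RFC 3463.
describe_status() {
    case "$1" in
        5.2.0) echo "other or undefined mailbox status" ;;
        5.2.2) echo "mailbox full" ;;
        *)     echo "not covered by this sketch" ;;
    esac
}

status=$(dsn_status <<'EOF'
Original-Recipient: rfc822;testmail
Action: failed
Status: 5.2.0
Diagnostic-Code: x-unix; procmail: Error while writing to
    "/var/spool/mail/testmail"
EOF
)
echo "$status: $(describe_status "$status")"
```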

Btw, regarding the 4/5 error code, sendmail also returns 5 for this type of error.

Comment 9 Tomáš Hozza 2019-12-06 16:09:22 UTC

Red Hat Enterprise Linux version 7 entered the Maintenance Support 1 Phase in August 2019. In this phase only qualified Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released as they become available. Other errata advisories may be delivered as appropriate.

This bug has been reviewed by Support and Engineering representatives and does not meet the inclusion criteria for the Maintenance Support 1 Phase. If this issue still exists in a newer major version of Red Hat Enterprise Linux, it has been cloned there and work will continue in the cloned bug.

For more information about Red Hat Enterprise Linux Lifecycle, please see https://access.redhat.com/support/policy/updates/errata/

Comment 10 RHEL Program Management 2019-12-06 16:09:28 UTC

Development Management has reviewed and declined this request. You may appeal this decision by using your Red Hat support channels, who will make certain the issue receives the proper prioritization with product and development management.

https://www.redhat.com/support/process/production/#howto