1275078 – postfix mbox default size of 50MB causes mail loss without clear logging

Bug 1275078 - postfix mbox default size of 50MB causes mail loss without clear logging

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	postfix
Sub Component:
Version:	7.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Jaroslav Škarvada
QA Contact:	qe-baseos-daemons
Docs Contact:
URL:
Whiteboard:
Depends On:	1275076
Blocks:	1400961 1472751

Reported:	2015-10-25 15:43 UTC by Paul Wouters
Modified:	2019-12-06 16:09 UTC (History)
CC List:	3 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1275076
Environment:
Last Closed:	2019-12-06 16:09:28 UTC
Target Upstream Version:
Embargoed:

Description Paul Wouters 2015-10-25 15:43:07 UTC

+++ This bug was initially created as a clone of Bug #1275076 +++

Description of problem:

When using mbox, postfix by default limits files in /var/mail to 50MB. procmail, as the LDA, is prevented from writing and returns EX_CANTCREAT, which causes the mail to fail with a permanent 5.x.x error. It is lost and never retried for delivery. This all happened while my /var/mail had 300G of free disk space.

Either the default maximum should be significantly increased (e.g. to 500MB), or it should take into account the number of users and the amount of free spool disk space, or it should be disabled completely by default. The latter can be done using mailbox_size_limit=0
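
For checking a spool by hand, here is a rough sketch in shell. The helper name mbox_would_exceed is hypothetical (it is not a postfix tool), and it assumes the limit is a plain byte count where 0 means "no limit", as with mailbox_size_limit=0. The commented postconf lines are the usual way to inspect the configured and built-in values.

```shell
#!/bin/sh
# Inspect the limit on a real system (needs postfix installed):
#   postconf mailbox_size_limit       # currently configured value
#   postconf -d mailbox_size_limit    # built-in default
#   postconf -e mailbox_size_limit=0  # disable the limit

# Hypothetical helper: succeed (exit 0) if appending MSG_BYTES to FILE
# would push it past LIMIT bytes. A LIMIT of 0 is treated as unlimited.
mbox_would_exceed() {
    file=$1
    msg_bytes=$2
    limit=$3
    # 0 disables the check, so nothing can exceed it.
    [ "$limit" -eq 0 ] && return 1
    # Current mailbox size in bytes; bail out if the file is unreadable.
    size=$(wc -c < "$file") || return 2
    [ $((size + msg_bytes)) -gt "$limit" ]
}
```

Example use: mbox_would_exceed /var/mail/someuser 2000000 51200000 && echo "delivery would fail". The path and sizes here are illustrative only.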

This bug took me 6 months to hunt down because of the poor logging between procmail and postfix, which gave no clue as to why procmail dies with "can't create user output file."

I've filed bugs for procmail to change its return value to a temporary failure code; see #1275071 and #1275072
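
To make the exit-code distinction concrete, here is a minimal sketch. The function name lda_exit_class is hypothetical; the numeric values come from <sysexits.h>. It only encodes what is described above (EX_CANTCREAT ends in a permanent failure) plus the conventional meaning of EX_TEMPFAIL as "try again later". All other codes are deliberately left unclassified rather than guessed.

```shell
#!/bin/sh
# Values from <sysexits.h>.
EX_CANTCREAT=73   # can't create (user) output file
EX_TEMPFAIL=75    # temporary failure, caller is invited to retry

# Hypothetical helper: classify an LDA exit status the way this report
# describes the outcome. Not postfix's actual mapping table.
lda_exit_class() {
    case $1 in
        0)               echo "delivered" ;;
        "$EX_TEMPFAIL")  echo "temporary" ;;   # mail stays queued, retried later
        "$EX_CANTCREAT") echo "permanent" ;;   # mail bounces, never retried
        *)               echo "other" ;;       # not covered by this sketch
    esac
}
```

With this, lda_exit_class 73 prints "permanent" and lda_exit_class 75 prints "temporary", which is the change in behaviour being requested from procmail.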

At a minimum, I think we need to bump the 50MB limit significantly to avoid breaking simple small postfix installs with a handful of users and a modern 1TB disk.

Comment 3 Jaroslav Škarvada 2016-10-25 12:51:28 UTC

Regarding the logging, I agree that it should be improved, but I think we shouldn't diverge from the upstream defaults, especially for a stable product.

I also think that 'mailbox limit reached' is a permanent error, not a transient one (like a DNS failure, connectivity error, etc.). It's a fatal error that will not go away without intervention by the admin or mailbox owner on the SMTP server. There are various opinions about it, but according to RFC 2821:

"It is difficult to assign a meaning to "transient" when two different
sites (receiver- and sender-SMTP agents) must agree on the
interpretation.  Each reply in this category might have a different
time value, but the SMTP client is encouraged to try again.  A rule of
thumb to determine whether a reply fits into the 4yz (i.e. transient)
or the 5yz (i.e. permanent) category (see below) is that replies are
4yz if they can be successful if repeated without any change in
command form or in properties of the sender or receiver (that is, the
command is repeated identically and the receiver does not put up a new
implementation.)"

(I added the transient/permanent explanatory notes.) I think the important part is "change in command form or in properties of the sender or receiver": increasing the mailbox size or deleting its content changes the properties of the receiver. Moreover, most of the "mailbox full" or similar errors I have seen on the internet were 5yz, i.e. permanent. Also, I don't think it's a good idea to change this for a stable product.
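
The 4yz/5yz rule of thumb can be sketched in shell as a classification by the first digit of the reply code. The function name smtp_reply_class is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: classify an SMTP reply code by its first digit.
smtp_reply_class() {
    case $1 in
        2*) echo "success" ;;        # positive completion
        3*) echo "intermediate" ;;   # positive intermediate, e.g. 354
        4*) echo "transient" ;;      # client is encouraged to try again
        5*) echo "permanent" ;;      # retrying unchanged will not help
        *)  echo "unknown" ;;
    esac
}
```

For example, smtp_reply_class 452 prints "transient" and smtp_reply_class 552 prints "permanent". Which of the two a full mailbox should produce is exactly the point under discussion here.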

Comment 4 Paul Wouters 2016-10-25 15:11:01 UTC

The important thing is that we solve the problem of mail being lost. Regardless of where in the mail processing stack we do that, the end result should be that if the mailbox is full, the sender should receive an error back so it knows the mail has not been delivered.

I had to deal with this issue on and off for two years before figuring out what was going on and why some people often had their mail lost when sending it to me. They just happened to send me PDFs a lot, which would trigger the incoming project folder to hit the maximum.

Comment 5 Jaroslav Škarvada 2016-10-25 16:15:33 UTC

(In reply to Paul Wouters from comment #4)
> The important thing is that we solve the problem of mail being lost.
> Regardless of where in the mail processing stack we do that, the end result
> should be that if the mailbox is full, the sender should receive an error
> back so it knows the mail has not been delivered.
> 
> I had to deal with this issue on and off for two years before figuring out
> what was going on and why some people often had their mail lost when sending
> it to me. They just happened to send me PDFs a lot which would trigger the
> incoming project folder to hit the maximum

I will take a look; this may be a bug. It should bounce something sensible if the message is undeliverable.

Comment 8 Jaroslav Škarvada 2018-11-16 15:55:31 UTC

Sorry for the delay. I tried to reproduce the problem, but it works for me:
# rpm -q postfix
postfix-2.10.1-6.el7.x86_64
# rpm -q procmail
procmail-3.22-36.el7_4.1.x86_64
# postconf mailbox_command=/usr/bin/procmail
# systemctl restart postfix
# ll /var/spool/mail/testmail
-rw-rw----. 1 testmail mail 100000000 Nov 16 16:41 /var/spool/mail/testmail
# ll /var/spool/mail/yarda
-rw-rw----. 1 yarda mail 3115 Nov 16 16:41 /var/spool/mail/yarda
# telnet localhost 25
Trying ::1...
Connected to localhost.
Escape character is '^]'.
220 <HOST> ESMTP Postfix
ehlo localhost
250-<HOST>
250-PIPELINING
250-SIZE 10240000
250-VRFY
250-ETRN
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 DSN
mail from: yarda
250 2.1.0 Ok
rcpt to: testmail
250 2.1.5 Ok
data
354 End data with <CR><LF>.<CR><LF>
hi
.
250 2.0.0 Ok: queued as 41F54235679F
quit
221 2.0.0 Bye
Connection closed by foreign host.
# cat /var/spool/mail/yarda
...
Original-Recipient: rfc822;testmail
Action: failed
Status: 5.2.0
Diagnostic-Code: x-unix; procmail: Error while writing to
    "/var/spool/mail/testmail"
...

So it seems to work: it bounced back a 5.2.0 error. A 5.2.2 would be nicer, but 5.2.0 is also OK.
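
To pull the enhanced status code out of a bounce like the one above without reading the whole mailbox, a small sketch follows. The function name dsn_status is hypothetical. It assumes the bounce carries a "Status:" field in class.subject.detail form as shown above, and it skips mbox-style "Status: RO" headers because they do not match that form.

```shell
#!/bin/sh
# Hypothetical helper: print the first enhanced status code (x.y.z)
# found in a "Status:" field on standard input.
dsn_status() {
    sed -n 's/^Status:[[:space:]]*\([245]\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' |
        head -n 1
}
```

Example use: dsn_status < /var/spool/mail/yarda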

Btw, regarding the 4/5 error code, sendmail also returns 5 for this type of error.

Comment 9 Tomáš Hozza 2019-12-06 16:09:22 UTC

Red Hat Enterprise Linux version 7 entered the Maintenance Support 1 Phase in August 2019. In this phase only qualified Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released as they become available. Other errata advisories may be delivered as appropriate.

This bug has been reviewed by Support and Engineering representatives and does not meet the inclusion criteria for Maintenance Support 1 Phase. If this issue still exists in a newer major version of Red Hat Enterprise Linux, it has been cloned there and work will continue in the cloned bug.

For more information about Red Hat Enterprise Linux Lifecycle, please see https://access.redhat.com/support/policy/updates/errata/

Comment 10 RHEL Program Management 2019-12-06 16:09:28 UTC

Development Management has reviewed and declined this request. You may appeal this decision by using your Red Hat support channels, who will make certain the issue receives the proper prioritization with product and development management.

https://www.redhat.com/support/process/production/#howto
