Bug 740895

Summary:	mailx doesn't return exitcode != 0 on the error
Product:	[Fedora] Fedora	Reporter:	Denys Vlasenko <dvlasenk>
Component:	exim	Assignee:	David Woodhouse <dwmw2>
Status:	CLOSED WONTFIX	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	15	CC:	dmitry, dwmw2, jskarvad, mlichvar, pschiffe
Target Milestone:	---	Keywords:	Reopened
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-08-07 19:50:52 UTC	Type:	---

Description Denys Vlasenko 2011-09-23 17:06:46 UTC

While debugging my application, I ran:

echo "testdata" | /bin/mailx -s Test -r root@localhost root@localhost; echo $?

which returns exit code zero.

Then I straced mailx and found out that it spawns /usr/sbin/sendmail as a child, *and the parent exits*!

Which means that, regardless of what the child does, the exit code is always 0.

This is wrong. Mailx should let its caller know whether it was successful.

Here is the strace log fragment:

...
10292 18:33:12.826127 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb77d6798) = 10293
10292 18:33:12.826370 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10292 18:33:12.826520 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10292 18:33:12.826621 _llseek(5, 0, [0], SEEK_SET) = 0
10292 18:33:12.826696 close(5)          = 0
10292 18:33:12.826764 munmap(0xb77e7000, 4096) = 0
10292 18:33:12.827001 exit_group(0)     = ?

10293 18:33:12.826349 set_robust_list(0xb77d67a0, 0xc) = 0
10293 18:33:12.826501 munmap(0xb77e9000, 4096) = 0
10293 18:33:12.826579 open("/dev/null", O_RDONLY) = 3
10293 18:33:12.826678 dup3(3, 0, 0)     = 0
10293 18:33:12.826747 close(3)          = 0
10293 18:33:12.826816 dup2(5, 0)        = 0
10293 18:33:12.827021 rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTART}, {SIG_DFL, [], SA_RESTART}, 8) = 0
10293 18:33:12.827224 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTART}, {SIG_DFL, [], SA_RESTART}, 8) = 0
10293 18:33:12.827347 rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
10293 18:33:12.827468 rt_sigaction(SIGTSTP, {SIG_IGN, [], SA_RESTART}, {SIG_DFL, [], SA_RESTART}, 8) = 0
10293 18:33:12.827602 rt_sigaction(SIGTTIN, {SIG_IGN, [], SA_RESTART}, {SIG_DFL, [], SA_RESTART}, 8) = 0
10293 18:33:12.827733 rt_sigaction(SIGTTOU, {SIG_IGN, [], SA_RESTART}, {SIG_DFL, [], SA_RESTART}, 8) = 0
10293 18:33:12.827890 rt_sigprocmask(SIG_UNBLOCK, ~[RTMIN RT_1], NULL, 8) = 0
10293 18:33:12.827999 execve("/usr/sbin/sendmail", ["send-mail", "-i", "-r", "root@localhost", "root@localhost"], [/* 51 vars */]) = 0
...


Apart from the correctness problem with the exit code, I don't see why mailx needs to create a *child*. Can't it just exec /usr/sbin/sendmail, without forking?
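The difference between the traced behavior and the requested one can be illustrated with a minimal sketch. This is not mailx's code; the function names `spawn_and_forget` and `spawn_and_wait` are hypothetical, and the failing child is a stand-in Python one-liner rather than sendmail.

```python
import os
import sys

def _exec_child(argv):
    # Runs in the child only: replace the process image, or exit 127 if exec fails.
    try:
        os.execvp(argv[0], argv)
    except OSError:
        pass
    os._exit(127)

def spawn_and_forget(argv):
    """Mimic the strace log: fork, exec in the child, parent returns at once."""
    if os.fork() == 0:
        _exec_child(argv)
    return 0  # the parent reports success no matter what the child does

def spawn_and_wait(argv):
    """Fork, exec in the child, then wait and propagate the child's exit code."""
    pid = os.fork()
    if pid == 0:
        _exec_child(argv)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)  # shell-style code for death by signal

# Stand-in for a failing mailer: a child that exits with code 3.
FAILING_CHILD = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

With `FAILING_CHILD`, the first function still reports 0 while the second reports the child's actual exit code, which is the distinction the report is about.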

Comment 1 Dmitry Butskoy 2011-09-23 17:22:47 UTC

Forwarded to upstream.

You can contact the author as well, at http://heirloom.sourceforge.net/mailx.html

Comment 2 Dmitry Butskoy 2011-09-26 12:54:49 UTC

To get a meaningful exit code, use the "sendwait" option.
See mail(1) for more info.

Comment 3 Denys Vlasenko 2011-09-29 17:19:32 UTC

Thanks! Added "-S sendwait"...

and I see that mailx now waits for the child to finish, but the child (sendmail) itself does the same thing: forks yet another child, and *doesn't wait* for it!

On F15, /usr/sbin/sendmail is a symlink to exim.

Reopening and reassigning to exim.



Bug description for exim:

When sendmail (which is symlinked to exim on my F15 machine) is invoked like this:

sendmail -i -r root@localhost root@localhost <email.txt

it exits with exit code 0 despite the fact that delivery to root@localhost is prohibited and thus fails. This means that the user is left with the false impression that delivery was successful, even though in this case we know for sure it wasn't!

In the strace log I see that it forks a child, and the parent exits at once:

...
19209 18:56:39.949400 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7843798) = 19210
19209 18:56:39.950899 exit_group(0)     = ?
^^^^^^^^^^^^^^^^^^^^^^^^^^^
parent exited

19210 18:56:39.950013 set_robust_list(0xb78437a0, 0xc) = 0
19210 18:56:39.950213 close(0)          = 0
19210 18:56:39.950326 close(1)          = 0
19210 18:56:39.950437 close(2)          = 0
19210 18:56:39.950695 setsid()          = 19210
19210 18:56:39.950849 fstat64(0, 0xbf86ecc0) = -1 EBADF (Bad file descriptor)
19210 18:56:39.951524 open("/dev/null", O_RDWR|O_LARGEFILE) = 0
19210 18:56:39.951757 fstat64(1, 0xbf86ecc0) = -1 EBADF (Bad file descriptor)
19210 18:56:39.952028 dup2(0, 1)        = 1
19210 18:56:39.952246 fstat64(2, 0xbf86ecc0) = -1 EBADF (Bad file descriptor)
19210 18:56:39.952477 dup2(0, 2)        = 2
19210 18:56:39.952641 geteuid32()       = 93
19210 18:56:39.952830 fstat64(0, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
19210 18:56:39.953091 fstat64(1, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
19210 18:56:39.953343 fstat64(2, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
19210 18:56:39.953910 execve("/usr/sbin/exim", ["/usr/sbin/exim", "-Mc", "1R9JuZ-0004zp-LW"], [/* 52 vars */]) = 0


Can you wait for the child to finish and report its exit code?
Why do you fork a child at all?

Comment 4 David Woodhouse 2011-09-29 21:22:10 UTC

We need to fork a child because we drop root privs whenever we can, and will re-exec in order to gain them again when it becomes apparent that a local delivery is necessary.

What if /root/.forward exists, and expands to multiple remote destinations? And some of them use greylisting? You want to wait until all the deliveries are complete?

That isn't how mail works. You send an email, and if it doesn't get through then you get a bounce telling you so. The fact that you managed to submit your mail to the local mailer tells you *nothing*.

In fact, although a capable MTA like Exim is perfectly capable of doing a lot of vetting at SMTP time and rejecting messages as they're being submitted, it's usually configured *not* to do that, since so many MUAs can't cope with it. The sanest thing to do for an *authenticated* connection is accept-and-bounce.

(Note: this is in stark contrast to SMTP incoming from the network, where you SHOULD NOT accept-and-bounce)
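Under the accept-and-bounce model described in this comment, a calling application can only claim that a message was handed to the local mailer, not that it was delivered. A minimal caller-side sketch, assuming a hypothetical helper `submit_message` and stand-in commands in place of a real sendmail invocation:

```python
import subprocess
import sys

def submit_message(mta_argv, message):
    """Pipe `message` to a submission command and describe the outcome honestly.

    `mta_argv` is whatever command the caller uses to submit mail; nothing here
    is specific to Exim or mailx.
    """
    result = subprocess.run(mta_argv, input=message.encode(), check=False)
    if result.returncode != 0:
        return f"submission failed (exit code {result.returncode})"
    # Exit code 0 only means the local mailer took the message.
    return "accepted by the local mailer; delivery is not confirmed"

# Stand-ins for a mailer that accepts or refuses a submission (assumptions).
ACCEPTING_STUB = [sys.executable, "-c", "import sys; sys.stdin.read()"]
REFUSING_STUB = [sys.executable, "-c", "import sys; sys.stdin.read(); sys.exit(1)"]
```

The point of the wording is that a zero exit code supports "accepted", while "sent to" or "delivered" would claim more than the caller knows.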

Comment 5 Denys Vlasenko 2011-09-30 17:11:36 UTC

(In reply to comment #4)
> What if /root/.forward exists, and expands to multiple remote destinations? And
> some of them use greylisting? You want to wait until all the deliveries are
> complete?

I don't ask for "if exit code is 0, then delivery was definitely successful" behavior.

I am asking for "if there is an easily detectable fatal delivery problem, exit with a nonzero exit code". This is the case for attempts to deliver to root@localhost in a standard F15 installation.

I hope you see that these two requests are not equivalent. For one, "if exit code is 0, then delivery was definitely successful" requires confirmation of delivery success across the network - clearly, not a reasonable thing to implement.

However, when exim determines in microseconds, by checking only local configuration files, that delivery is prohibited, then it is not that hard to let the caller know that. Do you disagree?
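The requested behavior can be sketched as a cheap synchronous check that runs before anything is handed to a background process. This is only an illustration of the request, not how Exim works; `submission_exit_code` and the `locally_rejected` set are hypothetical.

```python
def submission_exit_code(recipient, locally_rejected):
    """Return the exit code a submission front end could report to its caller.

    `locally_rejected` stands in for whatever local configuration says about
    addresses that can never be delivered to (an assumption for this sketch).
    """
    if recipient in locally_rejected:
        # Known from local configuration alone: fail immediately.
        return 1
    # Everything else is queued; later failures are reported by bounce,
    # so 0 here means "accepted", not "delivered".
    return 0
```

For example, `submission_exit_code("root@localhost", {"root@localhost"})` yields a nonzero code, while any address outside the set yields 0 without implying successful delivery.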

With current behavior, I was motivated to create this bz when I saw the following:

    Sending an email...
    Email was sent to: root@localhost

which, as it turned out, meant only "'mailx -r root@localhost root@localhost' exited with exit code 0, we have no idea whether local root user got your mail or not. Most likely he didn't, as standard F15 exim config wouldn't allow it" - quite an appalling level of stupidity for the software distributed in the year 2011, I think...

Comment 6 Fedora End Of Life 2012-08-07 19:50:54 UTC

This message is a notice that Fedora 15 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 15. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. At this time, all open bugs with a Fedora 'version'
of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advance warning of this
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we were unable to fix it before Fedora 15 reached end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to click on
"Clone This Bug" (top right of this page) and open it against that
version of Fedora.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping