Bug 1421736

Summary: SEGV in pdkim_finish_bodyhash, pdkim.c:817
Product: [Fedora] Fedora EPEL Reporter: Michael Weisbach <mwei-redhatbugzilla-2017>
Component: eximAssignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: el6CC: arpad.kunszt, chref, dwmw2, fschwarz, jeharris, jskarvad, tremble
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: FreeBSD   
URL: https://bugs.exim.org/show_bug.cgi?id=2029
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-30 15:02:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
add debian's patch to EPEL
none
Add test for sig->bodyhash.data=NULL before access none

Description Michael Weisbach 2017-02-13 14:52:37 UTC
Description of problem:
Program terminated with signal 11, Segmentation fault.

Version-Release number of selected component (if applicable):
exim-4.88-2.el6.x86_64
exim-debuginfo-4.88-2.el6.x86_64
exim-doc-4.69-4.el6.noarch
exim-greylist-4.88-2.el6.x86_64
on CentOS 6.8 (x86_64) + EPEL w/ latest updates / fixes applied.

How reproducible:
After upgrade to 4.88-2 I observing SEGVs a lot.

Steps to Reproduce:
unknown, but seems DKIM related

Actual results:
(gdb) output as follows
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `/usr/sbin/exim -bd -q1h'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f6367f7c175 in pdkim_finish_bodyhash (ctx=0x7f636896abd0, return_signatures=0x7f63681c8798) at pdkim.c:817
817	    if (memcmp(bh.data, sig->bodyhash.data, bh.len) == 0)

Expected results:
No SEGVs :-)

Additional info:
(gdb) backtrace 
#0  0x00007f6367f7c175 in pdkim_finish_bodyhash (ctx=0x7f636896abd0, return_signatures=0x7f63681c8798) at pdkim.c:817
#1  pdkim_feed_finish (ctx=0x7f636896abd0, return_signatures=0x7f63681c8798) at pdkim.c:1332
#2  0x00007f6367eff085 in dkim_exim_verify_finish () at dkim.c:146
#3  0x00007f6367f20157 in receive_msg (extract_recip=0) at receive.c:3272
#4  0x00007f6367ed11e1 in handle_smtp_call () at daemon.c:509
#5  daemon_go () at daemon.c:2040
#6  0x00007f6367ee8ffa in main (argc=3, cargv=0x7ffd88488f18) at exim.c:4808

Comment 1 Felix Schwarz 2017-02-13 15:08:20 UTC
Thank your for reporting this I didn't get to do that even though I'm seeing this too on CentOS 7 (with exim 4.88 as well).

It seems as if this is related to specific messages as I'm seeing this only in some exim instances (and sometimes the instances stop crashing with might be attributed to the sending mailserver giving up).

Comment 2 Michael Weisbach 2017-02-13 16:10:52 UTC
812	
813	  /* VERIFICATION --------------------------------------------------------- */
814	  else
815	    {
816	    /* Compare bodyhash */
817	    if (memcmp(bh.data, sig->bodyhash.data, bh.len) == 0)
818	      {
819	      DEBUG(D_acl) debug_printf("PDKIM [%s] Body hash verified OK\n", sig->domain);
820	      }
821	    else

(gdb) print sig
$1 = (pdkim_signature *) 0x7fb944a3fc60

(gdb) print sig->bodyhash
$2 = {data = 0x0, len = 0}

... looks like sig->bodyhash.data = NULL. Maybe we need some more safety check before memcmp :-) and use of sig->bodyhash.data.

Comment 3 Christof Efkemann 2017-02-13 16:13:12 UTC
I guess the problem is already known (and fixed) upstream:
https://bugs.exim.org/show_bug.cgi?id=2029

Comment 4 Michael Weisbach 2017-02-13 16:28:14 UTC
Hmmm, looks like Jaroslav applied a DKIM related patch to 4.88-3 for el7,

https://src.fedoraproject.org/cgit/rpms/exim.git/diff/exim-4.88-DKIM-fix.patch?id=5ddc95e67a111174d0b8f4b969af1b9ec672495e

Can someone trigger a build of 4.88-3 for el6 as well please, so we can see whether the issue still persist after this DKIM patch was applied in el6 as well?

Thanks!

Comment 5 Christof Efkemann 2017-02-13 16:35:26 UTC
I don't think a rebuild of 4.88-3 will help.
The DKIM patch in 4.88-3 is older than the patch mentioned in Exim Bugzilla issue 2029, and it doesn't contain anything related to bodyhash.data.

Also, I observed a lot of segmentation faults with 4.88-3 on el7 (I wasn't able to check, but I guess it's the same problem that you have on el6).

Comment 6 Michael Weisbach 2017-02-13 16:47:43 UTC
Christof, hmm, yes - agree. I read "DKIM patch" too and couldn't find "bodyhash.data" related changes. So we need to incorporate 2029'fix and build new exim packages for el6 + el7, right?

Comment 7 Christof Efkemann 2017-02-13 17:08:33 UTC
I agree.

Comment 8 Michael Weisbach 2017-02-13 18:58:39 UTC
Just a hint, as also mentioned in Exim's 2029 bug report: "These crashes will be leaving orphan -D files in the spool; you'll need to manually clean them out.". Watch out for -D files filling up your Exim's input/ directory.

To workaround the issue and prevent SEGVs in 4.88-{2,3}, you might consider to temporary disable DKIM checks (dkim_disable_verify) until a fixed 4.88-4 or 4.89 becomes available.

@Jaroslav, 
I'm not so familiar with Exim for EPEL (el6, el7) build & release strategy :). The bugfix mentioned by Christof in Exim's 2029 bug report is planned for 4.89 milestone. If we can include this tiny patch in a 4.88-4 build for el6+el7, would be great.

Comment 9 Felix Schwarz 2017-02-14 08:29:46 UTC
(In reply to Christof Efkemann from comment #5)
> The DKIM patch in 4.88-3 is older than the patch mentioned in Exim Bugzilla
> issue 2029, and it doesn't contain anything related to bodyhash.data.

The DKIM patch in EPEL is for Exim's bug 2016 (https://bugs.exim.org/show_bug.cgi?id=2016) and is based on this upstream commit: https://git.exim.org/exim.git/patch/bd8fbe3606d80e5a3fc02fe71b521146c6938448

(In reply to Michael Weisbach from comment #8)
> The bugfix mentioned by Christof in Exim's 2029 bug report is planned
> for 4.89 milestone. If we can include this tiny patch in a 4.88-4 build for
> el6+el7, would be great.

Unfortunately the fix for bug 2029 is not "tiny" and the patch does not apply cleanly to EPEL's patched 4.88.

At least in my case the problem is connected to the use of Exim's chunking implementation (SMTP BDAT extension). I can prevent crashes by setting:

  chunking_advertise_hosts =

Also Debian ships a patch to disable chunking by default: https://lists.exim.org/lurker/message/20170119.144417.13a576ec.en.html

Patch filename is "78_Disable-chunking-BDAT-by-default.patch" in Debian's exim 4.89RC5 (https://packages.debian.org/unstable/exim4).

Basically Debian's patch changes the default of "chunking_advertise_hosts" to "unset". I'd like to recommend doing this for EPEL as well. Clearly the chunking feature is not really stable so we should not enable it by default. The patch seems to be pretty simple.

Comment 10 Michael Weisbach 2017-02-14 08:48:35 UTC
Hi Felix,

"tiny" patch I was referring to was 

  https://bugs.exim.org/attachment.cgi?id=973. 

This just adds a safety check for sig->bodyhash.data = NULL, to prevent SEGV here. But I agree, we need a more general solution from Exim team in 4.89++ eventually.

Thanks for pointing us to chunking_advertise_hosts as a solution to workaround. Will try it.

Comment 11 Felix Schwarz 2017-02-14 08:49:59 UTC
Created attachment 1250147 [details]
add debian's patch to EPEL

The attached patch fixes the issue for me by disabling BDAT chunking by default. Users can do that also manually by adding the option in their exim.conf but I think the code is not ready to be enabled by default.

I kicked off a scratch build (https://koji.fedoraproject.org/koji/taskinfo?taskID=17852711) so others can try the update easily.

Please note I don't have any special permissions for the Fedora exim package so I can't push this myself.

Comment 12 Felix Schwarz 2017-02-14 08:57:17 UTC
(In reply to Michael Weisbach from comment #10)
> "tiny" patch I was referring to was 
> 
>   https://bugs.exim.org/attachment.cgi?id=973. 
> 
> This just adds a safety check for sig->bodyhash.data = NULL, to prevent SEGV
> here. 

That was just the initial patch while the actual fix is more involved:

 src/src/pdkim/pdkim.c       | 57 +++++++++++++++++++++------------------------
 test/log/4506               |  5 +++-
 test/scripts/4500-DKIM/4506 |  2 +-
 3 files changed, 31 insertions(+), 33 deletions(-)

https://git.exim.org/exim.git/patch/02c4f8fb8927c97939d3daa345148739e275dc8d

I'm not familiar with Exim's DKIM code so I don't feel comfortable in creating custom patches just for EPEL. Instead of manually patching all the DKIM problems (I see even more DKIM-related commits in Exim's git) and creating a Frankstein-version we should just disable chunking by default.

Comment 13 Michael Weisbach 2017-02-14 08:58:24 UTC
Created attachment 1250150 [details]
Add test for sig->bodyhash.data=NULL before access

Interim patch, as proposed in Exim bug 2029, see https://bugs.exim.org/attachment.cgi?id=973. This is just to prevent SEGV here.

As Felix already mentioned, the entire fix of this Exim bug depends on a larger patch set going into 4.89, we hope.

Comment 14 Michael Weisbach 2017-02-14 09:00:07 UTC
Felix, I'm fine with disable chunking :-). I do not like Frankenstein much.

Comment 15 Felix Schwarz 2017-02-15 08:16:14 UTC
The segfaults on our exim servers are gone both with my updated RPM pakcage (see comment 11) as well as manually setting "chunking_advertise_hosts = ".

Comment 16 Michael Weisbach 2017-02-15 17:04:17 UTC
Hi Felix, 
can you do me a favour and kick off a scratch build for el6 too? Thanks a lot!

Comment 17 Felix Schwarz 2017-02-19 16:40:55 UTC
(In reply to Michael Weisbach from comment #16)
> can you do me a favour and kick off a scratch build for el6 too? Thanks a
> lot!

Here you go: https://koji.fedoraproject.org/koji/taskinfo?taskID=17956115 (untested but exactly the same code as I'm using on EPEL7).

Comment 18 Kunszt Arpad 2017-03-06 07:48:46 UTC
I tested Felix's build on CentOS 6. Since I upgraded the Exim there wasn't any new segfaults. I can't say for sure that the issue is gone but earlier I got at least one segfault during a period this long.

Is it possible to push this build to the EPEL repo?

Comment 19 Ben Cotton 2020-11-05 16:46:29 UTC
This message is a reminder that EPEL 6 is nearing its end of life. Fedora will stop maintaining and issuing updates for EPEL 6 on 2020-11-30. It is our policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of 'el6'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later EPEL version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before EPEL 6 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged  change the 'version' to a later Fedora version prior this bug is closed as described in the policy above.

Comment 20 Ben Cotton 2020-11-30 15:02:20 UTC
EPEL el6 changed to end-of-life (EOL) status on 2020-11-30. EPEL el6 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
EPEL please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.