Description of problem: Program terminated with signal 11, Segmentation fault. Version-Release number of selected component (if applicable): exim-4.88-2.el6.x86_64 exim-debuginfo-4.88-2.el6.x86_64 exim-doc-4.69-4.el6.noarch exim-greylist-4.88-2.el6.x86_64 on CentOS 6.8 (x86_64) + EPEL w/ latest updates / fixes applied. How reproducible: After upgrade to 4.88-2 I observing SEGVs a lot. Steps to Reproduce: unknown, but seems DKIM related Actual results: (gdb) output as follows Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Core was generated by `/usr/sbin/exim -bd -q1h'. Program terminated with signal 11, Segmentation fault. #0 0x00007f6367f7c175 in pdkim_finish_bodyhash (ctx=0x7f636896abd0, return_signatures=0x7f63681c8798) at pdkim.c:817 817 if (memcmp(bh.data, sig->bodyhash.data, bh.len) == 0) Expected results: No SEGVs :-) Additional info: (gdb) backtrace #0 0x00007f6367f7c175 in pdkim_finish_bodyhash (ctx=0x7f636896abd0, return_signatures=0x7f63681c8798) at pdkim.c:817 #1 pdkim_feed_finish (ctx=0x7f636896abd0, return_signatures=0x7f63681c8798) at pdkim.c:1332 #2 0x00007f6367eff085 in dkim_exim_verify_finish () at dkim.c:146 #3 0x00007f6367f20157 in receive_msg (extract_recip=0) at receive.c:3272 #4 0x00007f6367ed11e1 in handle_smtp_call () at daemon.c:509 #5 daemon_go () at daemon.c:2040 #6 0x00007f6367ee8ffa in main (argc=3, cargv=0x7ffd88488f18) at exim.c:4808
Thank your for reporting this I didn't get to do that even though I'm seeing this too on CentOS 7 (with exim 4.88 as well). It seems as if this is related to specific messages as I'm seeing this only in some exim instances (and sometimes the instances stop crashing with might be attributed to the sending mailserver giving up).
812 813 /* VERIFICATION --------------------------------------------------------- */ 814 else 815 { 816 /* Compare bodyhash */ 817 if (memcmp(bh.data, sig->bodyhash.data, bh.len) == 0) 818 { 819 DEBUG(D_acl) debug_printf("PDKIM [%s] Body hash verified OK\n", sig->domain); 820 } 821 else (gdb) print sig $1 = (pdkim_signature *) 0x7fb944a3fc60 (gdb) print sig->bodyhash $2 = {data = 0x0, len = 0} ... looks like sig->bodyhash.data = NULL. Maybe we need some more safety check before memcmp :-) and use of sig->bodyhash.data.
I guess the problem is already known (and fixed) upstream: https://bugs.exim.org/show_bug.cgi?id=2029
Hmmm, looks like Jaroslav applied a DKIM related patch to 4.88-3 for el7, https://src.fedoraproject.org/cgit/rpms/exim.git/diff/exim-4.88-DKIM-fix.patch?id=5ddc95e67a111174d0b8f4b969af1b9ec672495e Can someone trigger a build of 4.88-3 for el6 as well please, so we can see whether the issue still persist after this DKIM patch was applied in el6 as well? Thanks!
I don't think a rebuild of 4.88-3 will help. The DKIM patch in 4.88-3 is older than the patch mentioned in Exim Bugzilla issue 2029, and it doesn't contain anything related to bodyhash.data. Also, I observed a lot of segmentation faults with 4.88-3 on el7 (I wasn't able to check, but I guess it's the same problem that you have on el6).
Christof, hmm, yes - agree. I read "DKIM patch" too and couldn't find "bodyhash.data" related changes. So we need to incorporate 2029'fix and build new exim packages for el6 + el7, right?
I agree.
Just a hint, as also mentioned in Exim's 2029 bug report: "These crashes will be leaving orphan -D files in the spool; you'll need to manually clean them out.". Watch out for -D files filling up your Exim's input/ directory. To workaround the issue and prevent SEGVs in 4.88-{2,3}, you might consider to temporary disable DKIM checks (dkim_disable_verify) until a fixed 4.88-4 or 4.89 becomes available. @Jaroslav, I'm not so familiar with Exim for EPEL (el6, el7) build & release strategy :). The bugfix mentioned by Christof in Exim's 2029 bug report is planned for 4.89 milestone. If we can include this tiny patch in a 4.88-4 build for el6+el7, would be great.
(In reply to Christof Efkemann from comment #5) > The DKIM patch in 4.88-3 is older than the patch mentioned in Exim Bugzilla > issue 2029, and it doesn't contain anything related to bodyhash.data. The DKIM patch in EPEL is for Exim's bug 2016 (https://bugs.exim.org/show_bug.cgi?id=2016) and is based on this upstream commit: https://git.exim.org/exim.git/patch/bd8fbe3606d80e5a3fc02fe71b521146c6938448 (In reply to Michael Weisbach from comment #8) > The bugfix mentioned by Christof in Exim's 2029 bug report is planned > for 4.89 milestone. If we can include this tiny patch in a 4.88-4 build for > el6+el7, would be great. Unfortunately the fix for bug 2029 is not "tiny" and the patch does not apply cleanly to EPEL's patched 4.88. At least in my case the problem is connected to the use of Exim's chunking implementation (SMTP BDAT extension). I can prevent crashes by setting: chunking_advertise_hosts = Also Debian ships a patch to disable chunking by default: https://lists.exim.org/lurker/message/20170119.144417.13a576ec.en.html Patch filename is "78_Disable-chunking-BDAT-by-default.patch" in Debian's exim 4.89RC5 (https://packages.debian.org/unstable/exim4). Basically Debian's patch changes the default of "chunking_advertise_hosts" to "unset". I'd like to recommend doing this for EPEL as well. Clearly the chunking feature is not really stable so we should not enable it by default. The patch seems to be pretty simple.
Hi Felix, "tiny" patch I was referring to was https://bugs.exim.org/attachment.cgi?id=973. This just adds a safety check for sig->bodyhash.data = NULL, to prevent SEGV here. But I agree, we need a more general solution from Exim team in 4.89++ eventually. Thanks for pointing us to chunking_advertise_hosts as a solution to workaround. Will try it.
Created attachment 1250147 [details] add debian's patch to EPEL The attached patch fixes the issue for me by disabling BDAT chunking by default. Users can do that also manually by adding the option in their exim.conf but I think the code is not ready to be enabled by default. I kicked off a scratch build (https://koji.fedoraproject.org/koji/taskinfo?taskID=17852711) so others can try the update easily. Please note I don't have any special permissions for the Fedora exim package so I can't push this myself.
(In reply to Michael Weisbach from comment #10) > "tiny" patch I was referring to was > > https://bugs.exim.org/attachment.cgi?id=973. > > This just adds a safety check for sig->bodyhash.data = NULL, to prevent SEGV > here. That was just the initial patch while the actual fix is more involved: src/src/pdkim/pdkim.c | 57 +++++++++++++++++++++------------------------ test/log/4506 | 5 +++- test/scripts/4500-DKIM/4506 | 2 +- 3 files changed, 31 insertions(+), 33 deletions(-) https://git.exim.org/exim.git/patch/02c4f8fb8927c97939d3daa345148739e275dc8d I'm not familiar with Exim's DKIM code so I don't feel comfortable in creating custom patches just for EPEL. Instead of manually patching all the DKIM problems (I see even more DKIM-related commits in Exim's git) and creating a Frankstein-version we should just disable chunking by default.
Created attachment 1250150 [details] Add test for sig->bodyhash.data=NULL before access Interim patch, as proposed in Exim bug 2029, see https://bugs.exim.org/attachment.cgi?id=973. This is just to prevent SEGV here. As Felix already mentioned, the entire fix of this Exim bug depends on a larger patch set going into 4.89, we hope.
Felix, I'm fine with disable chunking :-). I do not like Frankenstein much.
The segfaults on our exim servers are gone both with my updated RPM pakcage (see comment 11) as well as manually setting "chunking_advertise_hosts = ".
Hi Felix, can you do me a favour and kick off a scratch build for el6 too? Thanks a lot!
(In reply to Michael Weisbach from comment #16) > can you do me a favour and kick off a scratch build for el6 too? Thanks a > lot! Here you go: https://koji.fedoraproject.org/koji/taskinfo?taskID=17956115 (untested but exactly the same code as I'm using on EPEL7).
I tested Felix's build on CentOS 6. Since I upgraded the Exim there wasn't any new segfaults. I can't say for sure that the issue is gone but earlier I got at least one segfault during a period this long. Is it possible to push this build to the EPEL repo?
This message is a reminder that EPEL 6 is nearing its end of life. Fedora will stop maintaining and issuing updates for EPEL 6 on 2020-11-30. It is our policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of 'el6'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later EPEL version. Thank you for reporting this issue and we are sorry that we were not able to fix it before EPEL 6 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above.
EPEL el6 changed to end-of-life (EOL) status on 2020-11-30. EPEL el6 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of EPEL please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.