Unbound 1.5.0-1.fc22 seems to be crashing a lot here. I can't seem to see any pattern to the crashes, and sadly, abrt doesn't seem to pick up on them. :( Nov 21 13:35:09 voldemort.scrye.com systemd[1]: unbound.service: main process exited, code=killed, status=11/SEGV Nov 21 13:35:09 voldemort.scrye.com systemd[1]: Unit unbound.service entered failed state. Nov 21 13:35:09 voldemort.scrye.com systemd[1]: unbound.service failed. Nov 21 13:38:34 voldemort.scrye.com runuser[4855]: pam_unix(runuser:session): session opened for user unbound by (uid=0) Nov 21 13:38:34 voldemort.scrye.com runuser[4855]: pam_unix(runuser:session): session closed for user unbound Nov 21 13:38:34 voldemort.scrye.com unbound-checkconf[4858]: unbound-checkconf: no errors in /etc/unbound/unbound.conf Nov 21 13:38:34 voldemort.scrye.com unbound[4861]: Nov 21 13:38:34 unbound[4861:0] warning: increased limit(open files) from 1024 to 8290 Nov 21 13:38:34 voldemort.scrye.com unbound[4861]: [4861:0] warning: did not exit gracefully last time (30859) Nov 21 13:38:34 voldemort.scrye.com unbound[4861]: [4861:0] notice: init module 0: validator Nov 21 13:38:34 voldemort.scrye.com unbound[4861]: [4861:0] notice: init module 1: iterator Nov 21 13:38:34 voldemort.scrye.com unbound[4861]: [4861:0] info: start of service (unbound 1.5.0). [334209.703923] unbound[26854]: segfault at 7f7ace385444 ip 00007f79ce3ddf23 sp 00007f79cacc3550 error 4 in unbound[7f79ce38b000+d0000] [338071.162154] unbound[30866]: segfault at 7f3e4472d444 ip 00007f3d44785f23 sp 00007f3d4106b4b0 error 4 in unbound[7f3d44733000+d0000] I can try and run it in gdb and see if I can get a stack trace...
Hi. Please install the debuginfo package to get some stack trace. There seems to be a race condition in the new arc4random fallback code. However I would like to check if this is really the cause. Thanks!
A patch or link to svn was posted on the unbound mailing list. Currently in transit but will look at building updated packages later today if someone else hasn't picked this up by then.
(In reply to Paul Wouters from comment #2) > A patch or link to svn was posted on the unbound mailing list. Currently in > transit but will look at building updated packages later today if someone > else hasn't picked this up by then. That's where I got the information posed in comment #1. But I want to have a backtrace to be sure that it is really the issue!
Am seeing this in -2 as well. dmesg reports: Nov 23 18:58:17 turing-police kernel: [159160.906004] unbound[63601]: segfault at 7f71130fc444 ip 00007f7013154f23 sp 00007fffcc4c1af0 error 4 in unbound[7f7013102000+d0000] Nov 24 01:54:11 turing-police kernel: [184148.239571] unbound[119936]: segfault at 7fad3cb59444 ip 00007fac3cbb1f23 sp 00007fac39497550 error 4 in unbound[7fac3cb5f000+d0000] Nov 24 08:18:02 turing-police kernel: [207210.400281] unbound[170769]: segfault at 7f1d74d93444 ip 00007f1c74debf23 sp 00007f1c716d1650 error 4 Nov 24 08:18:02 turing-police kernel: [207210.400287] unbound[170767]: segfault at 7f1d74d93444 ip 00007f1c74debf23 sp 00007fff1e1e5e90 error 4 in unbound[7f1c74d99000+d0000] in unbound[7f1c74d99000+d0000] Nov 24 11:12:12 turing-police kernel: [ 265.725841] unbound[1362]: segfault at 7fb212aab444 ip 00007fb112b03f23 sp 00007fb10f3e9550 error 4 Nov 24 11:12:12 turing-police kernel: [ 265.725848] unbound[1348]: segfault at 7fb212aab444 ip 00007fb112b03f23 sp 00007ffffff31da0 error 4 in unbound[7fb112ab1000+d0000] Nov 24 11:16:04 turing-police kernel: [ 497.657898] unbound[2334]: segfault at 7fbc1d2b5444 ip 00007fbb1d30df23 sp 00007fbb19bf3480 error 4 in unbound[7fbb1d2bb000+d0000] Nov 24 11:16:04 turing-police kernel: [ 497.657920] unbound[2330]: segfault at 7fbc1d2b5444 ip 00007fbb1d30df23 sp 00007fff6ea88ae0 error 4 in unbound[7fbb1d2bb000+d0000] Nov 24 11:18:43 turing-police kernel: [ 656.454591] unbound[2618]: segfault at 7f3eedf50444 ip 00007f3dedfa8f23 sp 00007f3dea88e480 error 4 in unbound[7f3dedf56000+d0000] Nov 24 11:18:43 turing-police kernel: [ 656.454668] unbound[2617]: segfault at 7f3eedf50444 ip 00007f3dedfa8f23 sp 00007fffe5449620 error 4 in unbound[7f3dedf56000+d0000] The '444' and 'f23' being constant pretty much tells me there's exactly one crash involved, with ASLR in effect...
(In reply to Tomas Hozza from comment #3) > (In reply to Paul Wouters from comment #2) > > A patch or link to svn was posted on the unbound mailing list. Currently in > > transit but will look at building updated packages later today if someone > > else hasn't picked this up by then. > > That's where I got the information posed in comment #1. But I want to have a > backtrace to be sure that it is really the issue! I'll install the .debuginfo and attach a gdb to it and see if it crashes again...
Caught it, finally.... (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fbb293c3700 (LWP 1372)] arc4random () at compat/arc4random.c:220 220 _rs_random_u32(&val); (gdb) where #0 arc4random () at compat/arc4random.c:220 #1 0x00007fbb2cb0eb05 in arc4random_uniform (upper_bound=upper_bound@entry=10) at compat/arc4random_uniform.c:51 #2 0x00007fbb2cb007fc in ub_random_max (x=10, state=<optimized out>) at util/random.c:110 #3 iter_ns_probability (m=10, n=2, rnd=<optimized out>) at iterator/iter_utils.c:443 #4 query_for_targets.part.7.lto_priv.627 (qstate=qstate@entry=0x7fbb2490cdc0, iq=iq@entry=0x7fbb2490cff0, ie=ie@entry=0x7fbb2d9a27a0, id=id@entry=1, maxtargets=maxtargets@entry=3, num=num@entry=0x7fbb293c2800) at iterator/iterator.c:1377 #5 0x00007fbb2cad6b4f in query_for_targets () at iterator/iterator.c:1351 #6 processQueryTargets (id=1, ie=0x7fbb2d9a27a0, iq=0x7fbb2490cff0, qstate=0x7fbb2490cdc0) at iterator/iterator.c:1764 #7 iter_handle (qstate=qstate@entry=0x7fbb2490cdc0, iq=iq@entry=0x7fbb2490cff0, ie=ie@entry=0x7fbb2d9a27a0, id=id@entry=1) at iterator/iterator.c:2708 #8 0x00007fbb2cad92fa in process_response (event=<optimized out>, outbound=<optimized out>, id=<optimized out>, ie=<optimized out>, iq=<optimized out>, qstate=<optimized out>) at iterator/iterator.c:2869 #9 iter_operate (qstate=<optimized out>, event=<optimized out>, id=<optimized out>, outbound=<optimized out>) at iterator/iterator.c:2902 #10 0x00007fbb2cac9df1 in mesh_run (mesh=mesh@entry=0x7fbb245e6b90, ---Type <return> to continue, or q <return> to quit--- mstate=mstate@entry=0x7fbb2490cd70, ev=module_event_pass, ev@entry=module_event_new, e=<optimized out>, e@entry=0x0) at services/mesh.c:1076 #11 0x00007fbb2caaf211 in mesh_new_client (qid=52086, rep=0x7fbb293c2be0, edns=0x7fbb293c2a80, qflags=256, qinfo=0x7fbb293c2a90, mesh=0x7fbb245e6b90) at services/mesh.c:370 #12 worker_handle_request.part.17.lto_priv.656 (c=<optimized out>, arg=0x7fbb2d9c88e0, repinfo=0x7fbb293c2be0) at daemon/worker.c:986 #13 0x00007fbb2cabf362 in comm_point_udp_callback (fd=5, event=<optimized out>, arg=<optimized out>) at util/netevent.c:700 #14 0x00007fbb2c3be3cc in event_base_loop () from /lib64/libevent-2.0.so.5 #15 0x00007fbb2cabf4dc in comm_base_dispatch (b=<optimized out>) at util/netevent.c:305 #16 0x00007fbb2cad255c in worker_work (worker=0x7fbb2d9c88e0) at daemon/worker.c:1285 #17 thread_start (arg=0x7fbb2d9c88e0) at daemon/daemon.c:475 #18 0x00007fbb2b70f53a in start_thread (arg=0x7fbb293c3700) at pthread_create.c:312 #19 0x00007fbb2b4462fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) quit Yep, died in arc4random(). Hope that helps....
Sorry for the delay from my side. I had to get things done, so I just set systemd to restart unbound when it crashed. I attached gdb, but forgot to disable systemd restarting, but I got: Program received signal SIGSEGV, Segmentation fault. arc4random () at compat/arc4random.c:220 220 _rs_random_u32(&val); so, it's definitely the same place here at least.
Thanks for the confirmation. I'll backport the fix and push new builds soon.
if you create new builds, please also address the "main" issue from bug https://bugzilla.redhat.com/show_bug.cgi?id=1167291 A simple test is to restart the libreswan service
(In reply to Paul Wouters from comment #9) > if you create new builds, please also address the "main" issue from bug > https://bugzilla.redhat.com/show_bug.cgi?id=1167291 > > A simple test is to restart the libreswan service Too late. The build are already completed. unbound-1.5.0-2.fc22 unbound-1.5.0-2.fc21 Could just libreswan rebuild against the new unbound help?
unbound-1.5.0-1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/FEDORA-2014-15408/unbound-1.5.0-1.fc21
I'm not sure if rebuilding helps. The new arc4random wants to do some initialization for the randomness in the library. I think this affects everything linked against libunbound
Package unbound-1.5.0-2.fc21: * should fix your issue, * was pushed to the Fedora 21 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing unbound-1.5.0-2.fc21' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2014-15408/unbound-1.5.0-2.fc21 then log in and leave karma (feedback).
Grabbed -2 off koji, installed, 24 hours uptime now, no problems seen. I left a note on the admin page, forgot to login first. :)
unbound-1.5.0-3.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/unbound-1.5.0-3.fc21
unbound-1.5.1-0.1.rc1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/unbound-1.5.1-0.1.rc1.fc21
unbound-1.5.1-2.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/unbound-1.5.1-2.fc21
unbound-1.5.1-2.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.