Bug 1166878
Summary: | unbound-1.5.0-1.fc22 crashing often | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Kevin Fenzi <kevin> |
Component: | unbound | Assignee: | Tomáš Hozza <thozza> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | kevin, pj.pandit, psimerda, pwouters, thozza, valdis.kletnieks, vonsch |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | unbound-1.5.1-2.fc21 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-12-04 14:53:03 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Kevin Fenzi
2014-11-21 20:42:12 UTC
Hi. Please install the debuginfo package to get some stack trace. There seems to be a race condition in the new arc4random fallback code. However I would like to check if this is really the cause. Thanks! A patch or link to svn was posted on the unbound mailing list. Currently in transit but will look at building updated packages later today if someone else hasn't picked this up by then. (In reply to Paul Wouters from comment #2) > A patch or link to svn was posted on the unbound mailing list. Currently in > transit but will look at building updated packages later today if someone > else hasn't picked this up by then. That's where I got the information posed in comment #1. But I want to have a backtrace to be sure that it is really the issue! Am seeing this in -2 as well. dmesg reports: Nov 23 18:58:17 turing-police kernel: [159160.906004] unbound[63601]: segfault at 7f71130fc444 ip 00007f7013154f23 sp 00007fffcc4c1af0 error 4 in unbound[7f7013102000+d0000] Nov 24 01:54:11 turing-police kernel: [184148.239571] unbound[119936]: segfault at 7fad3cb59444 ip 00007fac3cbb1f23 sp 00007fac39497550 error 4 in unbound[7fac3cb5f000+d0000] Nov 24 08:18:02 turing-police kernel: [207210.400281] unbound[170769]: segfault at 7f1d74d93444 ip 00007f1c74debf23 sp 00007f1c716d1650 error 4 Nov 24 08:18:02 turing-police kernel: [207210.400287] unbound[170767]: segfault at 7f1d74d93444 ip 00007f1c74debf23 sp 00007fff1e1e5e90 error 4 in unbound[7f1c74d99000+d0000] in unbound[7f1c74d99000+d0000] Nov 24 11:12:12 turing-police kernel: [ 265.725841] unbound[1362]: segfault at 7fb212aab444 ip 00007fb112b03f23 sp 00007fb10f3e9550 error 4 Nov 24 11:12:12 turing-police kernel: [ 265.725848] unbound[1348]: segfault at 7fb212aab444 ip 00007fb112b03f23 sp 00007ffffff31da0 error 4 in unbound[7fb112ab1000+d0000] Nov 24 11:16:04 turing-police kernel: [ 497.657898] unbound[2334]: segfault at 7fbc1d2b5444 ip 00007fbb1d30df23 sp 00007fbb19bf3480 error 4 in unbound[7fbb1d2bb000+d0000] Nov 24 11:16:04 turing-police kernel: [ 497.657920] unbound[2330]: segfault at 7fbc1d2b5444 ip 00007fbb1d30df23 sp 00007fff6ea88ae0 error 4 in unbound[7fbb1d2bb000+d0000] Nov 24 11:18:43 turing-police kernel: [ 656.454591] unbound[2618]: segfault at 7f3eedf50444 ip 00007f3dedfa8f23 sp 00007f3dea88e480 error 4 in unbound[7f3dedf56000+d0000] Nov 24 11:18:43 turing-police kernel: [ 656.454668] unbound[2617]: segfault at 7f3eedf50444 ip 00007f3dedfa8f23 sp 00007fffe5449620 error 4 in unbound[7f3dedf56000+d0000] The '444' and 'f23' being constant pretty much tells me there's exactly one crash involved, with ASLR in effect... (In reply to Tomas Hozza from comment #3) > (In reply to Paul Wouters from comment #2) > > A patch or link to svn was posted on the unbound mailing list. Currently in > > transit but will look at building updated packages later today if someone > > else hasn't picked this up by then. > > That's where I got the information posed in comment #1. But I want to have a > backtrace to be sure that it is really the issue! I'll install the .debuginfo and attach a gdb to it and see if it crashes again... Caught it, finally.... (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fbb293c3700 (LWP 1372)] arc4random () at compat/arc4random.c:220 220 _rs_random_u32(&val); (gdb) where #0 arc4random () at compat/arc4random.c:220 #1 0x00007fbb2cb0eb05 in arc4random_uniform (upper_bound=upper_bound@entry=10) at compat/arc4random_uniform.c:51 #2 0x00007fbb2cb007fc in ub_random_max (x=10, state=<optimized out>) at util/random.c:110 #3 iter_ns_probability (m=10, n=2, rnd=<optimized out>) at iterator/iter_utils.c:443 #4 query_for_targets.part.7.lto_priv.627 (qstate=qstate@entry=0x7fbb2490cdc0, iq=iq@entry=0x7fbb2490cff0, ie=ie@entry=0x7fbb2d9a27a0, id=id@entry=1, maxtargets=maxtargets@entry=3, num=num@entry=0x7fbb293c2800) at iterator/iterator.c:1377 #5 0x00007fbb2cad6b4f in query_for_targets () at iterator/iterator.c:1351 #6 processQueryTargets (id=1, ie=0x7fbb2d9a27a0, iq=0x7fbb2490cff0, qstate=0x7fbb2490cdc0) at iterator/iterator.c:1764 #7 iter_handle (qstate=qstate@entry=0x7fbb2490cdc0, iq=iq@entry=0x7fbb2490cff0, ie=ie@entry=0x7fbb2d9a27a0, id=id@entry=1) at iterator/iterator.c:2708 #8 0x00007fbb2cad92fa in process_response (event=<optimized out>, outbound=<optimized out>, id=<optimized out>, ie=<optimized out>, iq=<optimized out>, qstate=<optimized out>) at iterator/iterator.c:2869 #9 iter_operate (qstate=<optimized out>, event=<optimized out>, id=<optimized out>, outbound=<optimized out>) at iterator/iterator.c:2902 #10 0x00007fbb2cac9df1 in mesh_run (mesh=mesh@entry=0x7fbb245e6b90, ---Type <return> to continue, or q <return> to quit--- mstate=mstate@entry=0x7fbb2490cd70, ev=module_event_pass, ev@entry=module_event_new, e=<optimized out>, e@entry=0x0) at services/mesh.c:1076 #11 0x00007fbb2caaf211 in mesh_new_client (qid=52086, rep=0x7fbb293c2be0, edns=0x7fbb293c2a80, qflags=256, qinfo=0x7fbb293c2a90, mesh=0x7fbb245e6b90) at services/mesh.c:370 #12 worker_handle_request.part.17.lto_priv.656 (c=<optimized out>, arg=0x7fbb2d9c88e0, repinfo=0x7fbb293c2be0) at daemon/worker.c:986 #13 0x00007fbb2cabf362 in comm_point_udp_callback (fd=5, event=<optimized out>, arg=<optimized out>) at util/netevent.c:700 #14 0x00007fbb2c3be3cc in event_base_loop () from /lib64/libevent-2.0.so.5 #15 0x00007fbb2cabf4dc in comm_base_dispatch (b=<optimized out>) at util/netevent.c:305 #16 0x00007fbb2cad255c in worker_work (worker=0x7fbb2d9c88e0) at daemon/worker.c:1285 #17 thread_start (arg=0x7fbb2d9c88e0) at daemon/daemon.c:475 #18 0x00007fbb2b70f53a in start_thread (arg=0x7fbb293c3700) at pthread_create.c:312 #19 0x00007fbb2b4462fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) quit Yep, died in arc4random(). Hope that helps.... Sorry for the delay from my side. I had to get things done, so I just set systemd to restart unbound when it crashed. I attached gdb, but forgot to disable systemd restarting, but I got: Program received signal SIGSEGV, Segmentation fault. arc4random () at compat/arc4random.c:220 220 _rs_random_u32(&val); so, it's definitely the same place here at least. Thanks for the confirmation. I'll backport the fix and push new builds soon. if you create new builds, please also address the "main" issue from bug https://bugzilla.redhat.com/show_bug.cgi?id=1167291 A simple test is to restart the libreswan service (In reply to Paul Wouters from comment #9) > if you create new builds, please also address the "main" issue from bug > https://bugzilla.redhat.com/show_bug.cgi?id=1167291 > > A simple test is to restart the libreswan service Too late. The build are already completed. unbound-1.5.0-2.fc22 unbound-1.5.0-2.fc21 Could just libreswan rebuild against the new unbound help? unbound-1.5.0-1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/FEDORA-2014-15408/unbound-1.5.0-1.fc21 I'm not sure if rebuilding helps. The new arc4random wants to do some initialization for the randomness in the library. I think this affects everything linked against libunbound Package unbound-1.5.0-2.fc21: * should fix your issue, * was pushed to the Fedora 21 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing unbound-1.5.0-2.fc21' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2014-15408/unbound-1.5.0-2.fc21 then log in and leave karma (feedback). Grabbed -2 off koji, installed, 24 hours uptime now, no problems seen. I left a note on the admin page, forgot to login first. :) unbound-1.5.0-3.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/unbound-1.5.0-3.fc21 unbound-1.5.1-0.1.rc1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/unbound-1.5.1-0.1.rc1.fc21 unbound-1.5.1-2.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/unbound-1.5.1-2.fc21 unbound-1.5.1-2.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report. |