Description of problem: named crashed with a segfault after transferring a zone from the master and sending out notifies. The server is configured as a slave server with multiple zones and views, and as a recursive caching server. There are no other servers that it needs to be notifying, but the default configuration appears to invoke the "sending notifies" routine.

Version-Release number of selected component (if applicable): 9.5.0-18.a7.fc8

Actual results: Here are some syslog messages from right before the crash:

Nov 26 19:21:19 slave named[25468]: client w.x.y.z#53320: view global: received notify for zone 'SUB.DOM.FOO'
Nov 26 19:21:19 slave named[25468]: client w.x.y.z#53320: view global: received notify for zone 'x.w.IN-ADDR.ARPA'
Nov 26 19:21:19 slave named[25468]: zone SUB.DOM.FOO/IN/global: Transfer started.
Nov 26 19:21:19 slave named[25468]: transfer of 'SUB.DOM.FOO/IN/global' from w.x.y.z#53: connected using w.x.a.b#52660
Nov 26 19:21:19 slave named[25468]: zone SUB.DOM.FOO/IN/global: transferred serial 357792
Nov 26 19:21:19 slave named[25468]: transfer of 'SUB.DOM.FOO/IN/global' from w.x.y.z#53: end of transfer
Nov 26 19:21:19 slave named[25468]: zone SUB.DOM.FOO/IN/global: sending notifies (serial 357792)
Nov 26 19:21:20 slave named[25468]: zone x.w.IN-ADDR.ARPA/IN/global: Transfer started.
Nov 26 19:21:20 slave named[25468]: transfer of 'x.w.IN-ADDR.ARPA/IN/global' from w.x.y.z#53: connected using w.x.a.b#52373
Nov 26 19:21:20 slave named[25468]: zone x.w.IN-ADDR.ARPA/IN/global: transferred serial 382352
Nov 26 19:21:20 slave named[25468]: transfer of 'x.w.IN-ADDR.ARPA/IN/global' from w.x.y.z#53: end of transfer
Nov 26 19:21:20 slave named[25468]: zone x.w.IN-ADDR.ARPA/IN/global: sending notifies (serial 382352)
Nov 26 19:21:20 slave kernel: named[25471]: segfault at 00000014 eip 001a7996 esp b6a82db0 error 4

Additional info: The configuration has not changed. The system was recently upgraded from FC6 to F8.
The bind package was updated on November 23, and had been running since then without problems until today.
Are you able to reproduce the problem? Do you have a core dump?
Could you please tell me which arch you have (and whether you're running named multithreaded), and attach your configuration file and full log? Thanks
The problem hasn't happened again yet. I don't have a core dump, but here is what I have done to hopefully get one if it happens again:

/etc/sysconfig/named:
DAEMON_COREFILE_LIMIT=unlimited

/etc/profile:
ulimit -S -c unlimited > /dev/null 2>&1

# getsebool -a | grep core
allow_daemons_dump_core --> on

/etc/named.conf:
options { coresize unlimited; };

This is the i386 (i686) arch. I'm not sure whether I'm running multithreaded; /etc/sysconfig/named is the default except for the modifications mentioned above.
Thanks for your cooperation; that's the best approach. You will see how many threads you have in the log. For example:

Nov 29 09:12:31 evileye named[18117]: starting BIND 9.5.0a7 -u named
Nov 29 09:12:31 evileye named[18117]: found 2 CPUs, using 2 worker threads
Now named has crashed on a different server. I've set up all my named instances to generate core dumps. Hopefully we won't have to wait more than a few more days for one of them to crash, although they had all been running since Nov 22, so it may take as long as 8 days from today, since I just restarted them all.

This time it was the shadow master named that crashed, again right after "sending notifies":

Nov 30 09:49:51 master named[7316]: zone SUB.FOO.DOM/IN/global: sending notifies (serial 379376)
Nov 30 09:49:51 master kernel: named[7317]: segfault at 0000001c eip 001a7996 esp b7ea9db0 error 4

The master is running 4 threads:

Nov 30 09:56:14 master named[7681]: found 4 CPUs, using 4 worker threads

The previous slave that crashed is also running 4 threads:

Nov 26 22:31:34 slave2 named[31792]: found 4 CPUs, using 4 worker threads
Yes, it looks like the crash is related to notify messages. I recommend adding the "notify master-only;" option to the configuration on the slave servers; it should improve stability. I'm going to try to reproduce this problem next week in our lab.
It crashed again, twice, on the master, but didn't generate a core dump. I'm not sure what is going on... I had put SELinux into permissive mode because allow_daemons_dump_core only allows core dumps in / (named's CWD is /var/named), so SELinux cannot be what's causing the core dumps to fail.

Dec 3 12:57:22 master kernel: named[18004]: segfault at 00000014 eip 001a7996 esp b7ea2db0 error 4
Dec 3 13:13:24 master named[21934]: starting BIND 9.5.0a7 -u named
Dec 3 13:13:24 master named[21934]: found 4 CPUs, using 4 worker threads

Any suggestions on how to get the thing to dump core next time?
By running "sh -x /etc/init.d/named restart" I confirmed that ulimit is being set correctly before starting the process:

+ /bin/bash -c 'ulimit -S -c unlimited >/dev/null 2>&1 ; /usr/sbin/named -u named'

I also turned on debugging:

/etc/sysconfig/named:
OPTIONS="-d 99"

/etc/named.conf:
logging { channel default_debug { file "data/named.run"; severity dynamic; }; };
I did some investigation into the core dumps, and it looks like named wrongly sets the limit itself. I'm not able to generate a core dump with any version (9.3, 9.4, or 9.5). Let me check it; it looks like another bug to me. Also, be careful with the -d 99 option, because the log will get really huge.
In the end, the problem with the core dumps is in the setuid function (see setuid(2)). named is not able to generate core files when it is started with the -u option. Don't bother trying to get a core file, because it's quite risky to run named without -u. I'm also trying to reproduce this one.
named wouldn't start without the -u named option, or with -u root substituted. It seems that it is dropping privs (CAP_DAC_OVERRIDE?) even when not doing setuid, such that it cannot write to /var/named/data or indeed any of the logging channel files which are owned by named.
Interesting, this piece of code appears not to be working:

bin/named/unix/os.c:258
/* allow us to continue to be able to generate coredumps after setuid: */
prctl(PR_SET_DUMPABLE, 1, 0, 0, 0);
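For anyone following along: here is a minimal, standalone sketch of what that call is supposed to achieve, written for illustration (the helper name is hypothetical, not BIND's). setuid(2) clears the per-process "dumpable" flag, and prctl(PR_SET_DUMPABLE, 1) is meant to turn it back on:

```c
#include <sys/prctl.h>

/* Re-enable core dumps after a UID change. setuid() clears the
 * per-process "dumpable" flag, which suppresses core files; this
 * asks the kernel to set it again and reads it back to verify. */
int reenable_core_dumps(void)
{
    if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) == -1)
        return -1;
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0); /* 1 if dumpable */
}
```

If this returns 1 but no core file appears, the core-size limit (ulimit -c) or the /proc/sys/fs/suid_dumpable policy can still suppress the dump.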
Poking around on the affected servers, I found /proc/sys/fs/suid_dumpable was set to 0. Neither of the man pages (proc(5) and prctl(2)) specifies which setting takes precedence when they conflict, so just to be safe I've set fs/suid_dumpable to 2 on all the relevant servers.
(In reply to comment #12)
> bin/named/unix/os.c:258
> /* allow us to continue to be able to generate coredumps after setuid: */
> prctl(PR_SET_DUMPABLE,1,0,0,0);

Upstream also doesn't know whether this statement works as expected :) I've set up two slaves and one master, and I'm trying to reproduce this problem with a simple script:

#!/bin/bash
let a=0
while true; do
    sed 's/[0-9]*;serial/'"$a"';serial/' /var/named/test > /var/named/test.new
    mv -f /var/named/test.new /var/named/test
    rndc reload
    ((++a))
    sleep 3
done

I think I should get a core file tomorrow.
Re comment #13, I don't think setting /proc/sys/fs/suid_dumpable to 2 is a good idea. It should be set to 1 instead:

PR_SET_DUMPABLE (Since Linux 2.3.20)
Set the state of the flag determining whether core dumps are produced for this process upon delivery of a signal whose default behaviour is to produce a core dump. (Normally this flag is set for a process by default, but it is cleared when a set-user-ID or set-group-ID program is executed and also by various system calls that manipulate process UIDs and GIDs.) In kernels up to and including 2.6.12, arg2 must be either 0 (process is not dumpable) or 1 (process is dumpable). Between kernels 2.6.13 and 2.6.17, the value 2 was also permitted, which caused any binary which normally would not be dumped to be dumped readable by root only; for security reasons, this feature has been removed. (See also the description of /proc/sys/fs/suid_dumpable in proc(5).)

proc(5) doesn't mention that the 2 feature was removed after 2.6.17, but I would rather be conservative about it right now and just set /proc/sys/fs/suid_dumpable to 1.

Also, I noticed that /usr/sbin/named was set to mode 750. I don't know why, but this seems like a bug to me. There is no reason binaries in /usr/sbin shouldn't be mode 755, just like binaries in /usr/bin. I've changed the permissions to 755 due to another comment I read in proc(5): "core dump will not be produced for a process ... whose binary does not have read permission enabled.":

/proc/sys/fs/suid_dumpable (since Linux 2.6.13)
The value in this file determines whether core dump files are produced for set-user-ID or otherwise protected/tainted binaries. Three different integer values can be specified:

0 (default) This provides the traditional (pre-Linux 2.6.13) behaviour. A core dump will not be produced for a process which has changed credentials (by calling seteuid(2), setgid(2), or similar, or by executing a set-user-ID or set-group-ID program) or whose binary does not have read permission enabled.

1 ("debug") All processes dump core when possible. The core dump is owned by the file system user ID of the dumping process and no security is applied. This is intended for system debugging situations only. Ptrace is unchecked.

2 ("suidsafe") Any binary which normally would not be dumped (see "0" above) is dumped readable by root only. This allows the user to remove the core dump file but not to read it. For security reasons core dumps in this mode will not overwrite one another or other files. This mode is appropriate when administrators are attempting to debug problems in a normal environment.
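To check which of these policies a machine is actually using, the sysctl can be read directly from /proc. A small illustrative sketch (my own helper, assuming a Linux system with /proc mounted):

```c
#include <stdio.h>

/* Read /proc/sys/fs/suid_dumpable and return its value
 * (0 = default, 1 = debug, 2 = suidsafe), or -1 on error. */
int get_suid_dumpable(void)
{
    FILE *f = fopen("/proc/sys/fs/suid_dumpable", "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```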
Two slaves have run for five days without a segfault. I still don't know where the problem is...
Has anyone checked to see whether ~!@#@$%%^& SELinux is getting in the way? The directory listed in named.conf is SUPPOSED to be WRITABLE by the user named is running as.
Yes, SELinux has been checked. The code in BIND that sets/drops capabilities is a huge mess; there are bugs lurking in there. I haven't had time to sort it out yet, but I'm pretty sure that's why it won't dump core even with SELinux disabled.
I'm seeing a similar problem on our master system. Thought I'd post some logfile info in case it's of any help...

[root@myserver ~]# grep -B1 segfault /var/log/messages*
/var/log/messages-Jan 9 04:16:42 myserver named[12846]: client 10.1.1.106#35938: updating zone 'mydomain.com/IN': deleting rrset at 'cwestber.mydomain.com' A
/var/log/messages:Jan 9 04:16:42 myserver kernel: named[12847]: segfault at 0000000c eip 001a7996 esp b7ec0db0 error 4
--
/var/log/messages-20071230-Dec 28 09:54:22 myserver named[1947]: client 10.1.1.30#34343: updating zone 'mydomain.com/IN': deleting rrset at 'dgarces.mydomain.com' A
/var/log/messages-20071230:Dec 28 09:54:22 myserver kernel: named[1948]: segfault at 00000034 eip 001a7996 esp b7e4edb0 error 4
--
/var/log/messages-20071230-Dec 28 12:56:08 myserver named[7758]: client 10.1.0.210#54302: updating zone 'mydomain.com/IN': deleting rrset at 'wanderso.mydomain.com' A
/var/log/messages-20071230:Dec 28 12:56:08 myserver kernel: named[7759]: segfault at 0000000c eip 001a7996 esp b7e65db0 error 4

As you can see, the eip value is the same as in the segfaults reported above. I'm running bind-9.5.0-20.b1.fc8. Let me know if there's any more info or help I could provide to help solve this.
(In reply to comment #19) Yes, it looks like named always segfaults at the same point, but I cannot find exactly where. I disassembled named and all the libraries and I still can't find where eip points :( It looks like this approach is unusable for PIE binaries. Also, from the BIND CVS it looks like upstream is doing some work to make named able to generate core files; I'm going to check the situation with the newest codebase. I can do nothing with this before I get a core file. Reading the code without any knowledge of where exactly the problem is doesn't make sense.
I dug into how to get a core dump, and this is the solution:
- change the owner/permissions of the /var/named directory to root:named 775
- allow SELinux to write there (the simplest way is Permissive mode)
- nothing else is needed

It would be best to get a core file from binaries that are compiled with -O0. If possible, please use these: http://koji.fedoraproject.org/koji/taskinfo?taskID=361783 . The core file may be very large, especially on production servers, so you could use the max-cache-size option (for example 32MB) to reduce the number of cached records. Thanks, Adam
Using bind-9.5.0-23.b1.fc8, I changed the CWD to /var/named/data and finally got this core dump:

Core was generated by `/usr/sbin/named -u named'.
Program terminated with signal 11, Segmentation fault.
#0  0x001a7996 in decrement_reference (rbtdb=0xb27193a8, node=0xb32da3f8, least_serial=23075, nlock=isc_rwlocktype_write, tlock=isc_rwlocktype_none) at rbtdb.c:1638
1638    ISC_LIST_APPEND(rbtdb->deadnodes[bucket], node, deadlink);
(gdb) where
#0  0x001a7996 in decrement_reference (rbtdb=0xb27193a8, node=0xb32da3f8, least_serial=23075, nlock=isc_rwlocktype_write, tlock=isc_rwlocktype_none) at rbtdb.c:1638
#1  0x001a9e7f in closeversion (db=0xb27193a8, versionp=0xb0c59044, commit=isc_boolean_false) at rbtdb.c:1927
#2  0x00168353 in dns_db_closeversion (db=0xb27193a8, versionp=0xb0c59044, commit=isc_boolean_false) at db.c:428
#3  0x0018c0aa in dns_dumpctx_detach (dctxp=0xb610a328) at masterdump.c:1024
#4  0x0018c92b in dump_quantum (task=0xb7f24070, event=0x0) at masterdump.c:1158
#5  0x00446a32 in run (uap=0xb7f12008) at task.c:874
#6  0x0048b50b in start_thread () from /lib/libpthread-2.7.so
#7  0x00572b2e in clone () from /lib/libc-2.7.so
A patch for this issue has been committed to upstream CVS.
bind-9.5.0-24.b1.fc8 has been submitted as an update for Fedora 8
The patch from upstream is still not correct; delaying the update.
bind-9.5.0-25.b2.fc8 has been submitted as an update for Fedora 8
bind-9.5.0-25.b2.fc8 has been pushed to the Fedora 8 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update bind'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F8/FEDORA-2008-1622
bind-9.5.0-25.b2.fc8 works much better for me.
I've had MULTIPLE crashes since upgrading to bind-9.5.0-25.b2.fc8.

: /usr/sbin/named -v
BIND 9.5.0b2
: rpm -q bind
bind-9.5.0-25.b2.fc8

I was running a patched version compiled from source and had no problems until switching back to the RPM.
I just re-confirmed that the version I posted above is still crashing. In fact it's crashing MORE often than the previous RPM. I'm going to have to switch back to my source compiled version for stability.
(In reply to comment #30)
> I just re-confirmed that the version I posted above is still crashing. In fact
> it's crashing MORE often than the previous RPM. I'm going to have to switch back
> to my source compiled version for stability.

Would it be possible to provide more info, please? In which situations does named crash? Is it related to zone transfers? Would it be possible to get a core file (or at least a backtrace)? Thanks
Sorry, I should have been more explicit. Yes, it occurs during (or immediately after) a zone transfer. I have a specific zone that gets transferred once every couple of seconds (DDNS updates), and it eventually causes the crash. I can post the core file somewhere if you think it will help; it's about 100 megs, though. Running gdb on it I don't get much info except that it was a segfault:

Thread 7 (process 24285):
#0  0x0012d410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x004c9c07 in sigsuspend () from /lib/libc.so.6
No symbol table info available.
#2  0x0044beb2 in isc_app_run () from /usr/lib/libisc.so.41
No symbol table info available.
#3  0xb7f2026a in main () from /usr/sbin/named
No symbol table info available.

Lots of "No symbol table info available" in the gdb output. Willing to help with whatever; just advise me what the next step is.
I have to retract my previous statements about this version. I've had a couple of crashes recently. Here is a backtrace:

(gdb) thread apply all bt

Thread 7 (process 18856):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x004c9c07 in sigsuspend () from /lib/libc.so.6
#2  0x0044beb2 in isc_app_run () at app.c:533
#3  0xb7f8d26a in main (argc=0, argv=0xbfe0cb84) at ./main.c:878

Thread 6 (process 18857):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x004905d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00447bef in run (uap=0xb7ef7008) at task.c:822
#3  0x0048c50b in start_thread () from /lib/libpthread.so.0
#4  0x00573b2e in clone () from /lib/libc.so.6

Thread 5 (process 18858):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x00493119 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0x0048e89e in _L_lock_88 () from /lib/libpthread.so.0
#3  0x0048e3aa in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x00458c25 in internal_recv (me=0xb24fe688, ev=0xb1dd9b70) at socket.c:2240
#5  0x00447d72 in run (uap=0xb7ef7008) at task.c:874
#6  0x0048c50b in start_thread () from /lib/libpthread.so.0
#7  0x00573b2e in clone () from /lib/libc.so.6

Thread 4 (process 18859):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x004905d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00447bef in run (uap=0xb7ef7008) at task.c:822
#3  0x0048c50b in start_thread () from /lib/libpthread.so.0
#4  0x00573b2e in clone () from /lib/libc.so.6

Thread 3 (process 18860):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x004905d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00447bef in run (uap=0xb7ef7008) at task.c:822
#3  0x0048c50b in start_thread () from /lib/libpthread.so.0
#4  0x00573b2e in clone () from /lib/libc.so.6

Thread 2 (process 18861):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x00490902 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x0045de0c in isc_condition_waituntil (c=0xb7ef8040, m=0xb7ef8010, t=0xb7ef8038) at condition.c:59
#3  0x0044a510 in run (uap=0xb7ef8008) at timer.c:719
#4  0x0048c50b in start_thread () from /lib/libpthread.so.0
#5  0x00573b2e in clone () from /lib/libc.so.6

Thread 1 (process 18862):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x004c9690 in raise () from /lib/libc.so.6
#2  0x004caf91 in abort () from /lib/libc.so.6
#3  0xb7f8ca8c in assertion_failed (file=0x465b3e "socket.c", line=1839, type=isc_assertiontype_insist, cond=0x465b87 "!sock->pending_recv") at ./main.c:159
#4  0x0045bf20 in watcher (uap=0xb8210238) at socket.c:1839
#5  0x0048c50b in start_thread () from /lib/libpthread.so.0
#6  0x00573b2e in clone () from /lib/libc.so.6
(In reply to comment #32)
> Lots of "No symbol table info available" in the gdb output. Willing to help with
> whatever, just advise me what the next step is.

You have to install the bind-debuginfo package, which contains the debug symbols (make sure you have the same versions of bind and bind-debuginfo). With debuginfo installed you can get usable debugging information.
(In reply to comment #33)
> I have to retract my previous statements about this version. I've had a couple
> crashes recently. Here is a backtrace:

Is the backtrace from a master or a slave, and could it be related to zone transfers? It looks to me like a different problem than the one fixed in 25.b2.
It is from the master. I don't know what it might be related to. I don't see the same "segfault" log entry in /var/log/messages at the time the core file was generated, but I do see these messages from that time:

Feb 28 11:07:03 master named[18856]: client w.x.y.z#62589: view global: signer "foo.dhcp" approved
Feb 28 11:07:03 master named[18856]: zone foo.domain/IN/global: sending notifies (serial 2010784390)
Feb 28 11:07:03 master named[18856]: client w.x.y.x#62590: view global: signer "x.w.in-addr.arpa.dhcp" approved
Feb 28 11:07:03 master named[18856]: zone x.w.IN-ADDR.ARPA/IN/global: sending notifies (serial 2010693532)
Feb 28 11:07:03 master named[18856]: socket.c:1839: INSIST(!sock->pending_recv) failed
Feb 28 11:07:03 master named[18856]: exiting (due to assertion failure)
(In reply to comment #36) This is a different issue than the original segfault. Would it be possible to test http://koji.fedoraproject.org/koji/taskinfo?taskID=570357, please? The problem should be fixed there. Thanks
Ok, I've loaded that build and will be watching it over the next week to see if it solves the problem. Thanks.
(In reply to comment #38) Thank you for your help with testing
bind-9.5.0-26.b3.fc8 has been submitted as an update for Fedora 8
The patch for the issue described in comment #36 seems right (one week without a crash), so let's push it to updates-testing.
Hm, the update system didn't write its message here, so I'm adding it manually: bind-9.5.0-26.b3.fc8 has been pushed to the Fedora 8 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update bind'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F8/FEDORA-2008-3402
I've had no crashes since bind-9.5.0-25.3.b2.fc8, and bind-9.5.0-26.b3.fc8 has been running fine as well since April 30. +1 from me for pushing this to updates.
bind-9.5.0-26.b3.fc8 has been pushed to the Fedora 8 stable repository. If problems still persist, please make note of it in this bug report.
I updated today to bind-9.5.0-26.b3.fc8 and had 5 crashes:

May 12 14:40:11 server named[2665]: notify: info: client xxx.xxx.xxx.xxx#1035: received notify for zone 'aaa.bb'
May 12 14:41:15 server named[2665]: last message repeated 4 times
May 12 14:42:16 server named[2665]: last message repeated 4 times
May 12 14:43:21 server named[2665]: last message repeated 3 times
May 12 14:44:32 server named[2665]: last message repeated 5 times
May 12 14:45:43 server named[2665]: last message repeated 4 times
May 12 14:46:41 server named[2665]: last message repeated 2 times
May 12 14:46:41 server kernel: named[2667]: segfault at 00000000 eip 0015b1c7 esp b74dbf50 error 4

It's always like this: after the notify for that specific zone, I get the segfault. I run a slave for this domain, but I have plenty of master zones. I downgraded to bind-9.5.0-23.b1.fc8 (which was quite difficult for me to find... :-/). I didn't get a segfault on the 5 other servers where I run bind, but I downgraded there as well, just in case.

The version is Fedora Core 8, 2.6.24.5-85.fc8, i686, 2 x Pentium(R) 4 3.40GHz. I don't have a core dump.

Regards,
Giannis
Just for the record, the remote server was: VERSION.BIND text = "8.2.2-P5" :(( It's quite old and I cannot do anything about it...
Without a core dump I can do nothing. Would it be possible to get one, please? (A backtrace is sufficient.)
- make /var/named writable by the named user (chmod/chown + SELinux set correctly); you should not see the message "the working directory is not writable" in the log
- wait for a crash
- run "gdb /usr/sbin/named <core_file>" (make sure you have bind-debuginfo installed)
- attach the output when you type "t a a bt"
Thanks
I have the same issue with Fedora 8, bind-9.5.0-26.b3.fc8, and bind-9.5.0-29.b2.fc8 (Fedora 9 srpm built on Fedora 8). I have the following installed:

bind-9.5.0-29.b2.fc8
bind-chroot-9.5.0-29.b2.fc8
bind-debuginfo-9.5.0-29.b2.fc8
bind-devel-9.5.0-29.b2.fc8
bind-libs-9.5.0-29.b2.fc8
bind-utils-9.5.0-29.b2.fc8

Log message:

kernel: named[29341]: segfault at 00000000 eip 0015ae27 esp b7f280c0 error 4

GDB output on the core file:

GNU gdb Red Hat Linux (6.6-45.fc8rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
Using host libthread_db library "/lib/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Loaded symbols for /usr/sbin/named
Reading symbols from /usr/lib/liblwres.so.40.0.0...Reading symbols from /usr/lib/debug/usr/lib/liblwres.so.40.0.0.debug...done. done.
Loaded symbols for /usr/lib/liblwres.so.40.0.0
Reading symbols from /usr/lib/libdns.so.41.0.0...Reading symbols from /usr/lib/debug/usr/lib/libdns.so.41.0.0.debug...done. done.
Loaded symbols for /usr/lib/libdns.so.41.0.0
Reading symbols from /usr/lib/libbind9.so.40.0.1...Reading symbols from /usr/lib/debug/usr/lib/libbind9.so.40.0.1.debug...done. done.
Loaded symbols for /usr/lib/libbind9.so.40.0.1
Reading symbols from /usr/lib/libisccfg.so.40.0.1...Reading symbols from /usr/lib/debug/usr/lib/libisccfg.so.40.0.1.debug...done. done.
Loaded symbols for /usr/lib/libisccfg.so.40.0.1
Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /lib/libcrypto.so.6...done.
Loaded symbols for /lib/libcrypto.so.6
Reading symbols from /usr/lib/libisccc.so.40.0.0...Reading symbols from /usr/lib/debug/usr/lib/libisccc.so.40.0.0.debug...done. done.
Loaded symbols for /usr/lib/libisccc.so.40.0.0
Reading symbols from /usr/lib/libisc.so.41.0.0...Reading symbols from /usr/lib/debug/usr/lib/libisc.so.41.0.0.debug...done. done.
Loaded symbols for /usr/lib/libisc.so.41.0.0
Reading symbols from /lib/libcap.so.1...done.
Loaded symbols for /lib/libcap.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libpthread.so.0...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /usr/lib/libxml2.so.2...done.
Loaded symbols for /usr/lib/libxml2.so.2
Reading symbols from /lib/libz.so.1...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /usr/lib/libkrb5.so.3...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /usr/lib/libk5crypto.so.3...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /lib/libcom_err.so.2...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libkrb5support.so.0...done.
Loaded symbols for /usr/lib/libkrb5support.so.0
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libkeyutils.so.1...done.
Loaded symbols for /lib/libkeyutils.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libselinux.so.1...done.
Loaded symbols for /lib/libselinux.so.1
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Core was generated by `/usr/sbin/named -u named -t /var/named/chroot'.
Program terminated with signal 11, Segmentation fault.
#0  0x0015ae27 in dns_acl_match (reqaddr=0xb7e421d4, reqsigner=0x0, acl=0xb6025248, env=0xb7e500a8, match=0xb7e421d0, matchelt=0x0) at acl.c:226
226     if (*(isc_boolean_t *) node->data[ISC_IS6(family)] == ISC_TRUE)
(gdb) quit
And the backtrace (oops):

(gdb) t a a bt

Thread 4 (process 31153):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x0063fc07 in sigsuspend () from /lib/libc.so.6
#2  0x0044c962 in isc_app_run () at app.c:533
#3  0xb7f9496a in main (argc=0, argv=0xbfa3e2c4) at ./main.c:878

Thread 3 (process 31155):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x00494902 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x0045eddc in isc_condition_waituntil (c=0xb7f1a040, m=0xb7f1a010, t=0xb7f1a038) at condition.c:59
#3  0x0044afc0 in run (uap=0xb7f1a008) at timer.c:719
#4  0x0049050b in start_thread () from /lib/libpthread.so.0
#5  0x006e9b2e in clone () from /lib/libc.so.6

Thread 2 (process 31156):
#0  0x0012d402 in __kernel_vsyscall ()
#1  0x006e25f1 in select () from /lib/libc.so.6
#2  0x0045c42d in watcher (uap=0xb8c10128) at socket.c:2538
#3  0x0049050b in start_thread () from /lib/libpthread.so.0
#4  0x006e9b2e in clone () from /lib/libc.so.6

Thread 1 (process 31154):
#0  0x0015ae27 in dns_acl_match (reqaddr=0xb7f101d4, reqsigner=0x0, acl=0xb60f3248, env=0xb7f1e0a8, match=0xb7f101d0, matchelt=0x0) at acl.c:226
#1  0xb7f84f39 in ns_client_checkaclsilent (client=0xb5e04008, sockaddr=0x0, acl=0xb60f3248, default_allow=isc_boolean_true) at client.c:2499
#2  0xb7f95ed7 in query_validatezonedb (client=0xb5e04008, name=0xb5e0b008, qtype=15, options=0, zone=0xb8daa4e0, db=0xb5ac71d8, versionp=0xb7f10a0c) at query.c:643
#3  0xb7f9641e in query_getzonedb (client=0xb5e04008, name=0xb5e0b008, qtype=15, options=0, zonep=0xb7f10a08, dbp=0xb7f10a44, versionp=0xb7f10a0c) at query.c:742
#4  0xb7f96487 in query_getdb (client=0xb5e04008, name=0xb5e0b008, qtype=<value optimized out>, options=0, zonep=0xb7f10a08, dbp=0xb7f10a44, versionp=0xb7f10a0c, is_zonep=0xb7f10a18) at query.c:940
#5  0xb7f9a469 in query_find (client=0xb5e04008, event=0x0, qtype=15) at query.c:3456
#6  0xb7f9d9aa in ns_query_start (client=0xb5e04008) at query.c:4617
#7  0xb7f8a80f in client_request (task=0xb7f2aa98, event=0xb5e080f8) at client.c:1783
#8  0x00448822 in run (uap=0xb7f18008) at task.c:874
#9  0x0049050b in start_thread () from /lib/libpthread.so.0
#10 0x006e9b2e in clone () from /lib/libc.so.6

Missing separate debuginfos, use: debuginfo-install e2fsprogs.i386 glibc.i686 keyutils.i386 krb5.i386 libcap.i386 libselinux.i386 libxml2.i386 openssl.i686 zlib.i386
The problem with 9.5.0-26.b3 is tracked as bug #446848. I have added all affected people to the CC list of that bug.