Bug 878139 - [abrt] bind-utils-9.9.2-2.fc17: next_origin: Process /usr/bin/nslookup was killed by signal 11 (SIGSEGV)
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: bind
Version: 17
Hardware: x86_64 Unspecified
Priority: unspecified Severity: unspecified
Assigned To: Tomas Hozza
QA Contact: Fedora Extras Quality Assurance
Whiteboard: abrt_hash:06e62dadd1325d9a82c7515b3bc...
Duplicates: 919637 919710
Depends On:
Blocks:
 
Reported: 2012-11-19 12:53 EST by Anatolii Vorona
Modified: 2013-06-17 21:36 EDT

See Also:
Fixed In Version: dhcp-4.2.5-2.fc17
Doc Type: Bug Fix
Last Closed: 2013-06-17 21:29:11 EDT


Attachments
File: core_backtrace (568 bytes, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: environ (2.89 KB, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: backtrace (11.17 KB, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: limits (1.29 KB, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: cgroup (128 bytes, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: smolt_data (3.25 KB, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: executable (17 bytes, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: maps (11.52 KB, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: dso_list (2.36 KB, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: proc_pid_status (923 bytes, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: var_log_messages (303 bytes, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona
File: open_fds (217 bytes, text/plain)
2012-11-19 12:53 EST, Anatolii Vorona

Description Anatolii Vorona 2012-11-19 12:53:29 EST
Description of problem:

#!/bin/bash
# Look up every three-letter combination of $alphabet followed by $dzone.
alphabet="d s c r w k l m"
dzone="7.net"
for x in $alphabet; do
    for y in $alphabet; do
        for z in $alphabet; do
            echo "$x$y$z$dzone $(nslookup "$x$y$z$dzone" | grep '*\|Address: ') "
        done
    done
done


Version-Release number of selected component:
bind-utils-9.9.2-2.fc17

Additional info:
libreport version: 2.0.18
abrt_version:   2.0.18
backtrace_rating: 4
cmdline:        nslookup ckk7.net
crash_function: next_origin
kernel:         3.6.6-1.fc17.x86_64

truncated backtrace:
:Thread no. 1 (4 frames)
: #0 next_origin at dighost.c:1914
: #1 connect_timeout at dighost.c:2712
: #2 dispatch at task.c:1116
: #3 run at task.c:1286
Comment 1 Anatolii Vorona 2012-11-19 12:53:32 EST
Created attachment 647905
File: core_backtrace
Comment 2 Anatolii Vorona 2012-11-19 12:53:34 EST
Created attachment 647906
File: environ
Comment 3 Anatolii Vorona 2012-11-19 12:53:36 EST
Created attachment 647907
File: backtrace
Comment 4 Anatolii Vorona 2012-11-19 12:53:38 EST
Created attachment 647908
File: limits
Comment 5 Anatolii Vorona 2012-11-19 12:53:40 EST
Created attachment 647909
File: cgroup
Comment 6 Anatolii Vorona 2012-11-19 12:53:42 EST
Created attachment 647910
File: smolt_data
Comment 7 Anatolii Vorona 2012-11-19 12:53:44 EST
Created attachment 647911
File: executable
Comment 8 Anatolii Vorona 2012-11-19 12:53:47 EST
Created attachment 647912
File: maps
Comment 9 Anatolii Vorona 2012-11-19 12:53:49 EST
Created attachment 647913
File: dso_list
Comment 10 Anatolii Vorona 2012-11-19 12:53:51 EST
Created attachment 647914
File: proc_pid_status
Comment 11 Anatolii Vorona 2012-11-19 12:53:53 EST
Created attachment 647915
File: var_log_messages
Comment 12 Anatolii Vorona 2012-11-19 12:53:55 EST
Created attachment 647916
File: open_fds
Comment 13 Fedora Admin XMLRPC Client 2013-04-25 07:38:04 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 14 Tomas Hozza 2013-05-15 14:56:45 EDT
*** Bug 919637 has been marked as a duplicate of this bug. ***
Comment 15 Tomas Hozza 2013-05-15 15:01:34 EDT
*** Bug 919710 has been marked as a duplicate of this bug. ***
Comment 16 Tomas Hozza 2013-05-17 03:06:51 EDT
The issue is caused by a mistake in the patch added some time ago [1].

It appears that timing is critical for this issue to occur. host/nslookup sends
a UDP DNS query and starts a timer whose handler is "connect_timeout()". When
an answer is received, "recv_done()" is called, and at its end it calls
"clear_query(query)". In some circumstances the "query" passed is
lookup->current_query, in which case lookup->current_query is freed and set
to NULL. It gets interesting when the timeout then expires and
"connect_timeout()" is called.

<snip from connect_timeout()>
...
l = event->ev_arg;
query = l->current_query;  /* this is NULL */
...
<snip>
...
} else {
		fputs(l->cmdline, stdout);
		if (!next_origin(query)) {    /* <- query is NULL */
			printf(";; connection timed out; no servers could be "
			       "reached\n");
		} else {
			printf(";; connection timed out; trying next "
			       "origin\n");
		}
...

But there are situations in which the timeout handler is called before
current_query is freed and set to NULL, and then it works.

The lookup structure is protected by a mutex. So it looks like the issue depends
on how the system schedules threads and which thread locks the lookup structure
first.
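
The ordering dependence described above can be illustrated with a small single-threaded model. This is only a sketch: the struct and function names below (model_lookup, model_query, model_recv_done, model_timeout_sees_query, model_run) are hypothetical stand-ins, not the real dighost.c types or functions, and the mutex and task scheduling are reduced to a flag that chooses which event runs first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the dig structures (not the BIND types). */
struct model_lookup;

struct model_query {
    struct model_lookup *lookup;        /* back-pointer to the parent lookup */
};

struct model_lookup {
    struct model_query *current_query;  /* cleared by the receive path */
};

/* Models the end of recv_done(): clearing the current query resets the pointer. */
static void model_recv_done(struct model_lookup *l, struct model_query *q)
{
    if (l->current_query == q)
        l->current_query = NULL;
}

/* Models the start of connect_timeout(): does the handler still see a query? */
static bool model_timeout_sees_query(const struct model_lookup *l)
{
    return l->current_query != NULL;
}

/* Runs both events in the chosen order and returns what the timeout handler saw. */
static bool model_run(bool answer_first)
{
    struct model_lookup l = { NULL };
    struct model_query q = { &l };
    bool seen;

    assert(q.lookup == &l);
    l.current_query = &q;

    if (answer_first) {
        model_recv_done(&l, &q);              /* answer handled first */
        seen = model_timeout_sees_query(&l);  /* handler finds NULL */
    } else {
        seen = model_timeout_sees_query(&l);  /* handler finds the query */
        model_recv_done(&l, &q);
    }
    return seen;
}
```

Calling model_run(false) models the ordering that works, and model_run(true) models the ordering in which the handler is left holding a NULL query.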

But this is expected behaviour (I think), since "connect_timeout()" checks
whether query (current_query) is NULL. Later on, when retrying to send the
queries once more, ISC_LIST_HEAD(l->q) is used instead of current_query.
l->q is a list of queries, and when it is empty the whole lookup structure
is destroyed, along with its timer if there is one. So there should not be
a situation in which the timeout handler is called and the queries list is empty.

From what I tested, using "ISC_LIST_HEAD(l->q)" instead of "query" when calling
next_origin() works well. It is also good to mention that for the next_origin()
function it is irrelevant with which query it is called. The parameter is used
only to get to the "parent" lookup structure pointer which is the same for all
queries in the list.
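
A rough sketch of that idea follows, again with hypothetical names (fix_lookup, fix_query, fix_next_origin, fix_timeout_next_origin, fix_demo) rather than the real BIND code. The head field stands in for ISC_LIST_HEAD(l->q), origins_left is an invented counter for the remaining search origins, and the NULL guard on the list head is an extra defensive assumption that is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the dig structures (not the BIND types). */
struct fix_lookup;

struct fix_query {
    struct fix_lookup *lookup;  /* same parent for every query in the list */
    struct fix_query *next;
};

struct fix_lookup {
    struct fix_query *current_query;  /* may already be NULL at timeout */
    struct fix_query *head;           /* stands in for ISC_LIST_HEAD(l->q) */
    int origins_left;                 /* invented: remaining search origins */
};

/* Models next_origin(): the query is used only to reach the parent lookup. */
static int fix_next_origin(struct fix_query *query)
{
    struct fix_lookup *l;

    assert(query != NULL);  /* the crashing call passed NULL here */
    l = query->lookup;
    if (l->origins_left == 0)
        return 0;           /* nothing left to try */
    l->origins_left--;
    return 1;               /* moved on to the next origin */
}

/* Timeout path using the list head instead of current_query. */
static int fix_timeout_next_origin(struct fix_lookup *l)
{
    struct fix_query *query = l->head;

    if (query == NULL)      /* defensive assumption, see lead-in */
        return 0;
    return fix_next_origin(query);
}

/* Builds a lookup whose current_query was already cleared, then times out. */
static int fix_demo(int origins)
{
    struct fix_lookup l = { NULL, NULL, 0 };
    struct fix_query q1 = { &l, NULL };
    struct fix_query q2 = { &l, NULL };

    q1.next = &q2;
    l.head = &q1;
    l.current_query = NULL;  /* as after clear_query() on the current query */
    l.origins_left = origins;
    return fix_timeout_next_origin(&l);
}
```

Because every query in the list points at the same lookup, it does not matter which one is handed to the next_origin stand-in; the head is simply the one that is known to exist while the list is non-empty.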

[1] http://lists.fedoraproject.org/pipermail/scm-commits/2011-October/677202.html
Comment 17 Tomas Hozza 2013-05-17 04:20:52 EDT
Fixed in:
bind-9.9.3-0.7.rc2.fc20
bind-9.9.3-0.7.rc2.fc19
bind-9.9.2-12.P2.fc18
bind-9.9.2-8.P2.fc17
Comment 18 Ville Skyttä 2013-05-22 04:22:32 EDT
(In reply to Tomas Hozza from comment #17)
> bind-9.9.2-12.P2.fc18

It seems that at least for this, only a koji build exists but no update has been submitted, is that on purpose?
Comment 19 Tomas Hozza 2013-05-22 05:52:38 EDT
(In reply to Ville Skyttä from comment #18)
> (In reply to Tomas Hozza from comment #17)
> > bind-9.9.2-12.P2.fc18
> 
> It seems that at least for this, only a koji build exists but no update has
> been submitted, is that on purpose?

This is true. I'm waiting for bind-9.9.3 to be released to push an update in
bodhi. Currently there is 9.9.3rc2. So this is intentional and therefore the
Bug status is MODIFIED and not ON_QA.
Comment 20 Fedora Update System 2013-06-03 15:48:42 EDT
bind-dyndb-ldap-2.6-2.fc18,dnsperf-2.0.0.0-4.fc18,dhcp-4.2.5-12.fc18,bind-9.9.3-2.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/bind-dyndb-ldap-2.6-2.fc18,dnsperf-2.0.0.0-4.fc18,dhcp-4.2.5-12.fc18,bind-9.9.3-2.fc18
Comment 21 Fedora Update System 2013-06-03 15:52:19 EDT
dhcp-4.2.5-2.fc17,dnsperf-2.0.0.0-3.fc17,bind-dyndb-ldap-2.5-2.fc17,bind-9.9.3-2.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/dhcp-4.2.5-2.fc17,dnsperf-2.0.0.0-3.fc17,bind-dyndb-ldap-2.5-2.fc17,bind-9.9.3-2.fc17
Comment 22 Fedora Update System 2013-06-05 21:29:01 EDT
Package dhcp-4.2.5-2.fc17, dnsperf-2.0.0.0-3.fc17, bind-dyndb-ldap-2.5-2.fc17, bind-9.9.3-3.P1.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing dhcp-4.2.5-2.fc17 dnsperf-2.0.0.0-3.fc17 bind-dyndb-ldap-2.5-2.fc17 bind-9.9.3-3.P1.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-10100/dhcp-4.2.5-2.fc17,dnsperf-2.0.0.0-3.fc17,bind-dyndb-ldap-2.5-2.fc17,bind-9.9.3-3.P1.fc17
then log in and leave karma (feedback).
Comment 23 Fedora Update System 2013-06-17 21:29:11 EDT
bind-dyndb-ldap-2.6-2.fc18, dnsperf-2.0.0.0-4.fc18, dhcp-4.2.5-12.fc18, bind-9.9.3-3.P1.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 24 Fedora Update System 2013-06-17 21:36:32 EDT
dhcp-4.2.5-2.fc17, dnsperf-2.0.0.0-3.fc17, bind-dyndb-ldap-2.5-2.fc17, bind-9.9.3-3.P1.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
