Bug 517265

Summary: Critical Bind sdb Problem (segfault and unstable) with version 9.5.1-P3 on redhat5.3
Product: Red Hat Enterprise Linux 5 Reporter: DCLUX sysadmin <sysadmin>
Component: bind Assignee: Adam Tkac <atkac>
Status: CLOSED CANTFIX QA Contact: BaseOS QE <qe-baseos-auto>
Severity: urgent Docs Contact:
Priority: low    
Version: 5.3 CC: ovasik
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-03-30 12:05:30 UTC
Attachments:
Description Flags
SRPM used to compile bind on redhat 5 none

Description DCLUX sysadmin 2009-08-13 10:25:08 UTC
Description of problem:

We are having trouble with one of our servers running bind-9.5.1-3.P3.
We took the SRPM that atkac uploaded to the old ISC website.

We have updated all our NS and DNS servers with bind and bind-sdb compiled from this SRPM, but one of our name servers cannot handle the load placed on it.

This name server is a fresh install of 5.3, unlike the others; we see this problem only on freshly installed name servers. The older name servers were installed as 5.0 and then updated to 5.3.

All our name servers run on Xen, but we have also tested with a physical server and the problem is the same.

We get a lot of messages like this:
mysql driver unable to return result set for findzone query

We have checked the MySQL server: the connection, the MySQL user and everything else on the MySQL side seem to be OK.
We enabled debug level 3 logging on named and see nothing returned from MySQL, and the MySQL query log shows no queries.

Sometimes the named_sdb process also crashes, either with an error in the mem.c file or with a segfault. We have some core files.

Version-Release number of selected component (if applicable):

bind-9.5.1-3.P3
bind-sdb-9.5.1-3.P3

How reproducible:

Install a new Xen or physical server with 5.3
Install the packages bind, bind-sdb and bind-utils
Enable sdb in the named.conf file


Steps to Reproduce:
1. Install a new Xen or physical server
2. Install the packages bind, bind-sdb and bind-utils
3. Enable sdb in the named.conf file
4. Send a large number of queries to the named server for zones stored in the database.
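
For anyone trying to reproduce step 4, here is a minimal sketch of a query generator. It is not the tool used by the reporter. It uses only the Python standard library; the function names `build_query` and `flood` and the zone names in the usage note are hypothetical.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Encode a minimal DNS query (RFC 1035 wire format). qtype 1 is an A record."""
    # Header: ID, flags (all zero: standard query, no recursion desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # Each label is a length byte followed by the label text; a zero byte ends the name.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


def flood(server, names, count, timeout=1.0):
    """Send `count` UDP queries, cycling over `names`; count answers and timeouts."""
    stats = {"answered": 0, "timeout": 0}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for i in range(count):
            packet = build_query(names[i % len(names)], query_id=i & 0xFFFF)
            sock.sendto(packet, (server, 53))
            try:
                sock.recvfrom(512)
                stats["answered"] += 1
            except socket.timeout:
                stats["timeout"] += 1
    return stats
```

A call such as `flood("127.0.0.1", ["zone1.example", "zone2.example"], 1000)` would send 1000 queries one after another. The sketch only counts replies and timeouts; it does not parse the answers.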
  
Actual results:

The first queries are answered, but after 5 or 10 queries we get "mysql driver unable to return result set for findzone query".


Expected results:

Queries for zones stored in the database are answered correctly, without driver errors.

Additional info:

Our old version of bind, 9.4.2, is very stable on our platform.
We have tried reinstalling the server and we get the same problem.

best regards

and thanks a lot for your help

Comment 1 DCLUX sysadmin 2009-08-13 11:01:53 UTC
My config file :

acl "bogon" {
    // Filter out the bogon networks.  These are networks
    // listed by IANA as test, RFC1918, Multicast, experi-
    // mental, etc.  If you see DNS queries or updates with
    // a source address within these networks, this is likely
    // of malicious origin. CAUTION: If you are using RFC1918
    // netblocks on your network, remove those netblocks from
    // this list of blackhole ACLs!
    0.0.0.0/8;
    1.0.0.0/8;
    2.0.0.0/8;
    5.0.0.0/8;
    10.0.0.0/8;
    14.0.0.0/8;
    23.0.0.0/8;
    27.0.0.0/8;
    31.0.0.0/8;
    36.0.0.0/8;
    37.0.0.0/8;
    39.0.0.0/8;
    42.0.0.0/8;
    46.0.0.0/8;
    49.0.0.0/8;
    50.0.0.0/8;
    100.0.0.0/8;
    101.0.0.0/8;
    102.0.0.0/8;
    103.0.0.0/8;
    104.0.0.0/8;
    105.0.0.0/8;
    106.0.0.0/8;
    107.0.0.0/8;
    108.0.0.0/8;
    109.0.0.0/8;
    110.0.0.0/8;
    111.0.0.0/8;
    169.254.0.0/16;
    172.16.0.0/12;
    175.0.0.0/8;
    176.0.0.0/8;
    177.0.0.0/8;
    178.0.0.0/8;
    179.0.0.0/8;
    180.0.0.0/8;
    181.0.0.0/8;
    182.0.0.0/8;
    183.0.0.0/8;
    184.0.0.0/8;
    185.0.0.0/8;
    192.0.2.0/24;
//    192.168.0.0/16;
    198.18.0.0/15;
    223.0.0.0/8;
    224.0.0.0/3;
}; 

options
{
        directory "/var/named";
	statistics-file "/var/log/named/named.stats";
	zone-statistics yes;
	
	// Generate more efficient zone transfers.  This will place
	// multiple DNS records in a DNS message, instead of one per
	// DNS message.
	transfer-format many-answers;

	// Set the maximum zone transfer time to something more
	// reasonable.  In this case, we state that any zone transfer
	// that takes longer than 60 minutes is unlikely to ever
	// complete.  WARNING:  If you have very large zone files,
	// adjust this to fit your requirements.
	max-transfer-time-in 60;

	// We have no dynamic interfaces, so BIND shouldn't need to
	// poll for interface state {UP|DOWN}.
	interface-interval 0;

	blackhole {
		// Deny anything from the bogon networks as
		// detailed in the "bogon" ACL.
		bogon;
	};

	listen-on { any; };
	listen-on-v6 { none; };

        allow-query { any; };

	tcp-clients 100;
	recursive-clients 1000;

	// dig version.bind txt chaos
	version "Secured";

	// memory usage limitation
	//max-cache-size 256m; 
        //datasize 300m;

        recursion no;
        additional-from-auth no;
        additional-from-cache no;
};

key "rndc-key" {
        algorithm hmac-md5;
	secret "*********************";
};

controls {
	inet 127.0.0.1 port 954	allow { 127.0.0.1; } keys {"rndc-key";};
};

logging {
	channel default_syslog {
		// Send most of the named messages to syslog.
		syslog local2;
		severity info; 
	};

	channel audit_log {

		// Send the security related messages to a separate file.
		file "data/audit.log" versions 3 size 5m;
		severity debug 10;
		print-time yes; 
		print-category yes;
		print-severity no;
	}; 

	category default { default_syslog; audit_log;};
	category general { default_syslog; audit_log;};
	category security { default_syslog; audit_log;};
	category config { default_syslog; audit_log;};
	category resolver { audit_log; };
	category lame-servers { null; };
};


view "localview" {
        match-clients { 127.0.0.1; *.*.*.98; *.*.*.39; *.*.*.40; *.*.*.155; };

	dlz "Mysql zone" {
		database "mysql
		{host=localhost dbname=dbname user=dbuser pass=dbpass}
		{select distinct zone from dns_records where zone = if((select count(*) from dns_records where type='SOA' and zone=concat('local_', '%zone%')), concat('local_', '%zone%'),'%zone%')}
		{select ttl, type, mx_priority, 
			case 
				when lower(type)='txt' then concat('\"', data, '\"')
				else data
			end 
			from dns_records
			inner join dns_zone ON dns_records.dns_zoneID = dns_zone.ID
			where zone = if((select count(*) from dns_records where type='SOA' and zone=concat('local_', '%zone%')), concat('local_', '%zone%'),'%zone%') and host = '%record%'
				and not (type = 'SOA' or type = 'NS')}
		{select ttl, type, mx_priority, data, resp_person, serial, refresh, retry, expire, minimum
			from dns_records where zone = if((select count(*) from dns_records where type='SOA' and zone=concat('local_', '%zone%')), concat('local_', '%zone%'),'%zone%') and (type = 'SOA' or type='NS')}
		{select ttl, type, host, mx_priority, if(lower(type)='txt',concat('\"', data, '\"'),data), resp_person, serial, refresh, retry, expire,
			minimum from dns_records where zone = if((select count(*) from dns_records where type='SOA' and zone=concat('local_', '%zone%')), concat('local_', '%zone%'),'%zone%') and not (type='SOA')}
		{select zone from xfr_table where zone = if((select count(*) from dns_records where type='SOA' and zone=concat('local_', '%zone%')), concat('local_', '%zone%'),'%zone%') and client = '%client%'}";
	};

	include "/etc/bind/commonzones.conf";
	include "/etc/bind/forwarders.conf";
};

view "default" {
        match-clients { "any"; };

        include "/etc/bind/dlzzone.conf";
        include "/etc/bind/commonzones.conf";
	include "/etc/bind/forwarders.conf";
};


File /etc/bind/dlz.conf

        dlz "Mysql zone" {
                database "mysql
                {host=localhost dbname=dbname user=dbuser pass=dbpass}
                {select distinct zone from dns_records where zone = '%zone%' and substring(zone,1,6) <> 'local_'}
                {select ttl, type, mx_priority,
                        case
				when lower(type)='txt' then concat('\"', data, '\"')
				when lower(type)='a' and dns_zone.disabled is not null then '80.92.66.130'
                                when lower(type)='mx' and dns_zone.disabled is not null then 'disabled.eurodns.com.'
                                when lower(type)='cname' and dns_zone.disabled is not null then 'disabled.eurodns.com.'
                                else data
                        end
                        from dns_records
                        inner join dns_zone ON dns_records.dns_zoneID = dns_zone.ID
                        where zone = '%zone%' and host = '%record%'
                                and not (type = 'SOA' or type = 'NS')}
                {select ttl, type, mx_priority, data, resp_person, serial, refresh, retry, expire, minimum
                        from dns_records where zone = '%zone%' and (type = 'SOA' or type='NS')}
                {select ttl, type, host, mx_priority, if(lower(type)='txt',concat('\"', data, '\"'),data), resp_person, serial, refresh, retry, expire,
                        minimum from dns_records where zone = '%zone%' and not (type='SOA') AND dns_zoneID>0}
                {select '%zone%' from xfr_table where (zone = '%zone%' and client = '%client%')
			or ('%client%' in ('*.*.*.105','*.*.*.108','*.*.*.120') and '%zone%' REGEXP '\.vn$')}";
        };
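
To make the difference between the two findzone queries above easier to follow, here is a rough sketch of their selection logic in plain Python. It is an illustration only, not the DLZ driver's code. The `records` list stands in for rows of the `dns_records` table, the function names are hypothetical, and the sketch ignores MySQL's collation rules (it compares strings exactly).

```python
def findzone_localview(zone, records):
    """Mimic the "localview" findzone query.

    records: list of (zone, type) pairs standing in for dns_records rows.
    If a 'local_' + zone entry has an SOA record, that name is used;
    otherwise the plain zone name is used.
    """
    local = "local_" + zone
    has_local_soa = any(z == local and t == "SOA" for z, t in records)
    candidate = local if has_local_soa else zone
    # The query returns the zone only if some row carries the chosen name.
    return candidate if any(z == candidate for z, _ in records) else None


def findzone_default(zone, records):
    """Mimic the dlz.conf findzone query: never serve 'local_' zones."""
    if zone[:6] == "local_":
        return None
    return zone if any(z == zone for z, _ in records) else None
```

With rows for both `example.com` and `local_example.com`, a client matched by "localview" would be served from `local_example.com`, while all other clients would get `example.com`.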



My log file:

Aug 13 12:29:37 localhost named-sdb[3569]: starting BIND 9.5.1-P3-RedHat-9.5.1-3.P3 -u named
Aug 13 12:29:37 localhost named-sdb[3569]: adjusted limit on open files from 1024 to 1048576
Aug 13 12:29:37 localhost named-sdb[3569]: found 4 CPUs, using 4 worker threads
Aug 13 12:29:37 localhost named-sdb[3569]: using up to 4096 sockets
Aug 13 12:29:37 localhost named-sdb[3569]: SDB ldap zone database module loaded.
Aug 13 12:29:37 localhost named-sdb[3569]: SDB postgreSQL DB zone database module loaded.
Aug 13 12:29:37 localhost named-sdb[3569]: SDB sqlite3 DB zone database module loaded.
Aug 13 12:29:37 localhost named-sdb[3569]: SDB directory DB zone database module loaded.
Aug 13 12:29:37 localhost named-sdb[3569]: loading configuration from '/etc/named.conf'
Aug 13 12:29:37 localhost named-sdb[3569]: using default UDP/IPv4 port range: [1024, 65535]
Aug 13 12:29:37 localhost named-sdb[3569]: using default UDP/IPv6 port range: [1024, 65535]
Aug 13 12:29:37 localhost named-sdb[3569]: listening on IPv4 interface lo, 127.0.0.1#53
Aug 13 12:29:37 localhost named-sdb[3569]: listening on IPv4 interface eth0, 80.92.65.97#53
Aug 13 12:29:37 localhost named-sdb[3569]: Loading 'Mysql zone' using driver mysql
Aug 13 12:29:37 localhost named-sdb[3569]: Loading 'Mysql zone' using driver mysql
Aug 13 12:29:37 localhost named-sdb[3569]: command channel listening on 127.0.0.1#954
Aug 13 12:29:37 localhost named-sdb[3569]: the working directory is not writable
Aug 13 12:29:37 localhost named-sdb[3569]: data/all.zone:11: file does not end with newline
Aug 13 12:29:37 localhost named-sdb[3569]: zone ./IN/localview: loaded serial 2007061204
Aug 13 12:29:37 localhost named-sdb[3569]: zone 127.in-addr.arpa/IN/localview: loaded serial 2002081601
Aug 13 12:29:37 localhost named-sdb[3569]: zone localhost/IN/localview: loaded serial 2002081601
Aug 13 12:29:37 localhost named-sdb[3569]: data/all.zone:11: file does not end with newline
Aug 13 12:29:37 localhost named-sdb[3569]: zone ./IN/default: loaded serial 2007061204
Aug 13 12:29:37 localhost named-sdb[3569]: zone 127.in-addr.arpa/IN/default: loaded serial 2002081601
Aug 13 12:29:37 localhost named-sdb[3569]: zone localhost/IN/default: loaded serial 2002081601
Aug 13 12:29:37 localhost named-sdb[3569]: running
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for findzone query
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for lookup query
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for authority query
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for lookup query
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for lookup query
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for authority query
Aug 13 12:29:40 localhost named-sdb[3569]: mysql driver unable to return result set for findzone query
Aug 13 12:29:44 localhost last message repeated 356 times
Aug 13 12:29:44 localhost kernel: named-sdb[3570]: segfault at 0000000000000000 rip 00002ab52a09b560 rsp 00000000428498b8 error 4
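
To see which DLZ query type fails most often in a log like the one above, a small tally script can help. This is a minimal sketch, assuming the syslog text has been read into a string; the function name `tally_dlz_errors` is hypothetical. It also assumes that a "last message repeated N times" line refers to the most recent driver error, which holds for the excerpt above but may not hold for every log.

```python
import re
from collections import Counter

# Matches either a driver error (group 1 = query type) or a syslog
# "last message repeated N times" line (group 2 = N). \s+ tolerates a
# line break inside the error message.
EVENT_RE = re.compile(
    r"mysql driver unable to return result set\s+for (\w+) query"
    r"|last message repeated (\d+) times"
)


def tally_dlz_errors(text):
    """Count driver errors per query type, adding syslog repeat counts."""
    counts = Counter()
    last_kind = None
    for match in EVENT_RE.finditer(text):
        if match.group(1):
            last_kind = match.group(1)
            counts[last_kind] += 1
        elif last_kind is not None:
            # Assumption: the repeated message is the last driver error seen.
            counts[last_kind] += int(match.group(2))
    return counts
```

For example, `tally_dlz_errors(open("/var/log/messages").read())` would return a `Counter` keyed by query type (`findzone`, `lookup`, `authority`); the log path is an assumption and depends on the syslog configuration.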


Best regards

Comment 2 DCLUX sysadmin 2009-08-24 06:04:27 UTC
Hi,

We now have the same problem ("mysql driver unable to return result
set for findzone query") on other servers.

We are looking at going back to bind 9.4 because we have too many problems with 9.5.1 on Red Hat 5.3.

But we are having trouble finding a good SPEC file for bind 9.4.3.

Could you provide us with a SPEC file or an SRPM?

Best regards

Thanks again for your help.

Comment 3 Adam Tkac 2009-09-03 08:51:17 UTC
(In reply to comment #2)
> Hi,
> 
> We now have the same problem ("mysql driver unable to return result
> set for findzone query") on other servers.
> 
> We are looking at going back to bind 9.4 because we have too many problems with
> 9.5.1 on Red Hat 5.3.
> 
> But we are having trouble finding a good SPEC file for bind 9.4.3.

Are you sure that the 9.4 series worked fine for you? This looks like a generic issue which should affect both the 9.4 and 9.5 series.

> Could you provide us with a SPEC file or an SRPM?

Yes, you can find the Fedora 7 SRPM at http://archives.fedoraproject.org/pub/archive/fedora/linux/updates/7/SRPMS/bind-9.4.2-4.fc7.src.rpm. It should be fairly simple to use it as a base for 9.4.3.

DLZ drivers (such as the MySQL DLZ driver you are currently using) are not designed for heavy load. They should be used when you have a _huge_ zone but the master server is under low load. You can check http://bind-dlz.sourceforge.net/perf_tests.html if you are interested - BIND is able to handle about 25x - 30x more requests when you are using text zones instead of the MySQL DLZ backend.

Note that customized packages are not supported on RHEL5. If you would like to use newer BIND releases, the best idea is to use Fedora.

Comment 4 DCLUX sysadmin 2009-09-03 12:54:41 UTC
Hi,

We have an old 9.4.2, and that version works fine with DLZ, without any errors or trouble; it is very stable. We have more than 300000 zones, load balancing across 2 servers for each NS entry, and about 25000 queries per minute on each server.


We have a load between 0.5 and 0.7 and less than 30% CPU usage on each load-balanced server.

                    -----------
                    ¦ queries ¦
                    -----------
      -                  |
      |              -----------
      |              ¦   LB    ¦
      |              -----------
 NSX  |             /           \
      |  -----------            -----------
      |  ¦ NSX-1   ¦            ¦  NSX-1  ¦
      -  -----------            -----------

               diagram: our NS architecture


I have read the page about MySQL and BIND performance, and I understand that we could be hitting a limitation.

But in our case, only one server has this kind of problem. The other name server paired with it takes all the load and does not show this kind of error, with the same configuration.

1 CPU, 1 GB of RAM, same OS: Red Hat 5.3 with the Xen kernel. The only difference is the installation date: a 5.1 base install for a good server, and 5.2 or 5.3 for a server with trouble. The problem only occurs when we use the 9.5.1 bind-sdb package; that is my latest hypothesis.

I hope my explanation is clear. 

I have provided the SRPM as an attachment.

>Note that customized packages are not supported on RHEL5. If you would like to
>use newer BIND releases the best idea is to use Fedora.

Could you please tell us which BIND version Red Hat will provide on RHEL6?
At the moment Red Hat supports only bind-9.3 on Red Hat 5. This version works fine, but it is less interesting than 9.5 or 9.4 and has also been end of life since January 2009.

https://www.isc.org/software/bind/versions.

Using Fedora could be an idea, but our whole system is based on Red Hat (NS, virtualisation, DB servers, Red Hat Satellite, etc.). Fedora Core changes too fast and we need a stable product.

Thanks for your answers; we look forward to your swift reply.

Fabien FAYE

Comment 5 DCLUX sysadmin 2009-09-03 12:56:12 UTC
Created attachment 359677 [details]
SRPM used to compile bind on redhat 5

Comment 6 Adam Tkac 2010-03-30 12:05:30 UTC
Unfortunately the BIND 9.5 series is not supported on RHEL 5, so this issue won't be fixed.

There is a request for a technology preview of the BIND 9.7 series for RHEL 5; check bug #570611 if you are interested. You can add yourself to CC to follow progress. I believe this version will fix your problems.

Closing.