Bug 735103 - dhcpd: failover: link startup timeout
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: bind
Version: 15
Hardware: All Unspecified
Priority: high Severity: high
Assigned To: Adam Tkac
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-09-01 09:45 EDT by Rick Murphy
Modified: 2013-04-30 19:50 EDT
Fixed In Version: dnsperf-1.0.1.0-25.fc16
Doc Type: Bug Fix
Last Closed: 2011-09-14 18:30:05 EDT


Description Rick Murphy 2011-09-01 09:45:53 EDT
Description of problem:
I have recently upgraded my DHCP servers to Fedora 15. They are a failover pair operating between two systems now running Fedora 15. After the upgrade, the failover initialization fails, spitting out the following every 20 seconds:
dhcpd: failover: link startup timeout

This happens on both of the failover peers. I built ISC dhcpd 4.2.2 from source and replaced the dhcpd binary with the newly built one, and the errors ceased. This allowed the server pair to properly initialize failover.

Version-Release number of selected component (if applicable):
isc-dhcpd-4.2.1-P1

How reproducible:
Completely.

Steps to Reproduce:
1. Configure a failover pair according to the manpage
2. Start both dhcp servers
3. View the syslog
  
Actual results:
dhcpd: failover: link startup timeout
Peers do not enter 'normal' communications state

Expected results:
Normal communications.
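
To make step 3 repeatable on both peers, the message can be counted rather than eyeballed. This is only a minimal sketch: the function name count_failover_timeouts is made up for illustration, and the default log path /var/log/messages is an assumption that depends on the local syslog configuration.

```shell
#!/bin/sh
# Count how many times the failover timeout message appears in a log file.
# count_failover_timeouts is a hypothetical helper, not part of dhcpd.
# The default path /var/log/messages is an assumption; pass another path
# as the first argument if syslog writes elsewhere.
count_failover_timeouts() {
    logfile="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'failover: link startup timeout' "$logfile"
}
```

While the peers are failing to initialize, the count should keep growing (roughly one new line every 20 seconds, per the description above). Once failover reaches normal communications, it should stop increasing.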

Additional info:
Comment 1 Jiri Popelka 2011-09-05 07:11:45 EDT
Thank you for the report.
Would it be possible to rebuild the source RPM we have in Fedora 16?
It is also 4.2.2, but this way all our patches will be applied, so we'll see
whether the problem is in one of our patches or in 4.2.1.

1) Download
http://kojipkgs.fedoraproject.org/packages/dhcp/4.2.2/1.fc16/src/dhcp-4.2.2-1.fc16.src.rpm
2) su -c 'yum install rpmdevtools'
3) rpmdev-setuptree
4) rpmbuild --rebuild dhcp-4.2.2-1.fc16.src.rpm
5) cd ~/rpmbuild/RPMS/<your arch>
6) su -c 'yum --nogpgcheck localupdate *.rpm'
You can always downgrade with 'yum downgrade dhcp'
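
Steps 1 to 4 above can be sketched as a small script. This is an illustration only: the run helper and the DRY_RUN variable are hypothetical conveniences (not part of rpmbuild or yum), and using curl for the download is an assumption, since the steps do not name a tool. Steps 5 and 6 are left manual because the output directory depends on the build architecture.

```shell
#!/bin/sh
# Sketch of rebuild steps 1-4 from this comment.
# DRY_RUN is a hypothetical variable: when set to 1, each command is
# printed (without its original quoting) instead of being executed.
SRPM_URL='http://kojipkgs.fedoraproject.org/packages/dhcp/4.2.2/1.fc16/src/dhcp-4.2.2-1.fc16.src.rpm'

# run: hypothetical helper that either prints or executes a command.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rebuild_dhcp_srpm() {
    srpm=$(basename "$SRPM_URL")          # dhcp-4.2.2-1.fc16.src.rpm
    run curl -O "$SRPM_URL"               # step 1 (curl is an assumption)
    run su -c 'yum install rpmdevtools'   # step 2
    run rpmdev-setuptree                  # step 3
    run rpmbuild --rebuild "$srpm"        # step 4
}
```

Running it as `DRY_RUN=1; rebuild_dhcp_srpm` first shows what would be executed before anything is installed.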

Alternatively, you can build ISC dhcpd 4.2.1 from source (as you already did with 4.2.2) and try it. That will also show us on which side (ISC or Fedora) the problem lies.

Thanks
Comment 2 Jiri Popelka 2011-09-05 09:13:43 EDT
I was able to reproduce the problem, so you can ignore the previous comment.
Comment 3 Rick Murphy 2011-09-06 07:59:05 EDT
I did attempt the build of the fc16 source, but there are other dependencies that keep it from compiling:

In file included from ../includes/omapip/isclib.h:64:0,
                 from ../includes/dhcpd.h:95,
                 from bpf.c:35:
/usr/include/dns/client.h:146:19: error: unknown type name 'dns_client_t'
/usr/include/dns/client.h:149:37: error: unknown type name 'isc_appctx_t'
/usr/include/dns/client.h:151:28: error: unknown type name 'dns_client_t'

I suspect that the bind-lite-devel package needs to be updated as well, but I'll hold off on trying further now that you've reproduced it.
Comment 4 Jiri Popelka 2011-09-06 09:12:00 EDT
I also see this message repeating in the log:
../../../../lib/isc/unix/socket.c:891: epoll_ctl(DEL), 10: Bad file descriptor

This seems serious, because OMAPI (the omshell tool) is also not working; see
http://lists.fedoraproject.org/pipermail/users/2011-August/402745.html

That message comes from BIND, so I'm adding the BIND maintainer to CC.
Adam, does this ring a bell?
The easiest way to reproduce that message is to run 'omshell' and type the 'connect' command.

So far it seems that the problem is in Fedora's change to dhcp (bug #637017), which allows us (since F15) to use the system BIND libraries instead of the bundled BIND libraries from the dhcp sources.
When I build dhcp (F15 branch) without those 2 patches (rh637017.patch, sharedlib.patch), everything (failover, OMAPI) works as expected.

I'm still investigating it.
Comment 5 Adam Tkac 2011-09-07 12:22:33 EDT
Reassigning to bind; this seems like a bind-libs-lite issue to me.
Comment 6 Fedora Update System 2011-09-09 07:29:42 EDT
dnsperf-1.0.1.0-25.fc16,dhcp-4.2.2-5.fc16,bind-dyndb-ldap-1.0.0-0.2.b1.fc16,bind-9.8.1-2.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/dnsperf-1.0.1.0-25.fc16,dhcp-4.2.2-5.fc16,bind-dyndb-ldap-1.0.0-0.2.b1.fc16,bind-9.8.1-2.fc16
Comment 7 Fedora Update System 2011-09-09 07:31:37 EDT
bind-9.8.1-1.fc15,bind-dyndb-ldap-1.0.0-0.2.b1.fc15,dhcp-4.2.1-11.P1.fc15,dnsperf-1.0.1.0-25.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/bind-9.8.1-1.fc15,bind-dyndb-ldap-1.0.0-0.2.b1.fc15,dhcp-4.2.1-11.P1.fc15,dnsperf-1.0.1.0-25.fc15
Comment 8 Fedora Update System 2011-09-09 11:09:16 EDT
Package dnsperf-1.0.1.0-25.fc16, dhcp-4.2.2-5.fc16, bind-dyndb-ldap-1.0.0-0.2.b1.fc16, bind-9.8.1-2.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing dnsperf-1.0.1.0-25.fc16 dhcp-4.2.2-5.fc16 bind-dyndb-ldap-1.0.0-0.2.b1.fc16 bind-9.8.1-2.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/dnsperf-1.0.1.0-25.fc16,dhcp-4.2.2-5.fc16,bind-dyndb-ldap-1.0.0-0.2.b1.fc16,bind-9.8.1-2.fc16
then log in and leave karma (feedback).
Comment 9 Rick Murphy 2011-09-11 22:05:01 EDT
I'm currently running Fedora 15. I've tried the suggested update command and have a "No match for argument .." for each of the suggested updates.
Changing fc16 to fc15 doesn't find any update packages either.

If you'll push these updates to the Fedora 15 testing repository, I'll give them a try. Otherwise, I'll wait until Fedora 16 release and assume the fixes will be incorporated.
Comment 10 Adam Tkac 2011-09-12 05:32:46 EDT
(In reply to comment #9)
> I'm currently running Fedora 15. I've tried the suggested update command and
> have a "No match for argument .." for each of the suggested updates.
> Changing fc16 to fc15 doesn't find any update packages either.
> 
> If you'll push these updates to the Fedora 15 testing repository, I'll give
> them a try. Otherwise, I'll wait until Fedora 16 release and assume the fixes
> will be incorporated.

It takes some time (about one day) before all updates are propagated to mirrors (push -> updates are on the master server -> updates are on mirrors). Today I was able to fetch the updated bind-* and dhcp-* packages via the command written in comment #7.
Comment 11 Rick Murphy 2011-09-14 09:18:48 EDT
Updates installed and the problem is fixed. Comments left as requested.
Thanks, Adam.
Comment 12 Fedora Update System 2011-09-14 18:29:49 EDT
bind-9.8.1-1.fc15, bind-dyndb-ldap-1.0.0-0.2.b1.fc15, dhcp-4.2.1-11.P1.fc15, dnsperf-1.0.1.0-25.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 13 Fedora Update System 2011-09-23 00:02:23 EDT
dnsperf-1.0.1.0-25.fc16, dhcp-4.2.2-5.fc16, bind-dyndb-ldap-1.0.0-0.2.b1.fc16, bind-9.8.1-2.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.
