Bug 1768604 - Reboot hangs forever shutting down dhcp
Summary: Reboot hangs forever shutting down dhcp
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: dhcp
Version: 31
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Pavel Zhukov
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-11-04 19:29 UTC by Tom Horsley
Modified: 2020-07-10 05:05 UTC (History)
3 users (show)

Fixed In Version: dhcp-4.4.1-19.fc31
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-10 05:05:34 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
all dhcpd messages from log (27.20 KB, text/plain)
2019-11-07 13:29 UTC, Tom Horsley
no flags Details
the dhcpd.conf file (408 bytes, text/plain)
2019-11-07 13:32 UTC, Tom Horsley
no flags Details
strace of dhcpd -f (2.69 MB, text/plain)
2020-01-07 19:03 UTC, Tom Horsley
no flags Details

Description Tom Horsley 2019-11-04 19:29:25 UTC
Description of problem:

When I reboot, I get a message about start job running to shut down dhcpv4 (something like that anyway). It never seems to shutdown. Holding down Ctrl-Alt-Del claims it is rebooting right away, but still hangs forever.

If I try "systemctl stop dhcpd" in a running system, it also hangs
for a long time, and I see this in the process list:

 7060 pts/1    S+     0:00                  \_ systemctl stop dhcpd
 7061 pts/1    S+     0:00                      \_ /usr/bin/systemd-tty-ask-password-agent --watch

I ran that as root - there is no reason it should need to ask anyone for any kind of a password, and nothing asking for one appeared in the terminal where I ran it. (It does eventually timeout while I'm doing stuff like typing in this bug report).


Version-Release number of selected component (if applicable):
dhcp-server-4.4.1-15.fc31.x86_64


How reproducible:
Appears to happen 100% of the time.

Steps to Reproduce:
1.start dhcpd
2.try to stop dhcpd
3.

Actual results:
Long hang

Expected results:
Stops right away

Additional info:

Comment 1 Pavel Zhukov 2019-11-07 08:59:03 UTC
Please attach some logs.
"journalctl -u dhcpd" at least 
Are you able to stop dhcpd manuály with systemctl stop dhcpd command?

Comment 2 Tom Horsley 2019-11-07 13:29:37 UTC
Created attachment 1633621 [details]
all dhcpd messages from log

All I see are normal messages followed by dhcpd starting back up again. Nothing logged from times where I tried to shut it down.

Comment 3 Tom Horsley 2019-11-07 13:32:29 UTC
Created attachment 1633622 [details]
the dhcpd.conf file

I'm running dhcp to provide info to my phone and tablet and such for an access point I run using hostapd.

Comment 4 Tom Horsley 2019-11-07 13:40:40 UTC
I mention in the original description what I see when I "systemctl stop dhcpd" while the system is running. Systemctl seems to be trying to ask for a password and takes a long time to timeout while doing that. If I let it sit long enough, it finally times out but it seems like it takes a very long time. (And I'm running the stop as root, so I have no idea why it is asking for a password, or even where it is asking for a password since I see no password prompt anywhere).

If I "killall -9 dhcpd" it stops right away :-). Then I can reboot with no hang.

I see several .service files with a KillMode=<something> in them, but see no KillMode in the dhcpd.service file, but I can't deduce any consistent usage of KillMode, so I have no idea if that is relevant.

Comment 5 Fedora Update System 2019-11-07 14:25:35 UTC
FEDORA-2019-8f16a6bfc6 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2019-8f16a6bfc6

Comment 6 Fedora Update System 2019-11-09 23:20:30 UTC
dhcp-4.4.1-18.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-8f16a6bfc6

Comment 7 Fedora Update System 2019-11-11 09:41:26 UTC
FEDORA-2019-a888aa6672 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2019-a888aa6672

Comment 8 Fedora Update System 2019-11-13 04:56:01 UTC
dhcp-4.4.1-19.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-a888aa6672

Comment 9 Tom Horsley 2019-11-13 12:50:27 UTC
Nope :-(. I now have installed:

dhcp-client-4.4.1-19.fc31.x86_64
dhcp-server-4.4.1-19.fc31.x86_64
dhcp-common-4.4.1-19.fc31.noarch

Doing a "systemctl stop dhcpd" still hangs, and I see this in the process list:

 2808 pts/0    S+     0:00                  \_ systemctl stop dhcpd
 2809 pts/0    S+     0:00                      \_ /usr/bin/systemd-tty-ask-password-agent --watch

No other services I've tried exhibit this behavior

Comment 10 Fedora Update System 2019-11-17 01:31:11 UTC
dhcp-4.4.1-19.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.

Comment 11 Pavel Zhukov 2019-11-17 08:27:23 UTC
Keep it opened as per c#9 for further investigation

Comment 12 Pavel Zhukov 2020-01-07 08:02:33 UTC
cann't reproduce so far:



# rpm -q dhcp-server
dhcp-server-4.4.1-20.fc32.x86_64


# journalctl  -u dhcpd
Jan 07 02:59:04 host-10-0-137-185 dhcpd[1464]: 
Jan 07 02:59:04 host-10-0-137-185 dhcpd[1464]: Sending on   Socket/fallback/fallback-net
Jan 07 02:59:04 host-10-0-137-185 dhcpd[1464]: Server starting service.
Jan 07 02:59:04 host-10-0-137-185 systemd[1]: Started DHCPv4 Server Daemon.
Jan 07 02:59:06 host-10-0-137-185 systemd[1]: Stopping DHCPv4 Server Daemon...
Jan 07 02:59:06 host-10-0-137-185 systemd[1]: dhcpd.service: Succeeded.
Jan 07 02:59:06 host-10-0-137-185 systemd[1]: Stopped DHCPv4 Server Daemon.

# systemctl start dhcpd 
# systemctl stop dhcpd 
# systemctl start dhcpd 
# time systemctl start dhcpd 

real	0m0.017s
user	0m0.002s
sys	0m0.011s

# time systemctl stop dhcpd 

real	0m0.031s
user	0m0.003s
sys	0m0.013s

Comment 13 Pavel Zhukov 2020-01-07 08:04:00 UTC
(In reply to Tom Horsley from comment #9)

>  2809 pts/0    S+     0:00                      \_
> /usr/bin/systemd-tty-ask-password-agent --watch
Are you running this command as root user?

Comment 14 Pavel Zhukov 2020-01-07 08:08:58 UTC
Can you run dhcpd manualy with "dhcpd -f" wait for "Sending on" message (better for some leases to be assigned) and kill with Ctrl-C. Will that work?

Comment 15 Tom Horsley 2020-01-07 13:39:19 UTC
I'm running "systemctl stop dhcpd" as root, something it does is running the ask password command.
Checking the user of the ask password command, I see it is running as root as well.

Trying dhcpd -f, I get this:

[root@tomh ~]# systemctl stop dhcpd
[root@tomh ~]# dhcpd -f
Internet Systems Consortium DHCP Server 4.4.1
Copyright 2004-2018 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host
yp_bind_client_create_v3: RPC: Unknown host
ldap_gssapi_principal is not set,GSSAPI Authentication for LDAP will not be used
Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the config file
Config file: /etc/dhcp/dhcpd.conf
Database file: /var/lib/dhcpd/dhcpd.leases
PID file: /var/run/dhcpd.pid
Source compiled to use binary-leases
Wrote 5 leases to leases file.

No subnet declaration for br0 (10.134.30.143).
** Ignoring requests on br0.  If this is not what
   you want, please write a subnet declaration
   in your dhcpd.conf file for the network segment
   to which interface br0 is attached. **

Listening on LPF/wlp0s29u1u2/b0:c7:45:a9:8b:35/192.168.9.0/24
Sending on   LPF/wlp0s29u1u2/b0:c7:45:a9:8b:35/192.168.9.0/24

No subnet declaration for p6p1 (no IPv4 addresses).
** Ignoring requests on p6p1.  If this is not what
   you want, please write a subnet declaration
   in your dhcpd.conf file for the network segment
   to which interface p6p1 is attached. **

Sending on   Socket/fallback/fallback-net
^C^\Quit
[root@tomh ~]# stty -a
speed 38400 baud; rows 84; columns 80; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;
-parenb -parodd -cmspar cs8 -hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff
-iuclc -ixany -imaxbel iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
echoctl echoke -flusho -extproc
[root@tomh ~]# 

So ^C (intr) had no effect, but ^\ (quit) did kill it.

Comment 16 Pavel Zhukov 2020-01-07 16:28:54 UTC
(In reply to Tom Horsley from comment #15)

> So ^C (intr) had no effect, but ^\ (quit) did kill it.

First of all sorry for asking for this info (I can see you've posted all information I needed (c#4).
Thank you for providing logs again.

For some reasons your dhcpd server doesn't exit properly (either using SIGINT signal or using systemd mechanism) it looks weird as the bug with signal handling was fixed long time ago. I'll investigate this further.
In the meantime may I ask you for strace output?
0) install strace if not installed as well as debuging information for dhcp-server package.
1) run dhcpd -f as before
2) in different terminal execute " strace -k -yy -f -p `pidof dhcpd` "
3) in first terminal send SIGINT with Ctrl-C (better few times) and then SIGQUIT
4) post output of strace command here

Thank you in advance!

> I'm running "systemctl stop dhcpd" as root, something it does is running the ask password command.
This is actually non-relevant but confused me as well. By-design systemd spawns this command even if it's not needed. Just ignore it. [1]

[1] https://github.com/systemd/systemd/issues/9507

Comment 17 Pavel Zhukov 2020-01-07 16:36:25 UTC
As strace is not able to load dwarf of process it's being attached to there is small correction to the testing procedure below:

1) run dhcpd under strace (strace -k -f -yy dhcpd -f)
2) in different terminal execute kill command (kill -SIGINT `pidof dhcpd`; kill -SIGQUIT `pidof dhcpd) 
3) post outout from strace/dhcpd output

Comment 18 Tom Horsley 2020-01-07 19:03:30 UTC
Created attachment 1650494 [details]
strace of dhcpd -f

It sure looks like it saw the SIGINTs (I sent 3) but kept running.

I don't have any debuginfo files installed, so no details they might have provided. I can install them and run again if you need it.

Comment 19 Tom Horsley 2020-02-16 14:47:54 UTC
This looks like it belongs as a systemd bug, not a dhcp bug. It is getting worse. I just had the exact same kind of hang on a reboot trying to stop httpd. A ps showed the same systemd-tty-ask-password-agent sitting in the process tree where is was hanging trying to stop apache.

Comment 20 Pavel Zhukov 2020-04-03 09:46:35 UTC
Hello,

I haven't heard from you for a while. Did you manage to trace the issue? Is it dhcp issue or systemd one?

Comment 21 Tom Horsley 2020-04-03 13:10:13 UTC
No idea, I haven't seen the httpd hang again, but that made me think it is systemd. I've got a reboot command alias now that does stuff like kill -9 dhcpd before actually rebooting, so I'm just going with that hack.

Comment 22 Pavel Zhukov 2020-07-10 05:05:34 UTC
Ok. closing then. 
Feel free to reopen if the issue appears again.


Note You need to log in before you can comment on or make changes to this bug.