Bug 1306944 - NetworkManager is stuck for a minute after resume before it finishes and fills the resolv.conf
NetworkManager is stuck for a minute after resume before it finishes and fill...
Status: CLOSED EOL
Product: Fedora
Classification: Fedora
Component: sendmail (Show other bugs)
22
x86_64 Linux
unspecified Severity low
: ---
: ---
Assigned To: Jaroslav Škarvada
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2016-02-12 04:09 EST by Adam Pribyl
Modified: 2016-07-19 16:16 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-19 16:16:27 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
network manager resume log (7.74 KB, text/plain)
2016-02-12 04:09 EST, Adam Pribyl
no flags Details
network manager resume debug log (127.18 KB, text/plain)
2016-02-16 04:15 EST, Adam Pribyl
no flags Details

  None (edit)
Description Adam Pribyl 2016-02-12 04:09:20 EST
Created attachment 1123424 [details]
network manager resume log

Description of problem:
With F22 the behavior of NetworkManager changed somehow and now after resume from suspend (hibernation) it takes more than a minute for NM to finish the initialization process. It seems IPs are assigned from DHCP(6) but it is waiting for something before it completes and fills e.g. resolv.conf, so networking is generally not available during that time.

Note: using nmapplet under IceWM.

Version-Release number of selected component (if applicable):
NetworkManager-1.0.10-2.fc22.x86_6

How reproducible:
Always at resume.

Steps to Reproduce:
1.Hibernate
2.Resume


Actual results:
It takes annoyingly long before networking is up and working.

Feb 12 09:38:46 desktop NetworkManager[643]: <info>  (enp5s0): DHCPv6 state changed unknown -> bound, event ID="".O#I"|1455266326"
Feb 12 09:39:40 desktop nm-dispatcher: Dispatching action 'pre-up' for enp5s0

Expected results:
Networking is available immediately after wakeup and response received from DHCP.

Additional info:
Log attached.
Comment 1 Beniamino Galvani 2016-02-15 03:28:43 EST
Hi,

can you please enable debug logging by adding level=DEBUG in the [logging] section of /etc/NetworkManager/NetworkManager.conf, restart NM, reproduce the issue and attach the relevant log? Thanks!
Comment 2 Adam Pribyl 2016-02-16 04:15 EST
Created attachment 1127542 [details]
network manager resume debug log

Not sure what NM is doing at that time - it is doing something IPv6 wise..

Corresponding "normal" log entries:
Feb 16 07:39:11 desktop NetworkManager[21884]: <info>  (enp5s0): DHCPv6 state changed unknown -> bound, event ID="".O#I"|1455604751"
...
Feb 16 07:40:05 desktop nm-dispatcher: Dispatching action 'pre-up' for enp5s0

it is receiving several times router advertisement. I am not sure what is going on here.
Comment 3 Beniamino Galvani 2016-02-16 04:33:16 EST
Feb 16 07:39:05 desktop nm-dispatcher: Running script '/etc/NetworkManager/dispatcher.d/10-sendmail'
[...]
Feb 16 07:40:05 desktop nm-dispatcher: Script '/etc/NetworkManager/dispatcher.d/10-sendmail' complete

The sendmail dispatcher script takes 1 minute to complete, delaying
the device activation. This sounds similar to [1], can you please try
to temporarily "chmod -x" it and see if the situation improves?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=838354#c2
Comment 4 Adam Pribyl 2016-02-18 04:41:21 EST
OK, thanks, this really seems to fix the problem. The reference bug seems to be closed as WONTFIX due to EOL. Should it be reopen? This dispatch script seems to be still a part of sendmail package.
Comment 5 Beniamino Galvani 2016-02-18 08:41:24 EST
(In reply to Adam Pribyl from comment #4)
> OK, thanks, this really seems to fix the problem. The reference bug seems to
> be closed as WONTFIX due to EOL. Should it be reopen? This dispatch script
> seems to be still a part of sendmail package.

Maybe first you can investigate why the script gets stuck for so long, for example by adding "set -x" at the top (below the first line) and looking at the traces in the journal.

And since that bug is quite old, probably you should file a new one.
Comment 6 Adam Pribyl 2016-02-18 09:23:36 EST
Well, there is hardly anything to debug:
#!/bin/sh

case "$2" in
        up|down|vpn-up|vpn-down)
               /bin/systemctl try-restart sendmail.service || :
                ;;
esac

I am not good at systemd debugging.

more sendmail.service
[Unit]
Description=Sendmail Mail Transport Agent
After=syslog.target network.target
...

does that really means the dispatch script is waiting for sendmail restart while it waits for network to come up?
Comment 7 Beniamino Galvani 2016-02-19 07:43:37 EST
(In reply to Adam Pribyl from comment #6)

> does that really means the dispatch script is waiting for sendmail restart
> while it waits for network to come up?

I don't know what sendmail is waiting for, but comment [1] suggests that it might be waiting for a DNS lookup to terminate.

Does any of the workarounds described in the comment work for you?

Anyway, this looks like a problem in the sendmail package, so I'm reassigning the bug.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=838354#c2
Comment 8 Jaroslav Škarvada 2016-02-19 08:29:11 EST
Yes, it waits (very probably) for the DNS lookup. So few questions:

Questions for NetworkManager maintainer:
- was anything related changed in NetworkManager? 
- is resolver functional when NetworkManager calls the dispatcher script with the 'up' parameter? If not, is there a way how to achieve this? Please note, that the dispatcher script was used for more than 6 years (the time I am maintaining the package) without problem.

Question for reporter:
- is your sendmail configuration functional, i.e. does the following command return immediately?
 /bin/systemctl try-restart sendmail.service
Comment 9 Adam Pribyl 2016-02-19 09:02:49 EST
I may try the "solution" from the #838354 - add my hostname to /etc/hosts, but the hostname is assigned from dhcp - at least I hope so.

In my opinion the resolver is not working at that time - at least I can not resolve anything during that time and /etc/resolv.conf is empty (there is some comment it is going to be filled by NM or something). 

My sendmail is pretty functional, can be restarted immediately. It is sending emails too, under normal conditions.
Comment 10 Jaroslav Škarvada 2016-02-19 09:17:12 EST
(In reply to Adam Pribyl from comment #9)
> I may try the "solution" from the #838354 - add my hostname to /etc/hosts,
> but the hostname is assigned from dhcp - at least I hope so.
> 
You need functional resolver with registered reverse DNS for your machine. That's what the /etc/hosts hack do, it fixes the missing reverse DNS. I think the hack may resolve the problem.

> In my opinion the resolver is not working at that time - at least I can not
> resolve anything during that time and /etc/resolv.conf is empty (there is
> some comment it is going to be filled by NM or something). 
> 
I think it should work (or at least I think it worked). I need to find out whether it is NetworkManager change and how it is supposed to work.

> My sendmail is pretty functional, can be restarted immediately. It is
> sending emails too, under normal conditions.
>
This implies the DNS records are correct.
Comment 11 Beniamino Galvani 2016-02-19 16:55:10 EST
(In reply to Jaroslav Škarvada from comment #8)
> - was anything related changed in NetworkManager?

As far as I know, nothing changed recently in the way dispatcher
scripts are called.

> - is resolver functional when NetworkManager calls the dispatcher script
> with the 'up' parameter? If not, is there a way how to achieve this? Please
> note, that the dispatcher script was used for more than 6 years (the time I
> am maintaining the package) without problem.

The 'up' event signals that the connection was activated and that the
interface IP layer was configured (if the user enabled IP
configuration). NM doesn't guarantee that the DNS server is reachable
after such event.

Moreover, in the log I see that the script blocks while handling the
'down' event:

  Feb 16 07:40:05 desktop nm-dispatcher: Script '/etc/NetworkManager/dispatcher.d/10-sendmail' complete
  [...]
  Feb 16 07:40:05 desktop nm-dispatcher: Dispatch 'down' on enp5s0 complete

I don't know how the service is implemented, but is it necessary to
restart the service when an interface goes down?
Comment 12 Fedora End Of Life 2016-07-19 16:16:27 EDT
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Note You need to log in before you can comment on or make changes to this bug.