Bug 1100034 - Network route disappearing after suspend
Summary: Network route disappearing after suspend
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Lubomir Rintel
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-05-21 20:34 UTC by Aleksandar Kostadinov
Modified: 2015-10-19 15:18 UTC
CC List: 20 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-19 15:17:34 UTC
Type: Bug


Description Aleksandar Kostadinov 2014-05-21 20:34:36 UTC
Excuse me if the component is wrong; I have no idea what the right component is here.

Description of problem:
I have a Fedora 20 host and a Fedora 20 guest VM (3.14.4-200.fc20.x86_64). If the host laptop has been suspended for some hours, then after it wakes up the VM comes back with a wrong network configuration: the default route is missing.

How reproducible:
Almost always, once an hour or so has passed with the laptop in suspend mode.

Steps to Reproduce:
1. Create a Fedora 20 VM.
2. Close the laptop lid to suspend the Fedora 20 host.
3. Wait an hour or two.
4. Open the lid to wake the Fedora 20 host.
5. ssh to the VM.
6. Check the network.

Actual results:
> $ route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0
> $ systemctl status network.service -l
> network.service - LSB: Bring up/down networking
>    Loaded: loaded (/etc/rc.d/init.d/network)
>    Active: failed (Result: exit-code) since Wed 2014-05-21 23:12:33 EEST; 11min ago
>   Process: 5953 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
> 
> May 21 23:12:33 localhost.localdomain systemd[1]: Starting LSB: Bring up/down networking...
> May 21 23:12:33 localhost.localdomain network[5953]: Bringing up loopback interface:  [  OK  ]
> May 21 23:12:33 localhost.localdomain network[5953]: Bringing up interface eth0:  Error: No suitable device found: no device found for connection 'eth0'.
> May 21 23:12:33 localhost.localdomain network[5953]: [FAILED]
> May 21 23:12:33 localhost.localdomain systemd[1]: network.service: control process exited, code=exited status=1
> May 21 23:12:33 localhost.localdomain systemd[1]: Failed to start LSB: Bring up/down networking.
> May 21 23:12:33 localhost.localdomain systemd[1]: Unit network.service entered failed state.
> $ ifconfig 
> eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         inet 192.168.122.97  netmask 255.255.255.0  broadcast 192.168.122.255
>         inet6 fe80::5054:ff:fee3:23a2  prefixlen 64  scopeid 0x20<link>
>         ether 52:54:00:e3:23:a2  txqueuelen 1000  (Ethernet)
>         RX packets 84887  bytes 13286331 (12.6 MiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 62874  bytes 7104439 (6.7 MiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


Expected results:
> $ route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 0.0.0.0         192.168.122.1   0.0.0.0         UG    0      0        0 eth0
> 192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0

Additional info:
If I manually add the proper default route:
> $ sudo route add default gw 192.168.122.1

then the network starts working again. Restarting the network service, however, does not work. A restart of the guest VM fixes the issue:
> $ sudo reboot
Even so, systemctl restart network still fails.
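
The check and the manual workaround above can be combined into a small script. This is only a sketch under a few assumptions: the gateway 192.168.122.1 is taken from the expected output above, the function name has_default_route and the --fix switch are made up for illustration, and the net-tools route command is assumed to be installed in the guest.

```shell
#!/bin/sh
# Sketch: detect a missing default route and put it back.
# GATEWAY defaults to the gateway shown in the expected output; adjust as needed.
GATEWAY="${GATEWAY:-192.168.122.1}"

# Reads `route -n` output on stdin. Succeeds (exit 0) if a default route is
# present, i.e. a line with destination 0.0.0.0 and the G flag in column 4.
has_default_route() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { found = 1 } END { exit !found }'
}

# Only touch the routing table when run with --fix (a switch of this sketch,
# not of any existing tool).
if [ "${1:-}" = "--fix" ]; then
    if route -n | has_default_route; then
        echo "default route present"
    else
        echo "default route missing, adding via $GATEWAY"
        sudo route add default gw "$GATEWAY"
    fi
fi
```

Running it with --fix after a host resume would only re-add the route; it does not address why the route was dropped in the first place.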

Comment 1 Aleksandar Kostadinov 2014-05-30 08:01:44 UTC
I see that systemctl restart network actually does not work on my host machine either, not only in the guest.

Comment 2 Aleksandar Kostadinov 2014-06-26 08:49:45 UTC
FYI, I'm now getting the issue after suspend more consistently than before; perhaps package upgrades have changed that. Any idea whether this issue is going to be fixed? To me, not having "systemctl restart network" working is a pretty fundamental issue and deserves more attention. I'm not sure if or how it is related to the issues I see after suspend.

Comment 3 Aleksandar Kostadinov 2014-07-09 09:53:05 UTC
Bump. Any idea how to debug the loss of the default route on the VM after host suspend? Also, the machine sometimes hangs with spiky CPU usage, as shown by virt-manager.

Comment 4 Cole Robinson 2014-07-09 14:31:46 UTC
I'm really not sure what to look for here... kernel guys, any ideas for additional debugging?

Comment 5 Neil Horman 2014-07-09 15:35:17 UTC
How do you specify the default route in your configuration?  Do you hard-code it in a config file (or in NetworkManager), or do you get the default route from your DHCP offer?  If the latter, my guess is that on suspend NetworkManager actually takes down your interface, which erases the default gateway route via that device, and never restores it.  I imagine you can work around this by adding a default gateway route in the NetworkManager connections dialog, but the actual problem likely needs to be fixed in NM.

Comment 6 Aleksandar Kostadinov 2014-07-09 15:43:54 UTC
Neil, I'm using the default NAT network created by virt-manager. I never configured any networking by hand. To restore the default route, I found that `systemctl restart NetworkManager` helps. It's also interesting that the IP/mask remain the same and only the default route is lost.

This doesn't explain though why sometimes the VM is hung with high spiky CPU usage. Might be unrelated.

Comment 7 Cole Robinson 2014-09-08 13:01:21 UTC
Aleksandar, are you still seeing this with an up-to-date f20 host and VM?

Comment 8 Aleksandar Kostadinov 2014-09-10 05:55:24 UTC
Yes, I just tried updating and the default route is still lost.

Comment 9 Neil Horman 2014-09-10 12:37:48 UTC
Aleksandar, yes, that means you're using the default gateway provided in your DHCP offer, as I explained in comment 5.  That suggests that NetworkManager is where this needs to be fixed.  Until then, you can likely work around the problem by adding the default route in the NM connection dialog, so that NM knows to restore it on resume

Comment 10 Aleksandar Kostadinov 2014-09-10 13:37:24 UTC
So why is NetworkManager on the host able to restore the network after the host is suspended, while NetworkManager running in the guest is unable to? IMO there's some issue in the suspend/wake-up sequence. Maybe the virtual network is not up at the time NM does what it does, or the DHCP server is doing something strange.

In any case, I hoped somebody more knowledgeable about these technologies could reproduce this and pinpoint the culprit. I'm using a T530, if that plays any role.

Comment 11 Aleksandar Kostadinov 2014-09-12 05:44:09 UTC
Btw, it is not possible to override the default route in a DHCP connection; see bug 1140947. I'll try static routing to see if it helps.

Comment 12 Aleksandar Kostadinov 2014-12-04 12:47:40 UTC
FYI, static routing fixes the issue. Not an ideal situation, though.

Also, I experienced a "hang" again today, of the type where one CPU is maxed out. I went to get breakfast, and after an hour I saw that it had calmed down and everything was working properly. I'm not sure what it could be; the VM is a clean Fedora install used for running test code, so no strange services are installed.
I am wondering if I can install some script that checks CPU time every minute and logs per-process CPU usage. If you have anything handy, I'd appreciate it, as I have no good idea how to achieve that.
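
A minimal sketch of such a logger, assuming the procps ps command is available in the guest. The log path, the interval variable, the --loop switch and the function name top_cpu are all made up for illustration. Note that the %CPU column of ps is averaged over a process's lifetime, so the cumulative TIME column compared between two snapshots is the better indicator of what was busy during a hang.

```shell
#!/bin/sh
# Sketch: log the busiest processes once a minute.
LOG="${LOG:-/var/tmp/cpu-usage.log}"   # hypothetical log path
INTERVAL="${INTERVAL:-60}"             # seconds between snapshots

# Reads `ps -eo pid,pcpu,cputime,comm` output on stdin, drops the header
# line and prints the N lines with the highest %CPU (default 10).
top_cpu() {
    sed 1d | LC_ALL=C sort -k2,2nr | head -n "${1:-10}"
}

# Only loop when asked to, so the function can be tried on its own.
if [ "${1:-}" = "--loop" ]; then
    while true; do
        {
            date '+%Y-%m-%d %H:%M:%S'
            ps -eo pid,pcpu,cputime,comm | top_cpu 10
        } >> "$LOG"
        sleep "$INTERVAL"
    done
fi
```

Started with --loop inside the guest (for example under nohup), it appends a timestamp and the top ten processes to the log every minute, which can be inspected after the next hang.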

Comment 13 Fedora End Of Life 2015-05-29 11:55:18 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 14 Fedora Admin XMLRPC Client 2015-08-18 14:57:22 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 15 Dan Williams 2015-10-01 17:18:43 UTC
(In reply to Aleksandar Kostadinov from comment #10)
> So why is NetworkManager on the host able to restore the network after the
> host is suspended, while NetworkManager running in the guest is unable to?
> IMO there's some issue in the suspend/wake-up sequence. Maybe the virtual
> network is not up at the time NM does what it does, or the DHCP server is
> doing something strange.
> 
> In any case, I hoped somebody more knowledgeable about these technologies
> could reproduce this and pinpoint the culprit. I'm using a T530, if that
> plays any role.

Could you run "nmcli g log level debug" inside the guest, and then reproduce the issue, then attach "journalctl -b -u NetworkManager" output from the guest to this bug report?

Comment 16 Aleksandar Kostadinov 2015-10-19 15:17:34 UTC
It seems I cannot reproduce this anymore; at least it hasn't occurred since you requested the info. Closing. Thank you for the debugging tip, it can definitely come in handy.

Comment 17 Aleksandar Kostadinov 2015-10-19 15:18:25 UTC
FYI, this is with the latest updates of a Fedora 22 guest under a Fedora 21 host.
