Bug 1162636
Summary: | bridged interface not coming up after suspend/resume | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Jeff Layton <jlayton> |
Component: | NetworkManager | Assignee: | Dan Williams <dcbw> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Priority: | unspecified |
Version: | 21 | CC: | awilliam, cra, dcbw, ernest.beinrohr, lrintel, luvilla, psimerda, soeren.grunewald |
Hardware: | All | Keywords: | Regression |
OS: | Linux | | |
Fixed In Version: | NetworkManager-1.0.2-1.fc22 | Doc Type: | Bug Fix |
Last Closed: | 2015-05-11 19:04:28 UTC | Type: | Bug |
Description
Jeff Layton
2014-11-11 11:57:56 UTC
Oh, and after resume if I run:
$ nmcli d connect eno1
...then the interface comes back. So I think the problem is that the slaved physical interface isn't coming back properly on resume?

Same here, but on Fedora 20/64b. All connections are NM, even the bridge.

Same issue seen here after upgrading to F21 (NetworkManager-0.9.10.0-14.git20140704.fc21.x86_64). This issue did not exist in F20 (NetworkManager-0.9.9.0-46.git20131003.fc20.x86_64).
My bridge master conf is:
$ nmcli c show f9a8d75c-c4d9-459b-82df-4257c76d034f
connection.id: Bridge connection 1
connection.uuid: f9a8d75c-c4d9-459b-82df-4257c76d034f
connection.interface-name: br0
connection.type: bridge
connection.autoconnect: yes
connection.timestamp: 1420328674
connection.read-only: no
connection.permissions:
connection.zone: --
connection.master: --
connection.slave-type: --
connection.secondaries:
connection.gateway-ping-timeout: 0
...
bridge.interface-name: br0
bridge.mac-address: --
bridge.stp: no
bridge.priority: 128
bridge.forward-delay: 15
bridge.hello-time: 2
bridge.max-age: 20
bridge.ageing-time: 300
...
Its slave Ethernet device is:
$ nmcli c show a1dd88c4-31fa-4d19-bc58-3fdb3bf5229f
connection.id: br0 slave 1
connection.uuid: a1dd88c4-31fa-4d19-bc58-3fdb3bf5229f
connection.interface-name: em1
connection.type: 802-3-ethernet
connection.autoconnect: yes
connection.timestamp: 1420328674
connection.read-only: no
connection.permissions:
connection.zone: --
connection.master: f9a8d75c-c4d9-459b-82df-4257c76d034f
connection.slave-type: bridge
connection.secondaries:
connection.gateway-ping-timeout: 0
802-3-ethernet.port: --
802-3-ethernet.speed: 0
802-3-ethernet.duplex: --
802-3-ethernet.auto-negotiate: yes
802-3-ethernet.mac-address: 54:EE:75:xx:yy:zz
802-3-ethernet.cloned-mac-address: --
802-3-ethernet.mac-address-blacklist:
802-3-ethernet.mtu: auto
...
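For anyone comparing their own profiles against the ones above: the link between the two is the slave's connection.master value (a UUID in this case). Below is a minimal sketch for pulling a single field out of `nmcli c show` output. The helper name nm_field is made up for illustration and is not part of nmcli; it only assumes the "key: value" layout quoted in this report.

```shell
# nm_field KEY: read "nmcli c show <id>" output on stdin and print the value
# of KEY (for example connection.master). Hypothetical helper, not an nmcli
# feature. Runs of whitespace inside the value are collapsed to single spaces.
nm_field() {
    awk -v key="$1:" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }'
}

# Example (profile name taken from this report):
#   nmcli c show "br0 slave 1" | nm_field connection.master
```

If the printed value is "--" or empty, the profile is not enslaved to anything, which may be worth ruling out before chasing the resume problem.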
Same here using NetworkManager-0.9.10.1-1.git20150105.fc21.x86_64
$ brctl show
bridge name     bridge id               STP enabled     interfaces
bridge0         8000.00224d99c72f       no              em1
$ nmcli c show f424c59f-e02c-4766-a3be-af9663a3f05e
connection.id: Bridge bridge0
connection.uuid: f424c59f-e02c-4766-a3be-af9663a3f05e
connection.interface-name: bridge0
connection.type: bridge
connection.autoconnect: yes
connection.timestamp: 1420702851
connection.read-only: no
connection.permissions:
connection.zone: --
connection.master: --
connection.slave-type: --
connection.secondaries:
connection.gateway-ping-timeout: 0
ipv4.method: auto
ipv4.dns:
ipv4.dns-search:
ipv4.addresses:
ipv4.routes:
ipv4.ignore-auto-routes: no
ipv4.ignore-auto-dns: no
ipv4.dhcp-client-id: --
ipv4.dhcp-send-hostname: yes
ipv4.dhcp-hostname: --
ipv4.never-default: no
ipv4.may-fail: yes
ipv6.method: ignore
ipv6.dns:
ipv6.dns-search:
ipv6.addresses:
ipv6.routes:
ipv6.ignore-auto-routes: no
ipv6.ignore-auto-dns: no
ipv6.never-default: no
ipv6.may-fail: yes
ipv6.ip6-privacy: -1 (unknown)
ipv6.dhcp-hostname: --
bridge.interface-name: bridge0
bridge.mac-address: --
bridge.stp: no
bridge.priority: 32768
bridge.forward-delay: 15
bridge.hello-time: 2
bridge.max-age: 20
bridge.ageing-time: 300
...
$ nmcli c show c42c7c12-b7a0-4642-9d7f-0c2203d3f491
connection.id: em1
connection.uuid: c42c7c12-b7a0-4642-9d7f-0c2203d3f491
connection.interface-name: --
connection.type: 802-3-ethernet
connection.autoconnect: yes
connection.timestamp: 1420702851
connection.read-only: no
connection.permissions:
connection.zone: --
connection.master: bridge0
connection.slave-type: bridge
connection.secondaries:
connection.gateway-ping-timeout: 0
802-3-ethernet.port: --
802-3-ethernet.speed: 0
802-3-ethernet.duplex: --
802-3-ethernet.auto-negotiate: yes
802-3-ethernet.mac-address: 00:22:4D:99:C7:2F
802-3-ethernet.cloned-mac-address: --
802-3-ethernet.mac-address-blacklist:
802-3-ethernet.mtu: auto
802-3-ethernet.s390-subchannels:
802-3-ethernet.s390-nettype: --
802-3-ethernet.s390-options:
$ sudo systemctl restart NetworkManager.service
Solves the issue after resume

I'm seeing this too and have been for some time. I'd reported it privately to dcbw, but he's been too busy to fully look into it yet.

Thanks for testing the latest version, Soeren. I tested both 1.0 and 0.9.10 and couldn't easily reproduce, so there's something else going on here. Could somebody with this issue run:
sudo nmcli g log level debug
and then try to reproduce? Then grab 'journalctl -b -u NetworkManager' and let's see what's going on. Adam sent me a log from a while back but it doesn't appear detailed enough to debug what's really going on. To turn off debugging, "sudo nmcli g log level info".
Alternatively, try this scratch build with extra debugging information:
http://koji.fedoraproject.org/koji/taskinfo?taskID=8564355
and grab the output of 'journalctl -b -u NetworkManager' when the problem occurs. Thanks!

Created attachment 977982
debug log from a suspend/resume cycle affected by the bug
I went for the "sudo nmcli g log level debug" option as I'm on Rawhide. I ran that command, then suspended at 17:29, resumed at 17:31, and did 'sudo nmcli con up "System em1"' - which brings the connection up, for me - at 17:33. Then I turned debugging off again. Here's the log extract covering that time.
Scratch build archived here in case it gets garbage collected: http://people.redhat.com/dcbw/NetworkManager/rh1162636/

Created attachment 978136
Another debug log for suspend/resume
1) Enable debug
2) suspend
3) resume
4) use 'nmcli c up "Bridge bridge0"'
5) use "nmcli c up em1"
It looks like the debug logs won't have all the information I need. Would anyone mind installing the RPMs linked above (which are the latest version in F21 + the debug patch) and reproducing the issue?

Created attachment 978263
Debug log for suspend/resume with debug-patch (0.9.10.1-1.1.git20150105)
a) Install packages
$ sudo yum localinstall NetworkManager-0.9.10.1-1.1.git20150105.fc21.x86_64.rpm NetworkManager-config-connectivity-fedora-0.9.10.1-1.1.git20150105.fc21.x86_64.rpm NetworkManager-glib-0.9.10.1-1.1.git20150105.fc21.x86_64.rpm NetworkManager-wifi-0.9.10.1-1.1.git20150105.fc21.x86_64.rpm
$ rpm -qa | grep NetworkManager | sort
NetworkManager-0.9.10.1-1.1.git20150105.fc21.x86_64
NetworkManager-config-connectivity-fedora-0.9.10.1-1.1.git20150105.fc21.x86_64
NetworkManager-glib-0.9.10.1-1.1.git20150105.fc21.x86_64
NetworkManager-openvpn-0.9.9.0-3.git20140128.fc21.x86_64
NetworkManager-openvpn-gnome-0.9.9.0-3.git20140128.fc21.x86_64
NetworkManager-vpnc-0.9.9.0-6.git20140428.fc21.x86_64
NetworkManager-vpnc-gnome-0.9.9.0-6.git20140428.fc21.x86_64
NetworkManager-wifi-0.9.10.1-1.1.git20150105.fc21.x86_64
b) Restart NetworkManager
$ sudo systemctl restart NetworkManager
c) Do the test
1) sudo nmcli g log level debug
2) suspend
3) resume
4) nmcli c up em1
5) sudo nmcli g log level warn
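The steps in (c) can be wrapped in one function so the log level always gets put back afterwards. This is only a sketch: NM_RUN is an invented knob for dry runs (not an nmcli or systemd feature), the output file name is arbitrary, and the nmcli and journalctl invocations are the ones already quoted in this report. It restores the level to info, as suggested earlier in the thread, rather than warn.

```shell
# capture_nm_resume_log [OUTFILE]
# Raise NetworkManager logging to debug, wait for a manual suspend/resume,
# save the current boot's NetworkManager journal, then lower logging again.
# NM_RUN is a hypothetical hook: leave it unset to run via sudo, or set
# NM_RUN=echo to print the commands instead of executing them.
capture_nm_resume_log() {
    out=${1:-nm-resume-debug.log}
    run=${NM_RUN:-sudo}

    $run nmcli g log level debug || return 1
    printf 'Suspend, resume, reproduce the problem, then press Enter: ' >&2
    read -r reply || true    # tolerate EOF on stdin
    $run journalctl -b -u NetworkManager > "$out"
    $run nmcli g log level info
}

# Example:
#   capture_nm_resume_log /tmp/rh1162636-debug.log
```

Running it once with NM_RUN=echo first is a cheap way to check which commands would be executed before handing it sudo.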
Thanks, the issue is:
NetworkManager[21238]: <info> (em1): device state change: unavailable -> disconnected (reason 'carrier-changed') [20 30 40]
NetworkManager[21238]: <info> #### (em1): state 20 -> 30 (reason 40)
NetworkManager[21238]: <info> #### (em1): processing DISCONNECTED state
NetworkManager[21238]: <debug> [1420819973.802712] [nm-policy.c:1179] reset_autoconnect_all(): Re-enabling autoconnect for all connections on em1
NetworkManager[21238]: <info> #### (em1): scheduling activate check
NetworkManager[21238]: <info> #### (em1): trying to schedule activation
NetworkManager[21238]: <info> #### (em1): manager STATE_ASLEEP
Koji build with candidate fix: http://koji.fedoraproject.org/koji/taskinfo?taskID=8575860

can you do a Rawhide build too? Thanks!

Also uploaded here: http://people.redhat.com/dcbw/NetworkManager/rh1162636/ (release is -1.2)

Rawhide koji scratch build: http://koji.fedoraproject.org/koji/taskinfo?taskID=8575939

first try with the rawhide scratch build works, shipit!

A quick test on F21 here seems to confirm the fix works. Thanks!

It's a lot better, but I do seem to be running into a slightly different variant where I have to restart NetworkManager.service on resume to make the network come up, *sometimes* (maybe only after a long suspend, like overnight). I'll try and keep an eye on that and open a new bug.

Works for me as well.

Sorry for the delay on testing this, but I was travelling last week and didn't have access to the machine where I was seeing this. Yes, this package also seems to fix the problem for me. That said, I haven't tested it with a longer suspend cycle yet, so I can't confirm whether I've seen the same problem that Adam has.

Dan: can you send the fix to Rawhide, at least, since it's now well-tested? -3 has since come out and superseded the scratch build...
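A note for anyone checking their own logs against the diagnosis quoted earlier in this comment thread: em1 reaches the disconnected state and an activation is scheduled while the manager still reports STATE_ASLEEP, which appears to be why nothing autoconnects. The #### lines only exist in the debug scratch build, but the "device state change" lines are in the regular log. A small filter like the following may help spot them. The function name is invented, and the pattern has only been matched against the message format quoted in this report.

```shell
# nm_state_changes: read NetworkManager journal lines on stdin and print
# "IFACE OLD_STATE NEW_STATE REASON" for every "device state change" message.
# Hypothetical helper; the regex assumes the log format quoted in this bug.
nm_state_changes() {
    sed -n "s/.*(\([^)]*\)): device state change: \([a-z-]*\) -> \([a-z-]*\) (reason '\([^']*\)').*/\1 \2 \3 \4/p"
}

# Example:
#   journalctl -b -u NetworkManager | nm_state_changes
```

A slave interface that shows a transition to disconnected right after resume and no later transitions would match the behaviour described in this bug.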
The fix seems to have been committed to NM upstream:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=a687d1f9e0f75b987f40335934b54aa748f6724b
I'm gonna go ahead and backport it to the Rawhide package at least, as it's annoying the pants off me. Dan, it'd be good if it could be backported to F21 too?

oops, this is filed for f21, so I'll just leave it ASSIGNED for now.

NetworkManager-1.0.2-1.fc22,network-manager-applet-1.0.2-1.fc22,NetworkManager-openconnect-1.0.2-1.fc22,NetworkManager-openvpn-1.0.2-1.fc22,NetworkManager-vpnc-1.0.2-1.fc22,NetworkManager-openswan-1.0.2-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/NetworkManager-1.0.2-1.fc22,network-manager-applet-1.0.2-1.fc22,NetworkManager-openconnect-1.0.2-1.fc22,NetworkManager-openvpn-1.0.2-1.fc22,NetworkManager-vpnc-1.0.2-1.fc22,NetworkManager-openswan-1.0.2-1.fc22

Note that this is fixed in f21; the patch is present in NetworkManager-0.9.10.2-2.fc21

Package NetworkManager-1.0.2-1.fc22, NetworkManager-openconnect-1.0.2-1.fc22, NetworkManager-vpnc-1.0.2-1.fc22, network-manager-applet-1.0.2-1.fc22, NetworkManager-openvpn-1.0.2-1.fc22, NetworkManager-openswan-1.0.2-1.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing NetworkManager-1.0.2-1.fc22 NetworkManager-openconnect-1.0.2-1.fc22 NetworkManager-vpnc-1.0.2-1.fc22 network-manager-applet-1.0.2-1.fc22 NetworkManager-openvpn-1.0.2-1.fc22 NetworkManager-openswan-1.0.2-1.fc22'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-7767/NetworkManager-1.0.2-1.fc22,network-manager-applet-1.0.2-1.fc22,NetworkManager-openconnect-1.0.2-1.fc22,NetworkManager-openvpn-1.0.2-1.fc22,NetworkManager-vpnc-1.0.2-1.fc22,NetworkManager-openswan-1.0.2-1.fc22
then log in and leave karma (feedback).
NetworkManager-1.0.2-1.fc22, NetworkManager-openconnect-1.0.2-1.fc22, NetworkManager-vpnc-1.0.2-1.fc22, network-manager-applet-1.0.2-1.fc22, NetworkManager-openvpn-1.0.2-1.fc22, NetworkManager-openswan-1.0.2-1.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.